Statistical Mechanics

Franz Schwabl

Statistical Mechanics Translated by William Brewer

Second Edition With 202 Figures, 26 Tables, and 195 Problems

Professor Dr. Franz Schwabl, Physik-Department, Technische Universität München, James-Franck-Strasse, 85747 Garching, Germany

Translator: Professor William Brewer, PhD, Fachbereich Physik, Freie Universität Berlin, Arnimallee 14, 14195 Berlin, Germany

Title of the original German edition: Statistische Mechanik (Springer-Lehrbuch) 3rd ed. ISBN 3-540-31095-9 © Springer-Verlag Berlin Heidelberg 2006

Library of Congress Control Number: 2006925304

ISBN-10 3-540-32343-0 2nd ed. Springer Berlin Heidelberg New York
ISBN-13 978-3-540-32343-3 2nd ed. Springer Berlin Heidelberg New York
ISBN 3-540-43163-2 1st ed. Springer-Verlag Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable for prosecution under the German Copyright Law.

Springer is a part of Springer Science+Business Media
springer.com
© Springer-Verlag Berlin Heidelberg 2002, 2006
Printed in Germany

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Typesetting: W. Brewer and LE-TEX Jelonek, Schmidt & Vöckler GbR using a Springer TEX-macro package
Production: LE-TEX Jelonek, Schmidt & Vöckler GbR, Leipzig
Cover design: eStudio Calamar S. L., F. Steinen-Broo, Pau/Girona, Spain
Printed on acid-free paper


A theory is all the more impressive the simpler its premises, the greater the variety of phenomena it describes, and the broader its area of application. This is the reason for the profound impression made on me by classical thermodynamics. It is the only general physical theory of which I am convinced that, within its regime of applicability, it will never be overturned (this is for the special attention of the skeptics in principle).

Albert Einstein

To my daughter Birgitta

Preface to the Second Edition

In this new edition, supplements, additional explanations and cross references have been added in numerous places, including additional problems and revised formulations of the problems. Figures have been redrawn and the layout improved. In all these additions I have pursued the goal of not changing the compact character of the book. I wish to thank Prof. W. Brewer for integrating these changes into his competent translation of the first edition. I am grateful to all the colleagues and students who have made suggestions to improve the book, as well as to the publisher, Dr. Thorsten Schneider, and Mrs. J. Lenz for their excellent cooperation.

Munich, December 2005

F. Schwabl

Preface to the First Edition

This book deals with statistical mechanics. Its goal is to give a deductive presentation of the statistical mechanics of equilibrium systems based on a single hypothesis (the form of the microcanonical density matrix), as well as to treat the most important aspects of non-equilibrium phenomena. Beyond the fundamentals, the attempt is made here to demonstrate the breadth and variety of the applications of statistical mechanics. Modern areas such as renormalization group theory, percolation, stochastic equations of motion and their applications in critical dynamics are treated. A compact presentation was preferred wherever possible; it requires, however, no additional aids except for a knowledge of quantum mechanics. The material is made as understandable as possible by the inclusion of all the mathematical steps and a complete and detailed presentation of all intermediate calculations. At the end of each chapter, a series of problems is provided. Subsections which can be skipped over in a first reading are marked with an asterisk; subsidiary calculations and remarks which are not essential for comprehension of the material are shown in small print. Where it seems helpful, literature citations are given; these are by no means complete, but should be seen as an incentive to further reading. A list of relevant textbooks is given at the end of each of the more advanced chapters.

In the first chapter, the fundamental concepts of probability theory and the properties of distribution functions and density matrices are presented. In Chapter 2, the microcanonical ensemble and, building upon it, basic quantities such as entropy, pressure and temperature are introduced. Following this, the density matrices for the canonical and the grand canonical ensemble are derived. The third chapter is devoted to thermodynamics. Here, the usual material (thermodynamic potentials, the laws of thermodynamics, cyclic processes, etc.) is treated, with special attention given to the theory of phase transitions, to mixtures and to border areas related to physical chemistry. Chapter 4 deals with the statistical mechanics of ideal quantum systems, including the Bose–Einstein condensation, the radiation field, and superfluids. In Chapter 5, real gases and liquids are treated (internal degrees of freedom, the van der Waals equation, mixtures). Chapter 6 is devoted to the subject of magnetism, including magnetic phase transitions. Furthermore, related phenomena such as the elasticity of rubber are presented. Chapter 7 deals with the theory of phase transitions and critical phenomena; following a general overview, the fundamentals of renormalization group theory are given. In addition, the Ginzburg–Landau theory is introduced, and percolation is discussed (as a topic related to critical phenomena). The remaining three chapters deal with non-equilibrium processes: Brownian motion, the Langevin and Fokker–Planck equations and their applications, as well as the theory of the Boltzmann equation and, derived from it, the H-Theorem and hydrodynamic equations. In the final chapter, dealing with the topic of irreversibility, fundamental considerations of how it occurs and of the transition to equilibrium are developed. In appendices, among other topics the Third Law and a derivation of the classical distribution function starting from quantum statistics are presented, along with the microscopic derivation of the hydrodynamic equations.

The book is recommended for students of physics and related areas from the 5th or 6th semester on. Parts of it may also be of use to teachers. It is suggested that students at first skip over the sections marked with asterisks or shown in small print, and thereby concentrate their attention on the essential core material.

This book evolved out of lecture courses given numerous times by the author at the Johannes Kepler Universität in Linz (Austria) and at the Technische Universität in Munich (Germany). Many coworkers have contributed to the production and correction of the manuscript: I. Wefers, E. Jörg-Müller, M. Hummel, A. Vilfan, J. Wilhelm, K. Schenk, S. Clar, P. Maier, B. Kaufmann, M. Bulenda, H. Schinz, and A. Wonhas. W. Gasser read the whole manuscript several times and made suggestions for corrections. Advice and suggestions from my former coworkers E. Frey and U. C. Täuber were likewise quite valuable. I wish to thank Prof. W. D. Brewer for his faithful translation of the text. I would like to express my sincere gratitude to all of them, along with those of my other associates who offered valuable assistance, as well as to Dr. H. J. Kölsch, representing the Springer-Verlag.

Munich, October 2002

F. Schwabl

Table of Contents

1. Basic Principles
   1.1 Introduction
   1.2 A Brief Excursion into Probability Theory
       1.2.1 Probability Density and Characteristic Functions
       1.2.2 The Central Limit Theorem
   1.3 Ensembles in Classical Statistics
       1.3.1 Phase Space and Distribution Functions
       1.3.2 The Liouville Equation
   1.4 Quantum Statistics
       1.4.1 The Density Matrix for Pure and Mixed Ensembles
       1.4.2 The Von Neumann Equation
   *1.5 Additional Remarks
       *1.5.1 The Binomial and the Poisson Distributions
       *1.5.2 Mixed Ensembles and the Density Matrix of Subsystems
   Problems

2. Equilibrium Ensembles
   2.1 Introductory Remarks
   2.2 Microcanonical Ensembles
       2.2.1 Microcanonical Distribution Functions and Density Matrices
       2.2.2 The Classical Ideal Gas
       *2.2.3 Quantum-mechanical Harmonic Oscillators and Spin Systems
   2.3 Entropy
       2.3.1 General Definition
       2.3.2 An Extremal Property of the Entropy
       2.3.3 Entropy of the Microcanonical Ensemble
   2.4 Temperature and Pressure
       2.4.1 Systems in Contact: the Energy Distribution Function, Definition of the Temperature
       2.4.2 On the Widths of the Distribution Functions of Macroscopic Quantities
       2.4.3 External Parameters: Pressure
   2.5 Properties of Some Non-interacting Systems
       2.5.1 The Ideal Gas
       *2.5.2 Non-interacting Quantum Mechanical Harmonic Oscillators and Spins
   2.6 The Canonical Ensemble
       2.6.1 The Density Matrix
       2.6.2 Examples: the Maxwell Distribution and the Barometric Pressure Formula
       2.6.3 The Entropy of the Canonical Ensemble and Its Extremal Values
       2.6.4 The Virial Theorem and the Equipartition Theorem
       2.6.5 Thermodynamic Quantities in the Canonical Ensemble
       2.6.6 Additional Properties of the Entropy
   2.7 The Grand Canonical Ensemble
       2.7.1 Systems with Particle Exchange
       2.7.2 The Grand Canonical Density Matrix
       2.7.3 Thermodynamic Quantities
       2.7.4 The Grand Partition Function for the Classical Ideal Gas
       *2.7.5 The Grand Canonical Density Matrix in Second Quantization
   Problems

3. Thermodynamics
   3.1 Potentials and Laws of Equilibrium Thermodynamics
       3.1.1 Definitions
       3.1.2 The Legendre Transformation
       3.1.3 The Gibbs–Duhem Relation in Homogeneous Systems
   3.2 Derivatives of Thermodynamic Quantities
       3.2.1 Definitions
       3.2.2 Integrability and the Maxwell Relations
       3.2.3 Jacobians
       3.2.4 Examples
   3.3 Fluctuations and Thermodynamic Inequalities
       3.3.1 Fluctuations
       3.3.2 Inequalities
   3.4 Absolute Temperature and Empirical Temperatures
   3.5 Thermodynamic Processes
       3.5.1 Thermodynamic Concepts
       3.5.2 The Irreversible Expansion of a Gas; the Gay-Lussac Experiment
       3.5.3 The Statistical Foundation of Irreversibility
       3.5.4 Reversible Processes
       3.5.5 The Adiabatic Equation
   3.6 The First and Second Laws of Thermodynamics
       3.6.1 The First and the Second Law for Reversible and Irreversible Processes
       *3.6.2 Historical Formulations of the Laws of Thermodynamics and other Remarks
       3.6.3 Examples and Supplements to the Second Law
       3.6.4 Extremal Properties
       *3.6.5 Thermodynamic Inequalities Derived from Maximization of the Entropy
   3.7 Cyclic Processes
       3.7.1 General Considerations
       3.7.2 The Carnot Cycle
       3.7.3 General Cyclic Processes
   3.8 Phases of Single-Component Systems
       3.8.1 Phase-Boundary Curves
       3.8.2 The Clausius–Clapeyron Equation
       3.8.3 The Convexity of the Free Energy and the Concavity of the Free Enthalpy (Gibbs’ Free Energy)
       3.8.4 The Triple Point
   3.9 Equilibrium in Multicomponent Systems
       3.9.1 Generalization of the Thermodynamic Potentials
       3.9.2 Gibbs’ Phase Rule and Phase Equilibrium
       3.9.3 Chemical Reactions, Thermodynamic Equilibrium and the Law of Mass Action
       *3.9.4 Vapor-pressure Increase by Other Gases and by Surface Tension
   Problems

4. Ideal Quantum Gases
   4.1 The Grand Potential
   4.2 The Classical Limit $z = e^{\mu/kT} \ll 1$
   4.3 The Nearly-degenerate Ideal Fermi Gas
       4.3.1 Ground State, T = 0 (Degeneracy)
       4.3.2 The Limit of Complete Degeneracy
       *4.3.3 Real Fermions
   4.4 The Bose–Einstein Condensation
   4.5 The Photon Gas
       4.5.1 Properties of Photons
       4.5.2 The Canonical Partition Function
       4.5.3 Planck’s Radiation Law
       *4.5.4 Supplemental Remarks
       *4.5.5 Fluctuations in the Particle Number of Fermions and Bosons
   4.6 Phonons in Solids
       4.6.1 The Harmonic Hamiltonian
       4.6.2 Thermodynamic Properties
       *4.6.3 Anharmonic Effects, the Mie–Grüneisen Equation of State
   4.7 Phonons and Rotons in He II
       4.7.1 The Excitations (Quasiparticles) of He II
       4.7.2 Thermal Properties
       *4.7.3 Superfluidity and the Two-Fluid Model
   Problems

5. Real Gases, Liquids, and Solutions
   5.1 The Ideal Molecular Gas
       5.1.1 The Hamiltonian and the Partition Function
       5.1.2 The Rotational Contribution
       5.1.3 The Vibrational Contribution
       *5.1.4 The Influence of the Nuclear Spin
   *5.2 Mixtures of Ideal Molecular Gases
   5.3 The Virial Expansion
       5.3.1 Derivation
       5.3.2 The Classical Approximation for the Second Virial Coefficient
       5.3.3 Quantum Corrections to the Virial Coefficients
   5.4 The Van der Waals Equation of State
       5.4.1 Derivation
       5.4.2 The Maxwell Construction
       5.4.3 The Law of Corresponding States
       5.4.4 The Vicinity of the Critical Point
   5.5 Dilute Solutions
       5.5.1 The Partition Function and the Chemical Potentials
       5.5.2 Osmotic Pressure
       *5.5.3 Solutions of Hydrogen in Metals (Nb, Pd, ...)
       5.5.4 Freezing-Point Depression, Boiling-Point Elevation, and Vapor-Pressure Reduction
   Problems

6. Magnetism
   6.1 The Density Matrix and Thermodynamics
       6.1.1 The Hamiltonian and the Canonical Density Matrix
       6.1.2 Thermodynamic Relations
       6.1.3 Supplementary Remarks
   6.2 The Diamagnetism of Atoms
   6.3 The Paramagnetism of Non-coupled Magnetic Moments
   6.4 Pauli Spin Paramagnetism
   6.5 Ferromagnetism
       6.5.1 The Exchange Interaction
       6.5.2 The Molecular Field Approximation for the Ising Model
       6.5.3 Correlation Functions and Susceptibility
       6.5.4 The Ornstein–Zernike Correlation Function
       *6.5.5 Continuum Representation
   *6.6 The Dipole Interaction, Shape Dependence, Internal and External Fields
       6.6.1 The Hamiltonian
       6.6.2 Thermodynamics and Magnetostatics
       6.6.3 Statistical–Mechanical Justification
       6.6.4 Domains
   6.7 Applications to Related Phenomena
       6.7.1 Polymers and Rubber-like Elasticity
       6.7.2 Negative Temperatures
       *6.7.3 The Melting Curve of $^3$He
   Problems

7. Phase Transitions, Renormalization Group Theory, and Percolation
   7.1 Phase Transitions and Critical Phenomena
       7.1.1 Symmetry Breaking, the Ehrenfest Classification
       *7.1.2 Examples of Phase Transitions and Analogies
       7.1.3 Universality
   7.2 The Static Scaling Hypothesis
       7.2.1 Thermodynamic Quantities and Critical Exponents
       7.2.2 The Scaling Hypothesis for the Correlation Function
   7.3 The Renormalization Group
       7.3.1 Introductory Remarks
       7.3.2 The One-Dimensional Ising Model, Decimation Transformation
       7.3.3 The Two-Dimensional Ising Model
       7.3.4 Scaling Laws
       *7.3.5 General RG Transformations in Real Space
   *7.4 The Ginzburg–Landau Theory
       7.4.1 Ginzburg–Landau Functionals
       7.4.2 The Ginzburg–Landau Approximation
       7.4.3 Fluctuations in the Gaussian Approximation
       7.4.4 Continuous Symmetry and Phase Transitions of First Order
       *7.4.5 The Momentum-Shell Renormalization Group
   *7.5 Percolation
       7.5.1 The Phenomenon of Percolation
       7.5.2 Theoretical Description of Percolation
       7.5.3 Percolation in One Dimension
       7.5.4 The Bethe Lattice (Cayley Tree)
       7.5.5 General Scaling Theory
       7.5.6 Real-Space Renormalization Group Theory
   Problems

8. Brownian Motion, Equations of Motion and the Fokker–Planck Equations
   8.1 Langevin Equations
       8.1.1 The Free Langevin Equation
       8.1.2 The Langevin Equation in a Force Field
   8.2 The Derivation of the Fokker–Planck Equation from the Langevin Equation
       8.2.1 The Fokker–Planck Equation for the Langevin Equation (8.1.1)
       8.2.2 Derivation of the Smoluchowski Equation for the Overdamped Langevin Equation, (8.1.23)
       8.2.3 The Fokker–Planck Equation for the Langevin Equation (8.1.22b)
   8.3 Examples and Applications
       8.3.1 Integration of the Fokker–Planck Equation (8.2.6)
       8.3.2 Chemical Reactions
       8.3.3 Critical Dynamics
       *8.3.4 The Smoluchowski Equation and Supersymmetric Quantum Mechanics
   Problems

9. The Boltzmann Equation
   9.1 Introduction
   9.2 Derivation of the Boltzmann Equation
   9.3 Consequences of the Boltzmann Equation
       9.3.1 The H-Theorem and Irreversibility
       *9.3.2 Behavior of the Boltzmann Equation under Time Reversal
       9.3.3 Collision Invariants and the Local Maxwell Distribution
       9.3.4 Conservation Laws
       9.3.5 The Hydrodynamic Equations in Local Equilibrium
   *9.4 The Linearized Boltzmann Equation
       9.4.1 Linearization
       9.4.2 The Scalar Product
       9.4.3 Eigenfunctions of L and the Expansion of the Solutions of the Boltzmann Equation
       9.4.4 The Hydrodynamic Limit
       9.4.5 Solutions of the Hydrodynamic Equations
   *9.5 Supplementary Remarks
       9.5.1 Relaxation-Time Approximation
       9.5.2 Calculation of $W(v_1, v_2; v_1', v_2')$
   Problems

10. Irreversibility and the Approach to Equilibrium
   10.1 Preliminary Remarks
   10.2 Recurrence Time
   10.3 The Origin of Irreversible Macroscopic Equations of Motion
       10.3.1 A Microscopic Model for Brownian Motion
       10.3.2 Microscopic Time-Reversible and Macroscopic Irreversible Equations of Motion, Hydrodynamics
   *10.4 The Master Equation and Irreversibility in Quantum Mechanics
   10.5 Probability and Phase-Space Volume
       *10.5.1 Probabilities and the Time Interval of Large Fluctuations
       10.5.2 The Ergodic Theorem
   10.6 The Gibbs and the Boltzmann Entropies and their Time Dependences
       10.6.1 The Time Derivative of Gibbs’ Entropy
       10.6.2 Boltzmann’s Entropy
   10.7 Irreversibility and Time Reversal
       10.7.1 The Expansion of a Gas
       10.7.2 Description of the Expansion Experiment in µ-Space
       10.7.3 The Influence of External Perturbations on the Trajectories of the Particles
   *10.8 Entropy Death or Ordered Structures?
   Problems

Appendix
   A. Nernst’s Theorem (Third Law)
       A.1 Preliminary Remarks on the Historical Development of Nernst’s Theorem
       A.2 Nernst’s Theorem and its Thermodynamic Consequences
       A.3 Residual Entropy, Metastability, etc.
   B. The Classical Limit and Quantum Corrections
       B.1 The Classical Limit
       B.2 Calculation of the Quantum-Mechanical Corrections
       B.3 Quantum Corrections to the Second Virial Coefficient B(T)
   C. The Perturbation Expansion
   D. The Riemann ζ-Function and the Bernoulli Numbers
   E. Derivation of the Ginzburg–Landau Functional
   F. The Transfer Matrix Method
   G. Integrals Containing the Maxwell Distribution
   H. Hydrodynamics
       H.1 Hydrodynamic Equations, Phenomenological Discussion
       H.2 The Kubo Relaxation Function
       H.3 The Microscopic Derivation of the Hydrodynamic Equations
   I. Units and Tables

Subject Index

1. Basic Principles

1.1 Introduction

Statistical mechanics deals with the physical properties of systems which consist of a large number of particles, i.e. many-body systems, and it is based on the microscopic laws of nature. Examples of such many-body systems are gases, liquids, solids in their various forms (crystalline, amorphous), liquid crystals, biological systems, stellar matter, the radiation field, etc. Among their physical properties which are of interest are equilibrium properties (specific heat, thermal expansion, modulus of elasticity, magnetic susceptibility, etc.) and transport properties (thermal conductivity, electrical conductivity, etc.).

Long before it was provided with a solid basis by statistical mechanics, thermodynamics had been developed; it yields general relations between the macroscopic parameters of a system. The First Law of Thermodynamics was formulated by Robert Mayer in 1842. It states that the energy content of a body consists of the sum of the work performed on it and the heat which is put into it:

dE = δQ + δW .    (1.1.1)

The fact that heat is a form of energy, or more precisely, that energy can be transferred to a body in the form of heat, was tested experimentally by Joule in the years 1843–1849 (experiments with friction).

The Second Law was formulated by Clausius and by Lord Kelvin (W. Thomson¹) in 1850. It is based on the fact that a particular state of a thermodynamic system can be reached through different ways of dividing up the energy transferred to it into work and heat, i.e. heat is not a "state variable" (a state variable is a physical quantity which is determined by the state of the system; this concept will be given a mathematically precise definition later). The essential new information in the Second Law was that there exists a state variable S, the entropy, which for reversible changes is related to the quantity of heat transferred by the equation

δQ = T dS ,    (1.1.2)

while for irreversible processes, δQ < T dS holds. The Second Law is identical with the statement that a perpetual motion machine of the second kind is impossible to construct (this would be a periodically operating machine which performs work by only extracting heat from a single heat bath).

¹ Born W. Thomson; the additional name was assumed later in connection with his knighthood, granted in recognition of his scientific achievements.

The atomistic basis of thermodynamics was first recognized in the kinetic theory of dilute gases. The velocity distribution derived by Maxwell (1831–1879) permits the derivation of the caloric and thermal equation of state of ideal gases. Boltzmann (1844–1906) wrote the basic transport equation which bears his name in the year 1872. From it, he derived the entropy increase (H theorem) on approaching equilibrium. Furthermore, Boltzmann realized that the entropy depends on the number of states W(E, V, . . .) which are compatible with the macroscopic values of the energy E, the volume V, . . . as given by the relation

S ∝ log W(E, V, . . .) .    (1.1.3)

It is notable that the atomistic foundations of the theory of gases were laid at a time when the atomic structure of matter had not yet been demonstrated experimentally; it was even regarded with considerable scepticism by well-known physicists such as E. Mach (1838–1916), who favored continuum theories. The description of macroscopic systems in terms of statistical ensembles was justified by Boltzmann on the basis of the ergodic hypothesis. Fundamental contributions to thermodynamics and to the statistical theory of macroscopic systems were made by J. Gibbs (1839–1903) in the years 1870–1900.

Only after the formulation of quantum mechanics (1925) did the correct theory for the atomic regime become available. To distinguish it from classical statistical mechanics, the statistical mechanics based on the quantum theory is called quantum statistics. Many phenomena such as the electronic properties of solids, superconductivity, superfluidity, or magnetism can be explained only by applying quantum statistics.

Even today, statistical mechanics still belongs among the most active areas of theoretical physics: the theory of phase transitions, the theory of liquids, disordered solids, polymers, membranes, biological systems, granular matter, surfaces, interfaces, the theory of irreversible processes, systems far from equilibrium, nonlinear processes, structure formation in open systems, biological processes, and at present still magnetism and superconductivity are fields of active interest.

Following these remarks about the problems treated in statistical mechanics and its historical development, we now indicate some characteristic problems which play a role in the theory of macroscopic systems. Conventional macroscopic systems such as gases, liquids and solids at room temperature consist of 10^19–10^23 particles per cm^3. The number of quantum-mechanical eigenstates naturally increases with the number of particles. As we shall see later, the separation of the energy levels is of the order of e^{−N}, i.e. the energy levels are so densely spaced that even the smallest perturbation can transfer the system from one state to another one which has practically the same energy.

Fig. 1.1. Spacing of the energy levels for a large number of particles N.

Should we now set ourselves the goal of calculating the motion of the 3N coordinates in classical physics, or the time dependence of the wavefunctions in quantum mechanics, in order to compute temporal averages from them? Both programs would be impossible to carry out and are furthermore unnecessary. One can solve neither Newton's equations nor the Schrödinger equation for 10^19–10^23 particles. And even if we had the solutions, we would not know all the coordinates and velocities or all the quantum numbers required to determine the initial values. Furthermore, the detailed time development plays no role for the macroscopic properties which are of interest. In addition, even the weakest interaction (external perturbation), which would always be present even with the best possible isolation of the system from its environment, would lead to a change in the microscopic state without affecting the macroscopic properties.

For the following discussion, we need to define two concepts.

The microstate: it is defined by the wavefunction of the system in quantum mechanics, or by all the coordinates and momenta of the system in classical physics.

The macrostate: this is characterized by a few macroscopic quantities (energy, volume, . . .).

From the preceding considerations it follows that the state of a macroscopic system must be described statistically. The fact that the system passes through a distribution of microstates during a measurement requires that we characterize the macrostate by giving the probabilities for the occurrence of particular microstates. The collection of all the microstates which represent a macrostate, weighted by their frequency of occurrence, is referred to as a statistical ensemble.
Although the state of a macroscopic system is characterized by a statistical ensemble, the predictions of macroscopic quantities are precise. Their mean values and mean square deviations are both proportional to the number of particles N . The relative ﬂuctuations, i.e. the ratio of ﬂuctuations to mean values, tend towards zero in the thermodynamic limit (see (1.2.21c)).


1.2 A Brief Excursion into Probability Theory

At this point, we wish to collect a few basic mathematical definitions from probability theory, in order to derive the central limit theorem.²

1.2.1 Probability Density and Characteristic Functions

We first have to consider the meaning of the concept of a random variable. This refers to a quantity X which takes on values x depending upon the elements e of a "set of events" E. In each individual observation, the value of X is uncertain; instead, one knows only the probability for the occurrence of one of the possible results (events) from the set E. For example, in the case of an ideal die, the random variable is the number of spots, which can take on values between 1 and 6; each of these events has the probability 1/6. If we had precise knowledge of the initial position of the die and the forces acting on it during the throw, we could calculate the result from classical mechanics. Lacking such detailed information, we can make only the probability statement given above. Let e ∈ E be an event from the set E and P_e be its corresponding probability; then for a large number of attempts, N, the number of times N_e that the event e occurs is related to P_e by $\lim_{N \to \infty} N_e / N = P_e$.

Let X be a random variable. If the values x which X can assume are continuously distributed, we define the probability density of the random variable to be w(x). This means that w(x)dx is the probability that X assumes a value in the interval [x, x + dx]. The total probability must be one, i.e. w(x) is normalized to one:

$$ \int_{-\infty}^{+\infty} dx\, w(x) = 1 . $$    (1.2.1)

Definition 1: The mean value of X is defined by

$$ \langle X \rangle = \int_{-\infty}^{+\infty} dx\, w(x)\, x . $$    (1.2.2)

Now let F(X) be a function of the random variable X; one then calls F(X) a random function. Its mean value is defined corresponding to (1.2.2) by³

$$ \langle F(X) \rangle = \int dx\, w(x)\, F(x) . $$    (1.2.2′)

The powers of X have a particular importance: their mean values will be used to introduce the moments of the probability density.

² See e.g.: W. Feller, An Introduction to Probability Theory and its Applications, Vol. I (Wiley, New York 1968).

³ In the case that the limits of integration are not given, the integral is to be taken from −∞ to +∞. An analogous simplified notation will also be used for integrals over several variables.
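For the ideal die described in this section, the mean values (1.2.2) and (1.2.2′) reduce to finite sums over the six outcomes. The following minimal sketch evaluates them with exact fractions; the helper name `mean` and the choice F(X) = X² are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction

# Ideal die: values 1..6, each with probability 1/6
# (a discrete stand-in for the probability density w(x)).
values = range(1, 7)
prob = Fraction(1, 6)

def mean(func=lambda x: x):
    """<F(X)> = sum over outcomes of w(x) F(x), the discrete counterpart of (1.2.2')."""
    return sum(prob * func(x) for x in values)

total_probability = sum(prob for _ in values)  # normalization, cf. (1.2.1)
mean_x = mean()                                # <X>
mean_x2 = mean(lambda x: x * x)                # <X^2>

print(total_probability, mean_x, mean_x2)
```

Exact fractions are used so that the normalization condition can be checked without rounding.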


Definition 2: The nth moment of the probability density w(x) is defined as

$$ \mu_n = \langle X^n \rangle . $$    (1.2.3)

(The first moment of w(x) is simply the mean value of X.)

Definition 3: The mean square deviation (or variance) is defined by

$$ (\Delta x)^2 = \langle X^2 \rangle - \langle X \rangle^2 = \langle (X - \langle X \rangle)^2 \rangle . $$    (1.2.4)

Its square root is called the root-mean-square deviation or standard deviation.

Definition 4: Finally, we define the characteristic function:

$$ \chi(k) = \int dx\, e^{-ikx}\, w(x) \equiv \langle e^{-ikX} \rangle . $$    (1.2.5)

By taking its inverse Fourier transform, w(x) can be expressed in terms of χ(k):

$$ w(x) = \int \frac{dk}{2\pi}\, e^{ikx}\, \chi(k) . $$    (1.2.6)

Under the assumption that all the moments of the probability density w(x) exist, it follows from Eq. (1.2.5) that the characteristic function is

$$ \chi(k) = \sum_n \frac{(-ik)^n}{n!} \langle X^n \rangle . $$    (1.2.7)

If X has a discrete spectrum of values, i.e. the values ξ_1, ξ_2, . . . can occur with probabilities p_1, p_2, . . ., the probability density has the form

$$ w(x) = p_1 \delta(x - \xi_1) + p_2 \delta(x - \xi_2) + \ldots . $$    (1.2.8)

Often, the probability density will have discrete and continuous regions.

In the case of multidimensional systems (those with several components) X = (X_1, X_2, . . .), let x = (x_1, x_2, . . .) be the values taken on by X. Then the probability density (also called the joint probability density) is w(x) and it has the following significance: w(x)dx ≡ w(x) dx_1 dx_2 . . . dx_N is the probability of finding x in the hypercubic element x, x + dx. We will also use the term probability distribution or, for short, simply the distribution.

Definition 5: The mean value of a function F(X) of the random variables X is defined by

$$ \langle F(X) \rangle = \int dx\, w(x)\, F(x) . $$    (1.2.9)

Theorem: The probability density of F(X)

A function F of the random variables X is itself a random variable, which can take on the values f corresponding to a probability density w_F(f). The


probability density w_F(f) can be calculated from the probability density w(x). We assert that:

$$ w_F(f) = \langle \delta(F(X) - f) \rangle . $$    (1.2.10)

Proof: We express the probability density w_F(f) in terms of its characteristic function

$$ w_F(f) = \int \frac{dk}{2\pi}\, e^{ikf} \sum_n \frac{(-ik)^n}{n!} \langle F^n \rangle . $$

If we insert $\langle F^n \rangle = \int dx\, w(x)\, F(x)^n$, we find

$$ w_F(f) = \int \frac{dk}{2\pi}\, e^{ikf} \int dx\, w(x)\, e^{-ikF(x)} $$

and, after making use of the Fourier representation of the δ-function, $\delta(y) = \int \frac{dk}{2\pi}\, e^{iky}$, we finally obtain

$$ w_F(f) = \int dx\, w(x)\, \delta(f - F(x)) = \langle \delta(F(X) - f) \rangle , $$

i.e. Eq. (1.2.10).

Definition 6: For multidimensional systems we define correlations

$$ K_{ij} = \langle (X_i - \langle X_i \rangle)(X_j - \langle X_j \rangle) \rangle $$    (1.2.11)

of the random variables X_i and X_j. These indicate to what extent fluctuations (deviations from the mean value) of X_i and X_j are correlated. If the probability density has the form

$$ w(x) = w_i(x_i)\, w'(\{x_k, k \neq i\}) , $$

where $w'(\{x_k, k \neq i\})$ does not depend on x_i, then K_ij = 0 for j ≠ i, i.e. X_i and X_j are not correlated. In the special case

$$ w(x) = w_1(x_1) \cdots w_N(x_N) , $$

the stochastic variables X_1, . . . , X_N are completely uncorrelated.

Let P_n(x_1, . . . , x_{n−1}, x_n) be the probability density of the random variables X_1, . . . , X_{n−1}, X_n. Then the probability density for a subset of these random variables is given by integration of P_n over the range of values of the remaining random variables; e.g. the probability density P_{n−1}(x_1, . . . , x_{n−1}) for the random variables X_1, . . . , X_{n−1} is

$$ P_{n-1}(x_1, \ldots, x_{n-1}) = \int dx_n\, P_n(x_1, \ldots, x_{n-1}, x_n) . $$

Finally, we introduce the concept of conditional probability and the conditional probability density.


Definition 7: Let P_n(x_1, . . . , x_n) be the probability (density). The conditional probability (density) $P_{k|n-k}(x_1, \ldots, x_k | x_{k+1}, \ldots, x_n)$ is defined as the probability (density) of the random variables x_1, . . . , x_k, if the remaining variables x_{k+1}, . . . , x_n have given values. We find

$$ P_{k|n-k}(x_1, \ldots, x_k | x_{k+1}, \ldots, x_n) = \frac{P_n(x_1, \ldots, x_n)}{P_{n-k}(x_{k+1}, \ldots, x_n)} , $$    (1.2.12)

where $P_{n-k}(x_{k+1}, \ldots, x_n) = \int dx_1 \ldots dx_k\, P_n(x_1, \ldots, x_n)$.

Note concerning conditional probability: formula (1.2.12) is usually introduced as a definition in the mathematical literature, but it can be deduced in the following way, if one identifies the probabilities with statistical frequencies: P_n(x_1, . . . , x_k, x_{k+1}, . . . , x_n) for fixed x_{k+1}, . . . , x_n determines the frequencies of the x_1, . . . , x_k with given values of x_{k+1}, . . . , x_n. The probability density which corresponds to these frequencies is therefore proportional to P_n(x_1, . . . , x_k, x_{k+1}, . . . , x_n). Since $\int dx_1 \ldots dx_k\, P_n(x_1, \ldots, x_k, x_{k+1}, \ldots, x_n) = P_{n-k}(x_{k+1}, \ldots, x_n)$, the conditional probability density normalized to one is then

$$ P_{k|n-k}(x_1, \ldots, x_k | x_{k+1}, \ldots, x_n) = \frac{P_n(x_1, \ldots, x_n)}{P_{n-k}(x_{k+1}, \ldots, x_n)} . $$
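A discrete analogue may make formula (1.2.12) more concrete. The sketch below assumes a hypothetical joint distribution of two binary variables (the numbers are purely illustrative), sums over x1 to obtain the marginal distribution of x2, and divides to obtain the conditional probabilities.

```python
from fractions import Fraction

# Hypothetical joint distribution P2(x1, x2) of two binary variables;
# the numbers are illustrative and not taken from the text.
P2 = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 4),
}

def marginal_x2(joint):
    """P1(x2): sum the joint distribution over x1."""
    out = {}
    for (x1, x2), p in joint.items():
        out[x2] = out.get(x2, Fraction(0)) + p
    return out

def conditional_x1_given_x2(joint):
    """P(x1 | x2) = P2(x1, x2) / P1(x2), the discrete analogue of (1.2.12)."""
    p_x2 = marginal_x2(joint)
    return {(x1, x2): p / p_x2[x2] for (x1, x2), p in joint.items()}

cond = conditional_x1_given_x2(P2)
for key in sorted(cond):
    print(key, cond[key])
```

For each fixed value of x2, the conditional probabilities sum to one, which is the normalization argued for in the note above.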

1.2.2 The Central Limit Theorem

Let there be mutually independent random variables X_1, X_2, . . . , X_N which are characterized by common but independent probability distributions w(x_1), w(x_2), . . . , w(x_N). Suppose that the mean value and the variance of X_1, X_2, . . . , X_N exist. We require the probability density for the sum

$$ Y = X_1 + X_2 + \ldots + X_N $$    (1.2.13)

in the limit N → ∞. As we shall see, the probability density for Y is given by a Gaussian distribution. Examples of applications of this situation are

a) A system of non-interacting particles: X_i = energy of the i-th particle, Y = total energy of the system

b) The random walk: X_i = distance covered in the i-th step, Y = location after N steps.

In order to carry out the computation of the probability density of Y in a convenient way, it is expedient to introduce the random variable Z:

$$ Z = \sum_i \left( X_i - \langle X \rangle \right) / \sqrt{N} = \left( Y - N \langle X \rangle \right) / \sqrt{N} , $$    (1.2.14)

where ⟨X⟩ ≡ ⟨X_1⟩ = . . . = ⟨X_N⟩ by definition.


From (1.2.10), the probability density w_Z(z) of the random variable Z is given by

$$ w_Z(z) = \int dx_1 \ldots dx_N\, w(x_1) \ldots w(x_N)\, \delta\!\left( z - \frac{x_1 + \ldots + x_N}{\sqrt{N}} + \sqrt{N} \langle X \rangle \right) $$
$$ = \int \frac{dk}{2\pi}\, e^{ikz + ik\sqrt{N} \langle X \rangle} \int dx_1 \ldots dx_N\, w(x_1) \ldots w(x_N)\, e^{-ik(x_1 + \ldots + x_N)/\sqrt{N}} $$
$$ = \int \frac{dk}{2\pi}\, e^{ikz + ik\sqrt{N} \langle X \rangle} \left[ \chi\!\left( \frac{k}{\sqrt{N}} \right) \right]^N , $$    (1.2.15)

where χ(q) is the characteristic function of w(x). The representation (1.2.7) of the characteristic function in terms of the moments of the probability density can be reformulated by taking the logarithm of the expansion in moments,

$$ \chi(q) = \exp\left[ -iq \langle X \rangle - \frac{1}{2} q^2 (\Delta x)^2 + \ldots q^3 + \ldots \right] , $$    (1.2.16)

i.e. in general

$$ \chi(q) = \exp\left[ \sum_{n=1}^{\infty} \frac{(-iq)^n}{n!} C_n \right] . $$    (1.2.16′)

In contrast to (1.2.7), in (1.2.16′) the logarithm of the characteristic function is expanded in a power series. The expansion coefficients C_n which occur in this series are called cumulants of the nth order. They can be expressed in terms of the moments (1.2.3); the three lowest take on the forms:

$$ C_1 = \langle X \rangle = \mu_1 $$
$$ C_2 = (\Delta x)^2 = \langle X^2 \rangle - \langle X \rangle^2 = \mu_2 - \mu_1^2 $$    (1.2.17)
$$ C_3 = \langle X^3 \rangle - 3 \langle X^2 \rangle \langle X \rangle + 2 \langle X \rangle^3 = \mu_3 - 3\mu_1\mu_2 + 2\mu_1^3 . $$

The relations (1.2.17) between the cumulants and the moments can be obtained by expanding the exponential function in (1.2.16) or in (1.2.16′) and comparing the coefficients of the Taylor series with (1.2.7). Inserting (1.2.16) into (1.2.15) yields

$$ w_Z(z) = \int \frac{dk}{2\pi}\, e^{ikz - \frac{1}{2} k^2 (\Delta x)^2 + \ldots k^3 N^{-1/2} + \ldots} . $$    (1.2.18)

From this, neglecting the terms which vanish for large N as $1/\sqrt{N}$ or more rapidly, we obtain

$$ w_Z(z) = \left( 2\pi (\Delta x)^2 \right)^{-1/2} e^{-\frac{z^2}{2(\Delta x)^2}} $$    (1.2.19)

and finally, using w_Y(y)dy = w_Z(z)dz for the probability density of the random variable Y,

$$ w_Y(y) = \left( 2\pi N (\Delta x)^2 \right)^{-1/2} e^{-\frac{(y - \langle X \rangle N)^2}{2(\Delta x)^2 N}} . $$    (1.2.20)


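This is the central limit theorem: w_Y(y) is a Gaussian distribution, although we did not in any way assume that w(x) was such a distribution,

mean value:    $$ \langle Y \rangle = N \langle X \rangle $$    (1.2.21a)

standard deviation:    $$ \Delta y = \Delta x \sqrt{N} $$    (1.2.21b)

relative deviation:    $$ \frac{\Delta y}{\langle Y \rangle} = \frac{\Delta x \sqrt{N}}{N \langle X \rangle} = \frac{\Delta x}{\langle X \rangle \sqrt{N}} . $$    (1.2.21c)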

The central limit theorem provides the mathematical basis for the fact that in the limiting case of large N , predictions about Y become sharp. From (1.2.21c), the relative deviation, i.e. the ratio of the standard deviation to the mean value, approaches zero in the limit of large N .
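The sharpening of predictions with increasing N can be checked numerically. The following sketch assumes that each X_i is uniformly distributed on [0, 1), so that ⟨X⟩ = 1/2 and (Δx)² = 1/12; it estimates the relative deviation of the sum Y from repeated samples and compares it with (1.2.21c). The sample sizes, the seed and the function names are arbitrary illustrative choices.

```python
import math
import random
import statistics

def sample_sum(n_terms, rng):
    """Y = X_1 + ... + X_N with X_i uniform on [0, 1), i.e. not Gaussian."""
    return sum(rng.random() for _ in range(n_terms))

def relative_deviation(n_terms, n_samples=2000, seed=0):
    """Estimate (Delta y) / <Y> from repeated samples of Y."""
    rng = random.Random(seed)
    ys = [sample_sum(n_terms, rng) for _ in range(n_samples)]
    return statistics.pstdev(ys) / statistics.fmean(ys)

def predicted_relative_deviation(n_terms):
    """(1.2.21c) for the uniform distribution: <X> = 1/2, (Delta x)^2 = 1/12."""
    return math.sqrt(1.0 / 12.0) / (0.5 * math.sqrt(n_terms))

for n in (10, 100, 1000):
    print(n, relative_deviation(n), predicted_relative_deviation(n))
```

The estimated values should follow the predicted 1/√N decrease up to sampling noise, which shrinks as `n_samples` is increased.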

1.3 Ensembles in Classical Statistics

Although the correct theory in the atomic regime is based on quantum mechanics, and classical statistics can be derived from quantum statistics, it is more intuitive to develop classical statistics from the beginning, in parallel to quantum statistics. Later, we shall derive the classical distribution function within its range of validity from quantum statistics.

1.3.1 Phase Space and Distribution Functions

We consider N particles in three dimensions with coordinates q_1, . . . , q_3N and momenta p_1, . . . , p_3N. Let us define phase space, also called Γ space, as the space which is spanned by the 6N coordinates and momenta. A microscopic state is represented by a point in the Γ space and the motion of the overall system by a curve in phase space (Fig. 1.2), which is also termed a phase-space orbit or phase-space trajectory. As an example, we consider the one-dimensional harmonic oscillator

$$ q = q_0 \cos \omega t $$
$$ p = -m q_0 \omega \sin \omega t , $$    (1.3.1)

whose orbit in phase space is shown in Fig. 1.3. For large N , the phase space is a space of many dimensions. As a rule, our knowledge of such a system is not suﬃcient to determine its position in phase space. As already mentioned in the introductory section 1.1, a macrostate characterized by macroscopic values such as that of its energy E, volume V , number of particles N etc., can be generated equally well by any one of a large number of microstates, i.e. by a large number of points in phase space. Instead of singling out just one of these microstates arbitrarily, we consider all of them, i.e. an ensemble of systems which all represent one and the same macrostate but which contains all of the corresponding possible microstates.


Fig. 1.2. A trajectory in phase space. Here, q and p represent the 6N coordinates and momenta q1 , . . . , q3N and p1 , . . . , p3N .

Fig. 1.3. The phase-space orbit of the one-dimensional harmonic oscillator.
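The closed orbit of the harmonic oscillator can be checked numerically. This sketch evaluates (1.3.1) at a series of times and verifies that the energy is the same at every point, so that the points lie on an ellipse in the (q, p) plane. It assumes the standard oscillator Hamiltonian H = p²/2m + mω²q²/2, which is not written out in this section, and the parameter values are arbitrary.

```python
import math

def oscillator_state(t, q0=1.0, m=1.0, omega=2.0):
    """(q, p) at time t from Eq. (1.3.1)."""
    q = q0 * math.cos(omega * t)
    p = -m * q0 * omega * math.sin(omega * t)
    return q, p

def energy(q, p, m=1.0, omega=2.0):
    """Assumed Hamiltonian H = p^2/(2m) + m omega^2 q^2 / 2."""
    return p * p / (2.0 * m) + 0.5 * m * omega ** 2 * q * q

# Energy along the orbit, sampled at 50 equally spaced times.
energies = [energy(*oscillator_state(0.1 * k)) for k in range(50)]
print(min(energies), max(energies))
```

With these parameters the energy equals mω²q0²/2 = 2 at every sampled time, up to rounding.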

The weight with which a point (q, p) ≡ (q1 , . . . , q3N , p1 , . . . , p3N ) occurs at the time t is given by the probability density ρ(q, p, t). The introduction of this probability density is now not at all just an expression of our lack of knowledge of the detailed form of the microstates, but rather it has the following physical basis: every realistic macroscopic system, even with the best insulation from its surroundings, experiences an interaction with its environment. This interaction is to be sure so weak that it does not aﬀect the macroscopic properties of the system, i.e. the macrostate remains unchanged, but it induces the system to change its microstate again and again and thus causes it for example to pass through a distribution of microstates during a measurement process. These states, which are occupied during a short time interval, are collected together in the distribution ρ(q, p). This distribution thus describes not only the statistical properties of a ﬁctitious ensemble of many copies of the system considered in its diverse microstates, but also each individual system. Instead of considering the sequential stochastic series of these microstates in terms of time-averaged values, we can observe the simultaneous time development of the whole ensemble. It will be a major task in the following chapter to determine the distribution functions which correspond to particular physical situations. To this end, knowledge of the equation of motion which we derive in the next section will prove to be very important. For large N , we know only the probability distribution ρ(q, p, t). Here,

$$ \rho(q, p, t)\, dq\, dp \equiv \rho(q_1, \ldots, q_{3N}, p_1, \ldots, p_{3N}, t) \prod_{i=1}^{3N} dq_i\, dp_i $$    (1.3.2)

is the probability of finding a system of the ensemble (or the individual systems in the course of the observation) at time t within the phase-space volume element dq dp in the neighborhood of the point q, p in Γ space. ρ(q, p, t) is called the distribution function. It must be positive, ρ(q, p, t) ≥ 0, and normalizable. Here, q, p stand for the whole of the coordinates and momenta q_1, . . . , q_3N, p_1, . . . , p_3N.

1.3.2 The Liouville Equation

We now wish to determine the time dependence of ρ(q, p, t), beginning with the initial distribution W(q_0, p_0) at time t = 0 on the basis of the classical Hamiltonian H. We shall assume that the system is closed. The following results are however also valid when H contains time-dependent external forces. We first consider a system whose coordinates in phase space at t = 0 are q_0 and p_0. The associated trajectory in phase space, which follows from the Hamiltonian equations of motion, is denoted by q(t; q_0, p_0), p(t; q_0, p_0), with the initial values of the trajectories given here explicitly. For a single trajectory, the probability density of the coordinates q and the momenta p has the form

$$ \delta\big( q - q(t; q_0, p_0) \big)\, \delta\big( p - p(t; q_0, p_0) \big) . $$    (1.3.3)

Here, δ(k) ≡ δ(k_1) . . . δ(k_3N). The initial values are however in general not precisely known; instead, there is a distribution of values, W(q_0, p_0). In this case, the probability density in phase space at the time t is found by multiplication of (1.3.3) by W(q_0, p_0) and integration over the initial values:

$$ \rho(q, p, t) = \int dq_0 \int dp_0\, W(q_0, p_0)\, \delta\big( q - q(t; q_0, p_0) \big)\, \delta\big( p - p(t; q_0, p_0) \big) . $$    (1.3.3′)

We wish to derive an equation of motion for ρ(q, p, t). To this end, we use the Hamiltonian equations of motion

$$ \dot{q}_i = \frac{\partial H}{\partial p_i} , \quad \dot{p}_i = -\frac{\partial H}{\partial q_i} . $$    (1.3.4)

The velocity in phase space

$$ v = (\dot{q}, \dot{p}) = \left( \frac{\partial H}{\partial p} , -\frac{\partial H}{\partial q} \right) $$    (1.3.4′)

fulfills the equation

$$ \mathrm{div}\, v \equiv \sum_i \left( \frac{\partial \dot{q}_i}{\partial q_i} + \frac{\partial \dot{p}_i}{\partial p_i} \right) = \sum_i \left( \frac{\partial^2 H}{\partial q_i \partial p_i} - \frac{\partial^2 H}{\partial p_i \partial q_i} \right) = 0 . $$    (1.3.5)


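That is, the motion in phase space can be treated intuitively as the "flow" of an incompressible "fluid". Taking the time derivative of (1.3.3′), we find

$$ \frac{\partial \rho(q, p, t)}{\partial t} = -\sum_i \int dq_0 \int dp_0\, W(q_0, p_0) \left( \dot{q}_i(t; q_0, p_0) \frac{\partial}{\partial q_i} + \dot{p}_i(t; q_0, p_0) \frac{\partial}{\partial p_i} \right) \delta\big( q - q(t; q_0, p_0) \big)\, \delta\big( p - p(t; q_0, p_0) \big) . $$    (1.3.6)

Expressing the velocity in phase space in terms of (1.3.4), employing the δ-functions in (1.3.6), and finally using (1.3.3′) and (1.3.5), we obtain the following representations of the equation of motion for ρ(q, p, t):

$$ \frac{\partial \rho}{\partial t} = -\sum_i \left( \frac{\partial}{\partial q_i} \rho \dot{q}_i + \frac{\partial}{\partial p_i} \rho \dot{p}_i \right) $$
$$ = -\sum_i \left( \frac{\partial \rho}{\partial q_i} \dot{q}_i + \frac{\partial \rho}{\partial p_i} \dot{p}_i \right) $$    (1.3.7)
$$ = \sum_i \left( -\frac{\partial \rho}{\partial q_i} \frac{\partial H}{\partial p_i} + \frac{\partial \rho}{\partial p_i} \frac{\partial H}{\partial q_i} \right) . $$

Making use of the Poisson bracket notation⁴, the last line of Eq. (1.3.7) can also be written in the form

$$ \frac{\partial \rho}{\partial t} = -\{H, \rho\} . $$    (1.3.8)

This is the Liouville equation, the fundamental equation of motion of the classical distribution function ρ(q, p, t).

Additional remarks: We discuss some equivalent representations of the Liouville equation and their consequences.

(i) The first line of the series of equations (1.3.7) can be written in abbreviated form as an equation of continuity

$$ \frac{\partial \rho}{\partial t} = -\mathrm{div}\, v\rho . $$    (1.3.9)

One can imagine the motion of the ensemble in phase space to be like the flow of a fluid. Then (1.3.9) is the equation of continuity for the density and Eq. (1.3.5) shows that the fluid is incompressible.

⁴ $\{u, v\} \equiv \sum_i \left[ \frac{\partial u}{\partial p_i} \frac{\partial v}{\partial q_i} - \frac{\partial u}{\partial q_i} \frac{\partial v}{\partial p_i} \right]$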


(ii) We once more take up the analogy of motion in phase space to fluid hydrodynamics: in our previous discussion, we considered the density at a fixed point q, p in Γ space. However, we could also consider the motion from the point of view of an observer moving with the "flow", i.e. we could ask for the time dependence of ρ(q(t), p(t), t) (omitting the initial values of the coordinates, q_0 and p_0, for brevity). The second line of Eq. (1.3.7) can also be expressed in the form

$$ \frac{d}{dt} \rho\big( q(t), p(t), t \big) = 0 . $$    (1.3.10)

Hence, the distribution function is constant along a trajectory in phase space.

(iii) We now investigate the change of a volume element dΓ in phase space. At t = 0, let a number dN of representatives of the ensemble be uniformly distributed within a volume element dΓ_0. Owing to the motion in phase space, they occupy a volume dΓ at the time t. This means that the density ρ at t = 0 is given by dN/dΓ_0, while at time t, it is dN/dΓ. From (1.3.10), the equality of these two quantities follows, from which we find (Fig. 1.4) that their volumes are the same:

dΓ = dΓ_0 .    (1.3.11)

Equation (1.3.11) is known in mechanics as the Liouville theorem.⁵ There, it is calculated from the Jacobian with the aid of the theory of canonical transformations. Reversing this process, we can begin with Eq. (1.3.11) and derive Eq. (1.3.10) and the Liouville equation (1.3.8).

Fig. 1.4. The time dependence of an element in phase space; its volume remains constant.

⁵ L. D. Landau and E. M. Lifshitz, Course of Theoretical Physics I: Mechanics, Eq. (46.5), Pergamon Press (Oxford, London, Paris 1960)
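The conservation of phase-space volume can be illustrated with a system whose flow is known in closed form. The sketch below assumes a one-dimensional harmonic oscillator with H = p²/2m + mω²q²/2 (an illustrative choice with arbitrary parameter values), transports the corners of a small rectangular cell along the exact trajectories, and compares the enclosed area at several times. Because this particular flow is linear in (q_0, p_0), the image of the rectangle is again a polygon, so the shoelace formula gives its area exactly.

```python
import math

def flow(q0, p0, t, m=1.0, omega=1.5):
    """Exact trajectory of the assumed oscillator H = p^2/2m + m omega^2 q^2/2."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    q = q0 * c + p0 / (m * omega) * s
    p = p0 * c - m * omega * q0 * s
    return q, p

def polygon_area(points):
    """Shoelace formula for the area enclosed by a polygon in the (q, p) plane."""
    total = 0.0
    for (q1, p1), (q2, p2) in zip(points, points[1:] + points[:1]):
        total += q1 * p2 - q2 * p1
    return abs(total) / 2.0

# Initial cell dGamma_0: a rectangle of area 0.2 * 0.3 = 0.06.
cell0 = [(1.0, 0.0), (1.2, 0.0), (1.2, 0.3), (1.0, 0.3)]
area0 = polygon_area(cell0)
areas = [polygon_area([flow(q, p, t) for q, p in cell0]) for t in (0.5, 1.0, 2.0, 5.0)]
print(area0, areas)
```

The cell is sheared and rotated as time proceeds, but its area stays equal to that of the initial cell, in line with (1.3.11). For a nonlinear flow the edges of the cell would bend, and a finer polygon would be needed to approximate the area.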


1.4 Quantum Statistics

1.4.1 The Density Matrix for Pure and Mixed Ensembles⁶

The density matrix is of special importance in the formulation of quantum statistics; it can also be denoted by the terms 'statistical operator' and 'density operator'. Let a system be in the state |ψ⟩. The observable A in this state has the mean value or expectation value

$$ \langle A \rangle = \langle \psi | A | \psi \rangle . $$    (1.4.1)

The structure of the mean value makes it convenient to define the density matrix by

$$ \rho = | \psi \rangle \langle \psi | . $$    (1.4.2)

We then have:

$$ \langle A \rangle = \mathrm{Tr}(\rho A) $$    (1.4.3a)

$$ \mathrm{Tr}\, \rho = 1 , \quad \rho^2 = \rho , \quad \rho^\dagger = \rho . $$    (1.4.3b,c,d)

Here, the definition of the trace (Tr) is

$$ \mathrm{Tr}\, X = \sum_n \langle n | X | n \rangle , $$    (1.4.4)

where {|n⟩} is an arbitrary complete orthonormal basis system. Owing to

$$ \mathrm{Tr}\, X = \sum_n \sum_m \langle n | m \rangle \langle m | X | n \rangle = \sum_m \sum_n \langle m | X | n \rangle \langle n | m \rangle = \sum_m \langle m | X | m \rangle , $$

the trace is independent of the basis used.

n.b. Proofs of (1.4.3a–c):

$$ \mathrm{Tr}\, \rho A = \sum_n \langle n | \psi \rangle \langle \psi | A | n \rangle = \sum_n \langle \psi | A | n \rangle \langle n | \psi \rangle = \langle \psi | A | \psi \rangle , $$
$$ \mathrm{Tr}\, \rho = \mathrm{Tr}\, \rho \mathbb{1} = \langle \psi | \mathbb{1} | \psi \rangle = 1 , $$
$$ \rho^2 = | \psi \rangle \langle \psi | \psi \rangle \langle \psi | = | \psi \rangle \langle \psi | = \rho . $$

If the systems or objects under investigation are all in one and the same state |ψ⟩, we speak of a pure ensemble, or else we say that the systems are in a pure state.

⁶ See e.g. F. Schwabl, Quantum Mechanics, 3rd edition, Springer, Heidelberg, Berlin, New York 2002 (corrected printing 2005), Chap. 20. In the following, this textbook will be abbreviated as 'QM I'.


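Along with the statistical character which is inherent to quantum-mechanical systems, a statistical distribution of states can in addition be present in an ensemble. If an ensemble contains different states, we call it a mixed ensemble, a mixture, or we speak of a mixed state. We assume that the state |ψ_1⟩ occurs with the probability p_1, the state |ψ_i⟩ with the probability p_i, etc., with

$$ \sum_i p_i = 1 . $$

The mean value or expectation value of A is then

$$ \langle A \rangle = \sum_i p_i \langle \psi_i | A | \psi_i \rangle . $$    (1.4.5)

This mean value can also be represented in terms of the density matrix defined by

$$ \rho = \sum_i p_i | \psi_i \rangle \langle \psi_i | . $$    (1.4.6)

We find:

$$ \langle A \rangle = \mathrm{Tr}\, \rho A $$    (1.4.7a)

$$ \mathrm{Tr}\, \rho = 1 $$    (1.4.7b)

$$ \rho^2 \neq \rho \ \text{and} \ \mathrm{Tr}\, \rho^2 < 1 , \ \text{in the case that } p_i \neq 0 \text{ for more than one } i $$    (1.4.7c)

$$ \rho^\dagger = \rho . $$    (1.4.7d)

The derivations of these relations and further remarks about the density matrices of mixed ensembles will be given in Sect. 1.5.2.

1.4.2 The Von Neumann Equation

From the Schrödinger equation and its adjoint

$$ i\hbar \frac{\partial}{\partial t} | \psi, t \rangle = H | \psi, t \rangle , \quad -i\hbar \frac{\partial}{\partial t} \langle \psi, t | = \langle \psi, t | H , $$

it follows that

$$ i\hbar \frac{\partial}{\partial t} \rho = i\hbar \sum_i p_i \left( | \dot{\psi}_i \rangle \langle \psi_i | + | \psi_i \rangle \langle \dot{\psi}_i | \right) $$
$$ = \sum_i p_i \left( H | \psi_i \rangle \langle \psi_i | - | \psi_i \rangle \langle \psi_i | H \right) . $$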


From this, we find the von Neumann equation,

$$ \frac{\partial}{\partial t} \rho = -\frac{i}{\hbar} [H, \rho] ; $$    (1.4.8)

it is the quantum-mechanical equivalent of the Liouville equation. It describes the time dependence of the density matrix in the Schrödinger representation. It holds also for a time-dependent H. It should not be confused with the equation of motion of Heisenberg operators, which has a positive sign on the right-hand side.

The expectation value of an observable A is given by

$$ \langle A \rangle_t = \mathrm{Tr}\big( \rho(t) A \big) , $$    (1.4.9)

where ρ(t) is found by solving the von Neumann equation (1.4.8). The time dependence of the expectation value is referred to by the index t. We shall meet up with the von Neumann equation in the next chapter where we set up the equilibrium density matrices, and it is naturally of fundamental importance for all time-dependent processes.

We now treat the transformation to the Heisenberg representation. The formal solution of the Schrödinger equation has the form

$$ | \psi(t) \rangle = U(t, t_0) | \psi(t_0) \rangle , $$    (1.4.10)

where U(t, t_0) is a unitary operator and |ψ(t_0)⟩ is the initial state at the time t_0. From this we find the time dependence of the density matrix:

$$ \rho(t) = U(t, t_0)\, \rho(t_0)\, U(t, t_0)^\dagger . $$    (1.4.11)

(For a time-independent H, $U(t, t_0) = e^{-iH(t - t_0)/\hbar}$.) The expectation value of an observable A can be computed both in the Schrödinger representation and in the Heisenberg representation

$$ \langle A \rangle_t = \mathrm{Tr}\big( \rho(t) A \big) = \mathrm{Tr}\big( \rho(t_0)\, U(t, t_0)^\dagger A\, U(t, t_0) \big) = \mathrm{Tr}\big( \rho(t_0) A_H(t) \big) . $$    (1.4.12)

Here, $A_H(t) = U^\dagger(t, t_0)\, A\, U(t, t_0)$ is the operator in the Heisenberg representation. The density matrix ρ(t_0) in the Heisenberg representation is time-independent.
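The properties of pure and mixed density matrices, and the time dependence (1.4.11), can be checked on a two-level system. This minimal sketch uses plain nested lists as 2×2 complex matrices. The chosen states, the weights, the diagonal Hamiltonian, the value ħ = 1 and all function names are illustrative assumptions.

```python
import cmath

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dagger(a):
    """Hermitian conjugate."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def projector(psi):
    """|psi><psi| for a normalized state vector psi, cf. (1.4.2)."""
    return [[x * y.conjugate() for y in psi] for x in psi]

def mix(weights, states):
    """rho = sum_i p_i |psi_i><psi_i|, cf. (1.4.6)."""
    n = len(states[0])
    rho = [[0j] * n for _ in range(n)]
    for p, psi in zip(weights, states):
        proj = projector(psi)
        for i in range(n):
            for j in range(n):
                rho[i][j] += p * proj[i][j]
    return rho

def evolve(rho, energies, t, hbar=1.0):
    """rho(t) = U rho U^dagger, cf. (1.4.11), for H diagonal with the given energies."""
    u = [[cmath.exp(-1j * e * t / hbar) if i == j else 0j
          for j in range(len(energies))] for i, e in enumerate(energies)]
    return matmul(matmul(u, rho), dagger(u))

up = [1 + 0j, 0j]
plus = [2 ** -0.5 + 0j, 2 ** -0.5 + 0j]

rho_pure = projector(plus)
rho_mixed = mix([0.5, 0.5], [up, plus])

purity_pure = trace(matmul(rho_pure, rho_pure)).real     # Tr rho^2 for the pure state
purity_mixed = trace(matmul(rho_mixed, rho_mixed)).real  # Tr rho^2 for the mixture
rho_t = evolve(rho_mixed, energies=[0.0, 1.0], t=0.7)
purity_t = trace(matmul(rho_t, rho_t)).real

print(purity_pure, purity_mixed, purity_t)
```

In this example Tr ρ² equals one for the pure state and is smaller than one for the mixture, as stated in (1.4.3c) and (1.4.7c). The unitary time evolution leaves both Tr ρ and Tr ρ² unchanged.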

∗1.5 Additional Remarks

∗1.5.1 The Binomial and the Poisson Distributions

We now discuss two probability distributions which occur frequently. Let us consider an interval of length $L$ which is divided into two subintervals $[0,a]$ and $[a,L]$. We now distribute $N$ distinguishable objects (‘particles’) in a completely random way over the two subintervals, so that the probability that a particle be found in the first or the second subinterval is given by $a/L$ or $1 - a/L$, respectively. The probability that $n$ particles are in the interval $[0,a]$ is then given by the binomial distribution⁷

$$w_n = \left(\frac{a}{L}\right)^{n}\left(1-\frac{a}{L}\right)^{N-n}\binom{N}{n} , \tag{1.5.1}$$

where the combinatorial factor $\binom{N}{n}$ gives the number of ways of choosing $n$ objects from a set of $N$. The mean value of $n$ is

$$\langle n\rangle = \sum_{n=0}^{N} n\,w_n = \frac{a}{L}\,N \tag{1.5.2a}$$

and its mean square deviation is

$$(\Delta n)^2 = \frac{a}{L}\left(1-\frac{a}{L}\right)N . \tag{1.5.2b}$$

We now consider the limiting case $L \gg a$. Initially, using $\binom{N}{n} = \frac{N\cdot(N-1)\cdots(N-n+1)}{n!}$, $w_n$ can be written in the form

$$w_n = \left(\frac{aN}{L}\right)^{n}\left(1-\frac{a}{L}\right)^{N-n}\frac{1}{n!}\cdot 1\cdot\left(1-\frac{1}{N}\right)\cdots\left(1-\frac{n-1}{N}\right) = \bar n^{\,n}\,\frac{1}{n!}\left(1-\frac{\bar n}{N}\right)^{N}\frac{1\cdot\left(1-\frac{1}{N}\right)\cdots\left(1-\frac{n-1}{N}\right)}{\left(1-\frac{a}{L}\right)^{n}} , \tag{1.5.3a}$$

where for the mean value (1.5.2a) we have introduced the abbreviation $\bar n = aN/L$. In the limit $a/L \to 0$, $N \to \infty$ for finite $\bar n$, the third factor in (1.5.3a) becomes $e^{-\bar n}$ and the last factor becomes equal to one, so that for the probability distribution we find:

$$w_n = \frac{\bar n^{\,n}}{n!}\,e^{-\bar n} . \tag{1.5.3b}$$

This is the Poisson distribution, which is shown schematically in Fig. 1.5. The Poisson distribution has the following properties:

$$\sum_n w_n = 1 , \qquad \langle n\rangle = \bar n , \qquad (\Delta n)^2 = \bar n . \tag{1.5.4a,b,c}$$

The first two relations follow immediately from the derivation of the Poisson distribution starting from the binomial distribution. They are obtained in problem 1.5, together with (1.5.4c), directly from (1.5.3b). The relative deviation is therefore

$$\frac{\Delta n}{\bar n} = \frac{1}{\bar n^{1/2}} . \tag{1.5.5}$$

Fig. 1.5. The Poisson distribution

⁷ A particular arrangement with $n$ particles in the interval $a$ and $N-n$ in $L-a$, e.g. the first particle in $a$, the second in $L-a$, the third in $L-a$, etc., has the probability $\left(\frac{a}{L}\right)^{n}\left(1-\frac{a}{L}\right)^{N-n}$. From this we obtain $w_n$ through multiplication by the number of combinations, i.e. the binomial coefficient $\binom{N}{n}$.

For numbers $\bar n$ which are not too large, e.g. $\bar n = 100$, we have $\Delta n = 10$ and $\Delta n/\bar n = 1/10$. For macroscopic systems, e.g. $\bar n = 10^{20}$, we have $\Delta n = 10^{10}$ and $\Delta n/\bar n = 10^{-10}$. The relative deviation becomes extremely small. For large $\bar n$, the distribution $w_n$ is highly concentrated around $\bar n$. The probability that no particles at all are within the subsystem, i.e. $w_0 = e^{-10^{20}}$, is vanishingly small. The number of particles in the subsystem $[0,a]$ is not fixed; however, its relative deviation is very small for macroscopic subsystems. In the figure below (Fig. 1.6a), the binomial distribution for $N = 5$ and $a/L = 3/10$ (and thus $\bar n = 1.5$) is shown and compared to the Poisson distribution for $\bar n = 1.5$; in b) the same is shown for $N = 10$, $a/L = 3/20$ (i.e. again $\bar n = 1.5$). Even with these small values of $N$, the Poisson distribution already approximates the binomial distribution rather well. With $N = 100$, the curves representing the binomial and the Poisson distributions would overlap completely.

Fig. 1.6. Comparison of the Poisson distribution and the binomial distribution
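The comparison between (1.5.1) and (1.5.3b) can be reproduced numerically. The following is a minimal sketch, not part of the original text; the function names are illustrative, and the largest pointwise difference over $0 \le n \le N$ is used here as one possible (assumed) measure of how closely the two distributions agree at fixed $\bar n = 1.5$.

```python
from math import comb, exp, factorial


def binomial_pmf(n, N, p):
    """Binomial probability w_n of Eq. (1.5.1), with p = a/L."""
    return comb(N, n) * p**n * (1 - p) ** (N - n)


def poisson_pmf(n, mean):
    """Poisson probability w_n of Eq. (1.5.3b), with mean = n-bar."""
    return mean**n * exp(-mean) / factorial(n)


def max_abs_difference(N, p):
    """Largest |binomial - Poisson| over n = 0..N, at equal mean N*p."""
    mean = N * p
    return max(
        abs(binomial_pmf(n, N, p) - poisson_pmf(n, mean)) for n in range(N + 1)
    )


if __name__ == "__main__":
    # The three cases mentioned in the text, all with n-bar = 1.5.
    for N, p in [(5, 0.3), (10, 0.15), (100, 0.015)]:
        print(N, p, max_abs_difference(N, p))
```

The printed differences shrink as $N$ grows at fixed $\bar n$, in line with the statement that the curves for $N = 100$ would practically coincide.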


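∗1.5.2 Mixed Ensembles and the Density Matrix of Subsystems

(i) Proofs of (1.4.7a–d)

$$\mathrm{Tr}\,\rho A = \sum_n\sum_i p_i\,\langle\psi_i|A|n\rangle\langle n|\psi_i\rangle = \sum_i p_i\,\langle\psi_i|A|\psi_i\rangle = \langle A\rangle .$$

From this, (1.4.7b) also follows using $A = 1$.

$$\rho^2 = \sum_i\sum_j p_i p_j\,|\psi_i\rangle\langle\psi_i|\psi_j\rangle\langle\psi_j| \neq \rho .$$

For arbitrary $|\psi\rangle$, the expectation value of $\rho$,

$$\langle\psi|\rho|\psi\rangle = \sum_i p_i\,|\langle\psi|\psi_i\rangle|^2 \ge 0 ,$$

is positive definite. Since $\rho$ is Hermitian, the eigenvalues $P_m$ of $\rho$ are positive and real:

$$\rho\,|m\rangle = P_m\,|m\rangle , \qquad \rho = \sum_{m=1}^{\infty} P_m\,|m\rangle\langle m| , \tag{1.5.6}$$

$$P_m \ge 0 , \qquad \sum_{m=1}^{\infty} P_m = 1 , \qquad \langle m|m'\rangle = \delta_{mm'} .$$

In this basis, $\rho^2 = \sum_m P_m^2\,|m\rangle\langle m|$ and, clearly, $\mathrm{Tr}\,\rho^2 = \sum_m P_m^2 < 1$, if more than only one state occurs. One can also derive (1.4.7c) directly from (1.4.6), with the condition that at least two different but not necessarily orthogonal states must occur in (1.4.6):

$$\mathrm{Tr}\,\rho^2 = \sum_n\sum_{i,j} p_i p_j\,\langle\psi_i|\psi_j\rangle\langle\psi_j|n\rangle\langle n|\psi_i\rangle = \sum_{i,j} p_i p_j\,|\langle\psi_i|\psi_j\rangle|^2$$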

, $\Delta x$, $\langle X^4\rangle$, and $\langle (X - \langle X\rangle)^3\rangle$.

1.15 The log-normal distribution: Let the statistical variable $X$ have the property that $\log X$ obeys a Gaussian distribution with $\langle \log X\rangle = \log x_0$. (a) Show by transforming the Gaussian distribution that the probability density for $X$ has the form

$$P(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\,\frac{1}{x}\,e^{-\frac{(\log(x/x_0))^2}{2\sigma^2}} , \qquad 0 < x < \infty .$$

(b) Show that $\langle X\rangle = x_0\,e^{\sigma^2/2}$ and $\langle \log X\rangle = \log x_0$. (c) Show that the log-normal distribution can be rewritten in the form

$$P(x) = \frac{1}{x_0\sqrt{2\pi\sigma^2}}\,(x/x_0)^{-1-\mu(x)} \qquad\text{with}\qquad \mu(x) = \frac{1}{2\sigma^2}\log\frac{x}{x_0} ;$$

it can thus be easily confused with a power law when analyzing data.
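The statements of problem 1.15 can be checked numerically before attempting the analytic proof. The sketch below is an illustration only; the function names, the midpoint rule, and the cutoff of 12 standard deviations are assumptions made here, not part of the problem. It evaluates the two forms of $P(x)$ and estimates $\langle X\rangle$ by integrating over $y = \log x$, using only the defining property that $\log X$ is Gaussian with mean $\log x_0$ and variance $\sigma^2$.

```python
from math import exp, log, pi, sqrt


def pdf_form_a(x, x0, sigma):
    """Log-normal density in the form of part (a)."""
    return exp(-log(x / x0) ** 2 / (2 * sigma**2)) / (sqrt(2 * pi * sigma**2) * x)


def pdf_form_c(x, x0, sigma):
    """Log-normal density in the 'power-law-like' form of part (c)."""
    mu = log(x / x0) / (2 * sigma**2)
    return (x / x0) ** (-1 - mu) / (x0 * sqrt(2 * pi * sigma**2))


def mean_of_x(x0, sigma, steps=20000, cutoff=12.0):
    """Midpoint-rule estimate of <X>, integrating e^y times the Gaussian in y."""
    lower = log(x0) - cutoff * sigma
    width = 2 * cutoff * sigma / steps
    total = 0.0
    for i in range(steps):
        y = lower + (i + 0.5) * width
        gauss = exp(-((y - log(x0)) ** 2) / (2 * sigma**2)) / sqrt(2 * pi * sigma**2)
        total += exp(y) * gauss * width
    return total


if __name__ == "__main__":
    x0, sigma = 2.0, 0.5
    print(mean_of_x(x0, sigma), x0 * exp(sigma**2 / 2))
    print(pdf_form_a(3.0, x0, sigma), pdf_form_c(3.0, x0, sigma))
```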

2. Equilibrium Ensembles

2.1 Introductory Remarks

As emphasized in the Introduction, a macroscopic system consists of $10^{19}$ to $10^{23}$ particles and correspondingly has an energy spectrum with spacings of $\Delta E \sim e^{-N}$. The attempt to find a detailed solution to the microscopic equations of motion of such a system is hopeless; furthermore, the required initial conditions or quantum numbers cannot even be specified. Fortunately, knowledge of the time development of such a microstate is also superfluous, since in each observation of the system (both of macroscopic quantities and of microscopic properties, e.g. the density correlation function, particle diffusion, etc.), one averages over a finite time interval. No system can be strictly isolated from its environment, and as a result it will undergo transitions into many different microstates during the measurement process. Figure 2.1 illustrates schematically how the system moves between various phase-space trajectories. Thus, a many-body system cannot be characterized by a single microstate, but rather by an ensemble of microstates. This statistical ensemble of microstates represents the macrostate which is specified by the macroscopic state variables $E, V, N, \ldots$¹ (see Fig. 2.1).

Fig. 2.1. A trajectory in phase space (schematic); axes: $q$ (horizontal), $p$ (vertical)

¹ A different justification of the statistical description is based on the ergodic theorem: nearly every microstate approaches arbitrarily closely to all the states of the corresponding ensemble in the course of time. This led Boltzmann to postulate that the time average for an isolated system is equal to the average over the states in the microcanonical ensemble (see Sect. 10.5.2).


Experience shows that every macroscopic system tends with the passage of time towards an equilibrium state, in which

$$\dot\rho = 0 = -\frac{i}{\hbar}\,[H,\rho] \tag{2.1.1}$$

must hold. Since, according to Eq. (2.1.1), in equilibrium the density matrix ρ commutes with the Hamiltonian H, it follows that in an equilibrium ensemble ρ can depend only on the conserved quantities. (The system changes its microscopic state continually even in equilibrium, but the distribution of microstates within the ensemble becomes time-independent.) Classically, the right-hand side of (2.1.1) is to be replaced by the Poisson bracket.

2.2 Microcanonical Ensembles

2.2.1 Microcanonical Distribution Functions and Density Matrices

We consider an isolated system with a fixed number of particles, a fixed volume $V$, and an energy lying within the interval $[E, E+\Delta]$ with a small $\Delta$, whose Hamiltonian is $H(q,p)$ (Fig. 2.2). Its total momentum and total angular momentum may be taken to be zero.

Fig. 2.2. Energy shell in phase space

We now wish to find the distribution function (density matrix) for this physical situation. It is clear from the outset that only those points in phase space which lie between the two hypersurfaces $H(q,p) = E$ and $H(q,p) = E+\Delta$ can have a finite statistical weight. The region of phase space between the hypersurfaces $H(q,p) = E$ and $H(q,p) = E+\Delta$ is called the energy shell. It is intuitively plausible that in equilibrium, no particular region of the energy shell should play a special role, i.e. that all points within the energy shell should have the same statistical weight. We can indeed derive this fact by making use of the conclusion following (2.1.1). If regions within the energy shell had different statistical weights, then the distribution function (density matrix) would depend on other quantities besides $H(q,p)$, and $\rho$ would not commute with $H$ (classically, the Poisson bracket would not vanish). Since for a given $E$, $\Delta$, $V$, and $N$, the equilibrium distribution function depends only upon $H(q,p)$, it follows that all states within the energy shell, i.e. all of the points in $\Gamma$ space with $E \le H(q,p) \le E+\Delta$, are equally probable. An ensemble with these properties is called a microcanonical ensemble. The associated microcanonical distribution function can be postulated to have the form

$$\rho_{MC} = \begin{cases} \dfrac{1}{\Omega(E)\Delta} & E \le H(q,p) \le E+\Delta \\[2mm] 0 & \text{otherwise} , \end{cases} \tag{2.2.1}$$

where, as postulated, the normalization constant $\Omega(E)$ depends only on $E$, but not on $q$ and $p$. $\Omega(E)\Delta$ is the volume of the energy shell.² In the limit $\Delta \to 0$, (2.2.1) becomes

$$\rho_{MC} = \frac{1}{\Omega(E)}\,\delta\big(E - H(q,p)\big) . \tag{2.2.1′}$$

The normalization of the probability density determines $\Omega(E)$:

$$\int\frac{dq\,dp}{h^{3N}N!}\,\rho_{MC} = 1 . \tag{2.2.2}$$

The mean value of a quantity $A$ is given by

$$\langle A\rangle = \int\frac{dq\,dp}{h^{3N}N!}\,\rho_{MC}\,A . \tag{2.2.3}$$

The choice of the fundamental integration variables (whether $q$ or $q/\mathrm{const}$) is arbitrary at the present stage of our considerations and was made in (2.2.2) and (2.2.3) by reference to the limit which is found from quantum statistics. If the factor $(h^{3N}N!)^{-1}$ were not present in the normalization condition (2.2.2) and in the mean value (2.2.3), then $\rho_{MC}$ would be replaced by $(h^{3N}N!)^{-1}\rho_{MC}$. All mean values would remain unchanged in this case; the difference however would appear in the entropy (Sect. 2.3). The factor $1/N!$ results from the indistinguishability of the particles. The necessity of including the factor $1/N!$ was discovered by Gibbs even before the development of quantum mechanics. Without this factor, an entropy of mixing of identical gases would erroneously appear (Gibbs’ paradox). That is, the sum of the entropies of two identical ideal gases each consisting of $N$ particles, $2S_N$, would be smaller than the entropy of one gas consisting of $2N$ particles. Mixing of ideal gases will be treated in Chap. 3, Sect. 3.6.3.4. We also refer to the calculation of the entropy of mixtures of ideal gases in Chap. 5 and the last paragraph of Appendix B.1.

² The surface area of the energy shell $\Omega(E)$ depends not only on the energy $E$ but also on the spatial volume $V$ and the number of particles $N$. For our present considerations, only its dependence on $E$ is of interest; therefore, for clarity and brevity, we omit the other variables. We use a similar abbreviated notation for the partition functions which will be introduced in later sections, also. The complete dependences are collected in Table 2.1.


For the $6N$-dimensional volume element in phase space, we will also use the abbreviated notation

$$d\Gamma \equiv \frac{dq\,dp}{h^{3N}N!} .$$

From the normalization condition, (2.2.2), and the limiting form given in (2.2.1′), it follows that

$$\Omega(E) = \int\frac{dq\,dp}{h^{3N}N!}\,\delta\big(E - H(q,p)\big) . \tag{2.2.4}$$

After introducing coordinates on the energy shell and an integration variable along the normal $k_\perp$, (2.2.4) can also be given in terms of the surface integral:

$$\Omega(E) = \int\frac{dS}{h^{3N}N!}\int dk_\perp\,\delta\big(E - H(S_E) - |\nabla H|\,k_\perp\big) = \int\frac{dS}{h^{3N}N!}\,\frac{1}{|\nabla H(q,p)|} . \tag{2.2.4′}$$

Here, $dS$ is the differential element of surface area in the $(6N-1)$-dimensional hypersurface at energy $E$, and $\nabla$ is the $6N$-dimensional gradient in phase space. In Eq. (2.2.4′) we have used $H(S_E) = E$ and performed the integration over $k_\perp$. According to Eq. (1.3.4′), it holds that $|\nabla H(q,p)| = |v|$ and the velocity in phase space is perpendicular to the gradient, i.e. $v \perp \nabla H(q,p)$. This implies that the velocity is always tangential to the surface of the energy shell; cf. problem 1.8.

Notes: (i) Alternatively, the expression (2.2.4′) can be readily proven by starting with an energy shell of finite width $\Delta$ and dividing it into segments $dS\,\Delta k_\perp$. Here, $dS$ is a surface element and $\Delta k_\perp$ is the perpendicular distance between the two hypersurfaces (Fig. 2.3). Since the gradient yields the variation perpendicular to an equipotential surface, we find $|\nabla H(q,p)|\,\Delta k_\perp = \Delta$, where $\nabla H(q,p)$ is to be computed on the hypersurface $H(q,p) = E$.

Fig. 2.3. Calculation of the volume of the energy shell


From this it follows that

$$\Omega(E)\Delta = \int\frac{dS}{h^{3N}N!}\,\Delta k_\perp = \int\frac{dS}{h^{3N}N!\,|\nabla H(q,p)|}\cdot\Delta ,$$

i.e. we again obtain (2.2.4′). (ii) Equation (2.2.4′) has an intuitively very clear significance. $\Omega(E)$ is given by the sum of the surface elements, each divided by the velocity in phase space. Regions with high velocity thus contribute less to $\Omega(E)$. In view of the ergodic hypothesis (see 10.5.2), this result is very plausible. See problem 1.8: $|v| = |\nabla H|$ and $v \perp \nabla H$.

As already mentioned, $\Omega(E)\Delta$ is the volume of the energy shell in classical statistical mechanics. We will occasionally also refer to $\Omega(E)$ as the “phase surface”. We also define the volume inside the energy shell:

$$\bar\Omega(E) = \int\frac{dq\,dp}{h^{3N}N!}\,\Theta\big(E - H(q,p)\big) . \tag{2.2.5}$$

Clearly, the following relation holds:

$$\Omega(E) = \frac{d\bar\Omega(E)}{dE} . \tag{2.2.6}$$

Quantum mechanically, the definition of the microcanonical ensemble for an isolated system with the Hamiltonian $H$ and associated energy eigenvalues $E_n$ is:

$$\rho_{MC} = \sum_n p(E_n)\,|n\rangle\langle n| , \tag{2.2.7}$$

where, analogously to (2.2.1),

$$p(E_n) = \begin{cases} \dfrac{1}{\Omega(E)\Delta} & E \le E_n \le E+\Delta \\[2mm] 0 & \text{otherwise} . \end{cases} \tag{2.2.8}$$

In the microcanonical density matrix $\rho_{MC}$, all the energy eigenstates $|n\rangle$ whose energy $E_n$ lies in the interval $[E, E+\Delta]$ contribute with equal weights. The normalization

$$\mathrm{Tr}\,\rho_{MC} = 1 \tag{2.2.9a}$$

yields

$$\Omega(E) = \frac{1}{\Delta}\sum_n 1 , \tag{2.2.9b}$$

where the summation is restricted to energy eigenstates within the energy shell. Thus $\Omega(E)\Delta$ is equal to the number of energy eigenstates within the energy shell $[E, E+\Delta]$. For the density matrix of the microcanonical ensemble, an abbreviated notation is also used:

$$\rho_{MC} = \Omega(E)^{-1}\,\delta(H - E) \tag{2.2.7′}$$


and

$$\Omega(E) = \mathrm{Tr}\,\delta(H - E) . \tag{2.2.9b′}$$

Equation (2.2.8) (and its classical analogue (2.2.1)) represent the fundamental hypothesis of equilibrium statistical mechanics. All the equilibrium properties of matter (whether isolated or in contact with its surroundings) can be deduced from them. The microcanonical density matrix describes an isolated system with given values of $E$, $V$, and $N$. The equilibrium density matrices corresponding to other typical physical situations, such as those of the canonical and the grand canonical ensembles, can be derived from it. As we shall see in the following examples, in fact essentially the whole volume within the hypersurface $H(q,p) = E$ lies at its surface. More precisely, comparison of $\bar\Omega(E)$ and $\Omega(E)\Delta$ shows that

$$\log\Omega(E)\Delta = \log\bar\Omega(E) + O\!\left(\log\frac{E}{N\Delta}\right) .$$

Since $\log\Omega(E)\Delta$ and $\log\bar\Omega(E)$ are both proportional to $N$, the remaining terms can be neglected for large $N$; in this spirit, we can write

$$\Omega(E)\Delta = \bar\Omega(E) .$$

2.2.2 The Classical Ideal Gas

In this and in the next section, we present three simple examples for which $\Omega(E)$ can be calculated, and from which we can directly read off the characteristic dependences on the energy and the particle number. We shall now investigate the classical ideal gas, i.e. a classical system of $N$ atoms between which there are no interactions at all; and we shall see from it how $\Omega(E)$ depends on the energy $E$ and on the particle number $N$. Furthermore, we will make use of the results of this section later to derive the thermodynamics of the ideal gas. The Hamiltonian of the three-dimensional ideal gas is

$$H = \sum_{i=1}^{N}\frac{p_i^2}{2m} + V_{\mathrm{wall}} . \tag{2.2.10}$$

Here, the $p_i$ are the cartesian momenta of the particles and $V_{\mathrm{wall}}$ is the potential representing the wall of the container. The surface area of the energy shell is in this case

$$\Omega(E) = \frac{1}{h^{3N}N!}\int_V d^3x_1\ldots\int_V d^3x_N\int d^3p_1\ldots\int d^3p_N\;\delta\!\left(E - \sum_{i=1}^{N}\frac{p_i^2}{2m}\right) , \tag{2.2.11}$$


where the integrations over $x$ are restricted to the spatial volume $V$ defined by the walls. It would be straightforward to calculate $\Omega(E)$ directly. We shall carry out this calculation here via $\bar\Omega(E)$, the volume inside the energy shell, which in this case is a hypersphere, in order to have both quantities available:

$$\bar\Omega(E) = \frac{1}{h^{3N}N!}\int_V d^3x_1\ldots\int_V d^3x_N\int d^3p_1\ldots\int d^3p_N\;\Theta\!\left(E - \sum_i p_i^2/2m\right) . \tag{2.2.12}$$

Introducing the surface area of the $d$-dimensional unit sphere,³

$$(2\pi)^d K_d \equiv \int d\Omega_d = \frac{2\pi^{d/2}}{\Gamma(d/2)} , \tag{2.2.13}$$

we find, representing the momenta in spherical polar coordinates,

$$\bar\Omega(E) = \frac{V^N}{h^{3N}N!}\int d\Omega_{3N}\int_0^{\sqrt{2mE}} dp\,p^{3N-1} .$$

From this, we immediately obtain

$$\bar\Omega(E) = \frac{V^N\,(2\pi mE)^{\frac{3N}{2}}}{h^{3N}\,N!\,\left(\frac{3N}{2}\right)!} , \tag{2.2.14}$$

where $\Gamma\!\left(\frac{3N}{2}\right) = \left(\frac{3N}{2}-1\right)!$ was used, under the assumption (without loss of generality) of an even number of particles. For large $N$, Eq. (2.2.14) can be simplified by applying the Stirling formula (see problem 1.1),

$$N! \sim N^N e^{-N}(2\pi N)^{1/2} , \tag{2.2.15}$$

whereby it suffices to retain only the first two factors, which dominate the expression. Then

$$\bar\Omega(E) \approx \left(\frac{V}{N}\right)^{N}\left(\frac{4\pi mE}{3h^2N}\right)^{\frac{3N}{2}} e^{\frac{5N}{2}} . \tag{2.2.16}$$

Making use of Eq. (2.2.6), we obtain from (2.2.14) and (2.2.16) the exact result for $\Omega(E)$:

$$\Omega(E) = \frac{V^N\,2\pi m\,(2\pi mE)^{\frac{3N}{2}-1}}{h^{3N}\,N!\,\left(\frac{3N}{2}-1\right)!} \tag{2.2.17}$$

as well as an asymptotic expression which is valid in the limit of large $N$:

$$\Omega(E) \approx \left(\frac{V}{N}\right)^{N}\left(\frac{4\pi mE}{3h^2N}\right)^{\frac{3N}{2}} e^{\frac{5N}{2}}\,\frac{1}{E}\,\frac{3N}{2} . \tag{2.2.18}$$

³ The derivation of (2.2.13) will be given at the end of this section.
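How well the Stirling form (2.2.16) reproduces the exact expression (2.2.14) can be checked on the level of logarithms. The following sketch is illustrative only: the function names are invented here, and units with $m = h = 1$ are an assumption made for convenience. The factorials in (2.2.14) are evaluated through the log-gamma function, so that large $N$ does not overflow.

```python
from math import lgamma, log, pi


def log_volume_exact(N, V, E, m=1.0, h=1.0):
    """log of Omega-bar(E) from Eq. (2.2.14); n! is computed as Gamma(n + 1)."""
    return (
        N * log(V)
        + 1.5 * N * log(2 * pi * m * E)
        - 3 * N * log(h)
        - lgamma(N + 1)
        - lgamma(1.5 * N + 1)
    )


def log_volume_stirling(N, V, E, m=1.0, h=1.0):
    """log of the asymptotic form, Eq. (2.2.16)."""
    return N * log(V / N) + 1.5 * N * log(4 * pi * m * E / (3 * h * h * N)) + 2.5 * N


if __name__ == "__main__":
    # Fixed specific volume V/N = 1 and specific energy E/N = 1.
    for N in (10, 1000, 100000):
        exact = log_volume_exact(N, float(N), float(N))
        approx = log_volume_stirling(N, float(N), float(N))
        print(N, exact, approx, (approx - exact) / exact)
```

The two logarithms differ only by terms of order $\log N$ (the factors $(2\pi N)^{1/2}$ dropped from the Stirling formula), so the relative difference tends to zero as $N$ grows.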

In (2.2.16) and (2.2.18), the specific volume $V/N$ and the specific energy $E/N$ occur to the power $N$. We now compare $\bar\Omega(E)$, the volume inside the energy shell, with $\Omega(E)\Delta$, the volume of a spherical shell of thickness $\Delta$, by considering the logarithms of these two quantities (due to the occurrence of the $N$th powers):

$$\log\Omega(E)\Delta = \log\bar\Omega(E) + O\!\left(\log\frac{E}{N\Delta}\right) . \tag{2.2.19}$$

Since $\log\Omega(E)\Delta$ and $\log\bar\Omega(E)$ are both proportional to $N$, the remaining terms can be neglected in the case that $N$ is large. In this approximation, we find

$$\Omega(E)\Delta \approx \bar\Omega(E) , \tag{2.2.20}$$

i.e. nearly the whole volume of the hypersphere $H(q,p) \le E$ lies at its surface. This fact is due to the high dimensionality of the phase space, and it is to be expected that (2.2.20) remains valid even for systems with interactions.

We now prove the expression (2.2.13) for the surface area of the $d$-dimensional unit sphere. To this end, we compute the $d$-dimensional Gaussian integral

$$I = \int_{-\infty}^{\infty} dp_1\ldots\int_{-\infty}^{\infty} dp_d\;e^{-(p_1^2+\cdots+p_d^2)} = (\sqrt{\pi})^d . \tag{2.2.21}$$

This integral can also be written in spherical polar coordinates:⁴

$$I = \int_0^{\infty} dp\,p^{d-1}\int d\Omega_d\;e^{-p^2} = \frac{1}{2}\int_0^{\infty} dt\,t^{\frac{d}{2}-1}e^{-t}\int d\Omega_d = \frac{1}{2}\,\Gamma\!\left(\frac{d}{2}\right)\int d\Omega_d , \tag{2.2.22}$$

where

$$\Gamma(z) = \int_0^{\infty} dt\,t^{z-1}e^{-t} \tag{2.2.23}$$

is the gamma function. Comparison of the two expressions (2.2.21) and (2.2.22) yields

$$\int d\Omega_d = \frac{2\pi^{d/2}}{\Gamma(d/2)} . \tag{2.2.13′}$$
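Both geometric facts used in this section lend themselves to a quick numerical check. The sketch below is an illustration under stated assumptions: the function names are invented here, and the "outer layer" is taken to be the energy range $[E-\Delta, E]$ of the ideal-gas hypersphere, for which $\bar\Omega(E) \propto E^{3N/2}$ by (2.2.14). The first function evaluates (2.2.13′); the second gives the fraction of $\bar\Omega(E)$ contained in that outer layer, which illustrates (2.2.20).

```python
from math import expm1, gamma, log1p, pi


def unit_sphere_area(d):
    """Surface area of the d-dimensional unit sphere, Eq. (2.2.13')."""
    return 2 * pi ** (d / 2) / gamma(d / 2)


def fraction_in_outer_layer(N, delta_over_E):
    """Fraction of Omega-bar(E) with energy in [E - Delta, E].

    Uses Omega-bar proportional to E**(3N/2), so the fraction equals
    1 - (1 - Delta/E)**(3N/2); expm1/log1p keep this accurate for large N.
    """
    return -expm1(1.5 * N * log1p(-delta_over_E))


if __name__ == "__main__":
    print(unit_sphere_area(2), 2 * pi)  # circumference of the unit circle
    print(unit_sphere_area(3), 4 * pi)  # area of the ordinary unit sphere
    for N in (1, 10, 1000):
        print(N, fraction_in_outer_layer(N, 0.01))
```

Already for $N = 1000$ a layer of relative thickness one percent contains all but a fraction of order $10^{-7}$ of the volume; for macroscopic $N$ the effect is correspondingly more extreme.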

In order to gain further insights into how the volume of the energy shell depends upon the parameters of the microcanonical ensemble, we will calculate $\Omega(E)$ for two other simple examples, this time quantum-mechanical systems; these are: (i) harmonic oscillators which are not coupled, and (ii) paramagnetic (not coupled) spins. Simple problems of this type can be solved for all ensembles with a variety of methods. Instead of the usual combinatorial method, we employ purely analytical techniques for the two examples which follow.

⁴ We denote an element of surface area on the $d$-dimensional unit sphere by $d\Omega_d$. For the calculation of the surface integral $\int d\Omega_d$, it is not necessary to use the detailed expression for $d\Omega_d$. The latter may be found in E. Madelung, Die Mathematischen Hilfsmittel des Physikers, Springer, Berlin, 7th edition (1964), p. 244.

∗2.2.3 Quantum-mechanical Harmonic Oscillators and Spin Systems

∗2.2.3.1 Quantum-mechanical Harmonic Oscillators

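We consider a system of $N$ identical harmonic oscillators, which are either not coupled to each other at all, or else are so weakly coupled that their interactions may be neglected. Then the Hamiltonian for the system is given by:

$$H = \sum_{j=1}^{N}\hbar\omega\left(a_j^\dagger a_j + \frac{1}{2}\right) , \tag{2.2.24}$$

where $a_j^\dagger$ ($a_j$) are creation (annihilation) operators for the $j$th oscillator. Thus we have

$$\Omega(E) = \sum_{n_1=0}^{\infty}\cdots\sum_{n_N=0}^{\infty}\delta\Big(E - \hbar\omega\sum_j\big(n_j + \tfrac{1}{2}\big)\Big) = \sum_{n_1=0}^{\infty}\cdots\sum_{n_N=0}^{\infty}\int\frac{dk}{2\pi}\,e^{ik\left(E - \sum_j\hbar\omega(n_j+\frac{1}{2})\right)} = \int\frac{dk}{2\pi}\,e^{ikE}\prod_{i=1}^{N}\frac{e^{-ik\hbar\omega/2}}{1 - e^{-ik\hbar\omega}} , \tag{2.2.25}$$

and finally

$$\Omega(E) = \int\frac{dk}{2\pi}\,e^{N\left(ik(E/N) - \log(2i\sin(k\hbar\omega/2))\right)} . \tag{2.2.26}$$

The computation of this integral can be carried out for large $N$ using the saddle-point method.⁵ The function

$$f(k) = ike - \log\big(2i\sin(k\hbar\omega/2)\big) \tag{2.2.27}$$

with $e = E/N$ has a maximum at the point

$$k_0 = \frac{1}{\hbar\omega i}\log\frac{e + \frac{\hbar\omega}{2}}{e - \frac{\hbar\omega}{2}} . \tag{2.2.28}$$

This maximum can be determined by setting the first derivative of (2.2.27) equal to zero:

$$f'(k_0) = ie - \frac{\hbar\omega}{2}\cot\frac{k_0\hbar\omega}{2} = 0 .$$

Therefore, with

$$f(k_0) = ik_0e - \log\!\left(2i\Big/\sqrt{1 - (2e/\hbar\omega)^2}\right) = \frac{e}{\hbar\omega}\log\!\left(\frac{e + \frac{\hbar\omega}{2}}{e - \frac{\hbar\omega}{2}}\right) + \frac{1}{2}\log\!\left(\frac{e^2 - \left(\frac{\hbar\omega}{2}\right)^2}{(\hbar\omega)^2}\right) \tag{2.2.29}$$

and $f''(k_0) = \left(\frac{\hbar\omega}{2}\right)^2\!\Big/\sin^2(k_0\hbar\omega/2)$, we find for $\Omega(E)$:

$$\Omega(E) = \frac{1}{2\pi}\,e^{Nf(k_0)}\int dk\;e^{N\frac{1}{2}f''(k_0)(k-k_0)^2} . \tag{2.2.30}$$

The integral in this expression yields only a factor proportional to $1/\sqrt{N}$, which contributes merely a term of order $\log N$ to the exponent; thus, the number of states is given by

$$\Omega(E) = \exp\left\{N\left[\frac{e + \frac{1}{2}\hbar\omega}{\hbar\omega}\log\frac{e + \frac{1}{2}\hbar\omega}{\hbar\omega} - \frac{e - \frac{1}{2}\hbar\omega}{\hbar\omega}\log\frac{e - \frac{1}{2}\hbar\omega}{\hbar\omega}\right]\right\} . \tag{2.2.31}$$

⁵ N.G. de Bruijn, Asymptotic Methods in Analysis, (North Holland, 1970); P. M. Morse and H. Feshbach, Methods of Theoretical Physics, p. 434, (McGraw Hill, New York, 1953).

∗2.2.3.2 Two-level Systems: the Spin-½ Paramagnet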

As our third example, we consider a system of $N$ particles which can occupy one of two states. The most important physical realization of such a system is a paramagnet in a magnetic field $H$ ($h = -\mu_B H$), which has the Hamiltonian⁶

$$\mathcal{H} = -h\sum_{i=1}^{N}\sigma_i , \qquad\text{with}\qquad \sigma_i = \pm 1 . \tag{2.2.32}$$

The number of states of energy $E$ is, from (2.2.1), given by

$$\Omega(E) = \sum_{\{\sigma_i=\pm1\}}\delta\Big(E + h\sum_{i=1}^{N}\sigma_i\Big) = \int\frac{dk}{2\pi}\sum_{\{\sigma_i=\pm1\}} e^{ik(E + h\sum_i\sigma_i)} = \int\frac{dk}{2\pi}\,e^{ikE}(2\cos kh)^N = 2^N\int\frac{dk}{2\pi}\,e^{f(k)} \tag{2.2.33}$$

with

$$f(k) = ikE + N\log\cos kh . \tag{2.2.34}$$

The computation of the integral can again be accomplished by applying the saddle-point method. Using $f'(k) = iE - Nh\tan kh$ and $f''(k) = -Nh^2/\cos^2 kh$, we obtain from the condition $f'(k_0) = 0$

$$k_0h = \arctan\frac{iE}{Nh} = \frac{i}{2}\log\frac{1 + E/Nh}{1 - E/Nh} .$$

For the second derivative, we find

$$f''(k_0) = -\big(1 - (E/Nh)^2\big)Nh^2 \le 0 \qquad\text{for}\qquad -Nh \le E \le Nh .$$

Thus, using the abbreviation $e = E/Nh$, we have

$$\Omega(E) = 2^N\exp\!\left(-\frac{Ne}{2}\log\frac{1+e}{1-e} + N\log\frac{1}{\sqrt{1-e^2}}\right)\int\frac{dk}{2\pi}\,e^{-\frac{1}{2}(-f''(k_0))(k-k_0)^2}$$

$$= \frac{2^N}{\sqrt{2\pi}}\exp\!\left(-\frac{Ne}{2}\log\frac{1+e}{1-e} + \frac{N}{2}\log\frac{1}{1-e^2} - \frac{1}{2}\log\big((1-e^2)Nh^2\big)\right)$$

$$= \frac{1}{\sqrt{2\pi}}\exp\!\left\{-\frac{N}{2}(1+e)\log\frac{1+e}{2} - \frac{N}{2}(1-e)\log\frac{1-e}{2} - \frac{1}{2}\log(1-e^2) - \frac{1}{2}\log Nh^2\right\} ,$$

$$\Omega(E) = \exp\!\left\{-\frac{N}{2}\left[(1+e)\log\frac{1+e}{2} + (1-e)\log\frac{1-e}{2}\right] + O(1, \log N)\right\} . \tag{2.2.35}$$

⁶ In the literature of magnetism, it is usual to denote the magnetic field by $H$ or $\mathbf{H}$. To distinguish it from the Hamiltonian in the case of magnetic phenomena, we use the symbol $\mathcal{H}$ for the latter.

We have now calculated the number of states $\Omega(E)$ for three examples. The physical consequences of the characteristic energy dependences will be discussed after we have introduced additional concepts such as those of entropy and temperature.
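The saddle-point results (2.2.31) and (2.2.35) can be compared with the combinatorial counting that the text mentions as the usual alternative. The sketch below is not taken from the book; the function names are illustrative. It assumes the standard counting results: $M$ quanta can be distributed over $N$ oscillators in $\binom{M+N-1}{M}$ ways (with $E = \hbar\omega(M + N/2)$), and $n_\uparrow$ spins with $\sigma_i = +1$ can be chosen in $\binom{N}{n_\uparrow}$ ways (with $e = E/Nh = 1 - 2n_\uparrow/N$). Only the logarithms are compared, since the two approaches differ by factors of order a power of $N$.

```python
from math import lgamma, log


def log_binomial(n, k):
    """log of the binomial coefficient C(n, k) via log-gamma functions."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def oscillator_log_states_exact(N, M):
    """log of the number of ways to distribute M quanta over N oscillators."""
    return log_binomial(M + N - 1, M)


def oscillator_log_states_saddle(N, M):
    """Exponent of Eq. (2.2.31); e is the energy per oscillator in units of hbar*omega.

    Requires M > 0, since e - 1/2 = M/N appears inside a logarithm.
    """
    e = M / N + 0.5
    return N * ((e + 0.5) * log(e + 0.5) - (e - 0.5) * log(e - 0.5))


def spin_log_states_exact(N, n_up):
    """log of the number of spin configurations with n_up spins sigma = +1."""
    return log_binomial(N, n_up)


def spin_log_states_saddle(N, n_up):
    """Leading term of the exponent of Eq. (2.2.35), with e = E/(N h).

    Requires 0 < n_up < N so that both logarithms are finite.
    """
    e = 1.0 - 2.0 * n_up / N
    return -0.5 * N * ((1 + e) * log((1 + e) / 2) + (1 - e) * log((1 - e) / 2))


if __name__ == "__main__":
    N = 100000
    print(oscillator_log_states_exact(N, 2 * N), oscillator_log_states_saddle(N, 2 * N))
    print(spin_log_states_exact(N, 30000), spin_log_states_saddle(N, 30000))
```

In both cases the exact and the asymptotic logarithms agree up to corrections of order $\log N$, which is the accuracy claimed by the $O(1, \log N)$ term in (2.2.35).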

2.3 Entropy

2.3.1 General Definition

Let an arbitrary density matrix $\rho$ be given; then the entropy $S$ is defined by

$$S = -k\,\mathrm{Tr}\,(\rho\log\rho) \equiv -k\langle\log\rho\rangle . \tag{2.3.1}$$

Here, we give the formulas only in their quantum-mechanical form, as we shall often do in this book. For classical statistics, the trace operation Tr is to be read as an integration over phase space. The physical meaning of $S$ will become clear in the following sections. At this point, we can consider the entropy to be a measure of the size of the accessible part of phase space, and thus also of the uncertainty of the microscopic state of the system: the more states that occur in the density matrix, the greater the entropy $S$. For example, for $M$ states which occur with equal probabilities $\frac{1}{M}$, the entropy is given by

$$S = -k\sum_{1}^{M}\frac{1}{M}\log\frac{1}{M} = k\log M .$$


For a pure state, $M = 1$ and the entropy is therefore $S = 0$. In the diagonal representation of $\rho$ (Eq. 1.4.8), one can immediately see that the entropy is positive semidefinite:

$$S = -k\sum_n P_n\log P_n \ge 0 \tag{2.3.2}$$

since $x\log x \le 0$ in the interval $0 < x \le 1$ (see Fig. 2.4). The factor $k$ in (2.3.1) is at this stage completely arbitrary. Only later, by identifying the temperature scale with the absolute temperature, do we find that it is then given by the Boltzmann constant $k = 1.38\times10^{-16}$ erg/K $= 1.38\times10^{-23}$ J/K. See Sect. 3.4. The value of the Boltzmann constant was determined by Planck in 1900. The entropy is also a measure of the disorder and of the lack of information content in the density matrix. The more states contained in the density matrix, the smaller the weight of each individual state, and the less information about the system one has. Lower entropy means a higher information content. If for example a volume $V$ is available, but the particles remain within a subvolume, then the entropy is smaller than if they occupied the whole of $V$. Correspondingly, the information content ($\propto \mathrm{Tr}\,\rho\log\rho$) of the density matrix is greater, since one knows that the particles are not anywhere within $V$, but rather only in the subvolume.

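2.3.2 An Extremal Property of the Entropy

Let two density matrices, $\rho$ and $\rho_1$, be given. The important inequality

$$\mathrm{Tr}\,\rho\,(\log\rho_1 - \log\rho) \le 0 \tag{2.3.3}$$

then holds. To prove (2.3.3), we use the diagonal representations of $\rho = \sum_n P_n\,|n\rangle\langle n|$ and $\rho_1 = \sum_\nu P_{1\nu}\,|\nu\rangle\langle\nu|$:

$$\mathrm{Tr}\,\rho\,(\log\rho_1 - \log\rho) = \sum_n P_n\,\langle n|(\log\rho_1 - \log P_n)|n\rangle = \sum_n P_n\,\langle n|\log\frac{\rho_1}{P_n}|n\rangle = \sum_n\sum_\nu P_n\,\langle n|\nu\rangle\langle\nu|\log\frac{P_{1\nu}}{P_n}|\nu\rangle\langle\nu|n\rangle$$

$$\le \sum_n\sum_\nu P_n\,\langle n|\nu\rangle\langle\nu|\left(\frac{P_{1\nu}}{P_n} - 1\right)|\nu\rangle\langle\nu|n\rangle = \sum_n P_n\,\langle n|\left(\frac{\rho_1}{P_n} - 1\right)|n\rangle = \mathrm{Tr}\,\rho_1 - \mathrm{Tr}\,\rho = 0 .$$

In an intermediate step, we used the basis $|\nu\rangle$ of $\rho_1$ as well as the inequality $\log x \le x - 1$. This inequality is clear from Fig. 2.4. Formally, it follows from properties of the function $f(x) = \log x - x + 1$:

$$f(1) = 0 , \qquad f'(1) = 0 , \qquad f''(x) = -\frac{1}{x^2} < 0 \quad\text{(i.e. } f(x) \text{ is concave)} .$$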


Fig. 2.4. Illustrating the inequality log x ≤ x−1
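For the special case in which $\rho$ and $\rho_1$ commute, they can be diagonalized in the same basis and the inequality (2.3.3) reduces to a statement about two probability distributions $P_n$ and $P_{1n}$. The sketch below checks only this restricted, commuting case (an assumption made here to avoid matrix logarithms); the function name is illustrative.

```python
from math import log


def trace_rho_log_ratio(P, P1):
    """Tr rho (log rho1 - log rho) for commuting rho, rho1.

    P and P1 are the eigenvalues of rho and rho1 in their common eigenbasis.
    Terms with P_n = 0 are skipped; P1_n must be positive wherever P_n > 0.
    """
    return sum(p * (log(q) - log(p)) for p, q in zip(P, P1) if p > 0)


if __name__ == "__main__":
    P = [0.5, 0.3, 0.2]
    P1 = [0.2, 0.2, 0.6]
    print(trace_rho_log_ratio(P, P1))  # negative, as required by (2.3.3)
    print(trace_rho_log_ratio(P, P))   # equality holds for rho1 = rho
```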

2.3.3 Entropy of the Microcanonical Ensemble

For the entropy of the microcanonical ensemble, we obtain by referring to (2.3.1) and (2.2.7)

$$S_{MC} = -k\,\mathrm{Tr}\,\rho_{MC}\log\rho_{MC} = -k\,\mathrm{Tr}\,\rho_{MC}\log\frac{1}{\Omega(E)\Delta} ,$$

and, since the density matrix is normalized to 1, Eq. (2.2.9a), the final result:

$$S_{MC} = k\log\Omega(E)\Delta . \tag{2.3.4}$$

The entropy is thus proportional to the logarithm of the accessible phase space volume, or, quantum mechanically, to the logarithm of the number of accessible states. We shall now demonstrate an interesting extremal property of the entropy. Of all the ensembles whose energy lies in the interval $[E, E+\Delta]$, the entropy of the microcanonical ensemble is greatest. To prove this statement, we set $\rho_1 = \rho_{MC}$ in (2.3.3) and use the fact that $\rho$, like $\rho_{MC}$, differs from zero only on the energy shell:

$$S[\rho] \le -k\,\mathrm{Tr}\,\rho\log\rho_{MC} = -k\,\mathrm{Tr}\,\rho\log\frac{1}{\Omega(E)\Delta} = S_{MC} . \tag{2.3.5}$$

Thus, we have demonstrated that the entropy is maximal for the microcanonical ensemble. We note also that for large $N$, the following representations of the entropy are all equivalent:

$$S_{MC} = k\log\Omega(E)\Delta = k\log\Omega(E)E = k\log\bar\Omega(E) . \tag{2.3.6}$$

This follows from the neglect of logarithmic terms in (2.2.19) and an analogous relation for $\Omega(E)E$. We can now estimate the density of states. The spacing $\Delta E$ of the energy levels is given by

$$\Delta E = \frac{\Delta}{\Omega(E)\Delta} = \Delta\cdot e^{-S_{MC}/k} \sim \Delta\cdot e^{-N} . \tag{2.3.7}$$


The levels indeed lie enormously close together, i.e. at a high density, as already presumed in the Introduction. For this estimate, we used S = k log Ω (E)∆ ∝ N ; this can be seen from the classical results, (2.2.18) as well as (2.2.31) and (2.2.35).

2.4 Temperature and Pressure

The results for the microcanonical ensemble obtained thus far permit us to calculate the mean values of arbitrary operators. These mean values depend on the natural parameters of the microcanonical ensemble, $E$, $V$, and $N$. The temperature and pressure have so far not made an appearance. In this section, we want to define these quantities in terms of the energy and volume derivatives of the entropy.

2.4.1 Systems in Contact: the Energy Distribution Function, Definition of the Temperature

We now consider the following physical situation: let a system be divided into two subsystems, which interact with each other, i.e. exchange of energy between the two subsystems is possible. The overall system is isolated. The division into two subsystems 1 and 2 is not necessarily spatial. Let the Hamiltonian of the system be $H = H_1 + H_2 + W$. Let further the interaction $W$ be small in comparison to $H_1$ and $H_2$. For example, in the case of a spatial separation, the surface energy can be supposed to be small compared to the volume energy. The interaction is of fundamental importance, in that it allows the two subsystems to exchange energy. Let the overall system have the energy $E$, so that it is described by a microcanonical density matrix:

$$\rho_{MC} = \Omega_{1,2}(E)^{-1}\,\delta(H_1 + H_2 + W - E) \approx \Omega_{1,2}(E)^{-1}\,\delta(H_1 + H_2 - E) . \tag{2.4.1}$$

Here, $W$ was neglected relative to $H_1$ and $H_2$, and $\Omega_{1,2}(E)$ is the phase-space surface of the overall system with a dividing wall (see remarks at the end of this section).

Fig. 2.5. An isolated system divided into subsystems 1 and 2 separated by a ﬁxed diathermal wall (which permits the exchange of thermal energy)


$\omega(E_1)$ denotes the probability density for subsystem 1 to have the energy $E_1$. According to Eq. (1.2.10), $\omega(E_1)$ is given by

$$\omega(E_1) = \langle\delta(H_1 - E_1)\rangle = \int d\Gamma_1\,d\Gamma_2\;\Omega_{1,2}(E)^{-1}\,\delta(H_1 + H_2 - E)\,\delta(H_1 - E_1) = \frac{\Omega_2(E - E_1)\,\Omega_1(E_1)}{\Omega_{1,2}(E)} . \tag{2.4.2a}$$

Here, (2.4.1) was used and we have introduced the phase-space surfaces of subsystem 1, $\Omega_1(E_1) = \int d\Gamma_1\,\delta(H_1 - E_1)$, and subsystem 2, $\Omega_2(E - E_1) = \int d\Gamma_2\,\delta(H_2 - E + E_1)$. The most probable value of $E_1$, denoted as $\tilde E_1$, can be found from $\frac{d\omega(E_1)}{dE_1} = 0$:

$$\Big(-\Omega_2'(E - E_1)\,\Omega_1(E_1) + \Omega_2(E - E_1)\,\Omega_1'(E_1)\Big)\Big|_{\tilde E_1} = 0 .$$

Using formula (2.3.4) for the microcanonical entropy, we obtain

$$\frac{\partial}{\partial E_2}S_2(E_2)\Big|_{E - \tilde E_1} = \frac{\partial}{\partial E_1}S_1(E_1)\Big|_{\tilde E_1} . \tag{2.4.3}$$

We now introduce the following definition of the temperature:

$$T^{-1} = \frac{\partial}{\partial E}S(E) . \tag{2.4.4}$$

Then it follows from (2.4.3) that

$$T_1 = T_2 . \tag{2.4.5}$$

In the most probable configuration, the temperatures of the two subsystems are equal. We are already using partial derivatives here, since later, several variables will occur. For the ideal gas, we can see immediately that the temperature increases proportionally to the energy per particle, $T \propto E/N$. This property, as well as (2.4.5), the equality of the temperatures of two systems which are in contact and in equilibrium, correspond to the usual concept of temperature.

Remarks: The Hamiltonian has a lower bound and possesses a finite smallest eigenvalue $E_0$. In general, the Hamiltonian does not have an upper bound, and the density of the energy eigenvalues increases with increasing energy. As a result, the temperature cannot in general be negative ($T \ge 0$), and it increases with increasing energy. For spin systems there is also an upper limit to the energy. The density of states then again decreases as the upper limit is approached, so that in this energy range, $\Omega'/\Omega < 0$ holds. Thus in such systems there can be states with a negative absolute temperature (see Sect. 6.7.2). Due to the various possibilities for representing the entropy as given in (2.3.6), the temperature can also be written as $T = \left(k\,\frac{d}{dE}\log\bar\Omega(E)\right)^{-1}$.
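The proportionality $T \propto E/N$ for the ideal gas can be made explicit by applying the definition (2.4.4) to $S = k\log\bar\Omega(E)$ with $\bar\Omega(E)$ from (2.2.14). The sketch below is illustrative only: the function names, the units $m = h = 1$, $V = 1$, and the central-difference step are assumptions made here. The derivative is taken numerically, so the expected result $kT = 2E/3N$ is not built in.

```python
from math import lgamma, log, pi


def entropy_over_k(E, N, V=1.0, m=1.0, h=1.0):
    """S/k = log Omega-bar(E), with Omega-bar taken from Eq. (2.2.14)."""
    return (
        N * log(V)
        + 1.5 * N * log(2 * pi * m * E / h**2)
        - lgamma(N + 1)
        - lgamma(1.5 * N + 1)
    )


def k_times_temperature(E, N, rel_step=1e-4):
    """kT from Eq. (2.4.4), using a central difference for dS/dE."""
    dE = rel_step * E
    dS_over_k = entropy_over_k(E + dE, N) - entropy_over_k(E - dE, N)
    return 2 * dE / dS_over_k


if __name__ == "__main__":
    for E, N in [(1500.0, 1000), (3000.0, 1000), (3000.0, 2000)]:
        print(E, N, k_times_temperature(E, N), 2 * E / (3 * N))
```

Doubling the energy at fixed $N$ doubles $kT$, while doubling both $E$ and $N$ leaves it unchanged, as expected for an intensive quantity.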


Notes concerning $\Omega_{1,2}(E)$ in Eq. (2.4.1); may be skipped over in a first reading:

(i) In (2.4.1) and (2.4.2a), it must be taken into account that subsystems 1 and 2 are separated from each other. The normalization factor $\Omega_{1,2}(E)$ which occurs in (2.4.1) and (2.4.2a) is not given by

$$\int d\Gamma\,\delta(H - E) \equiv \int\frac{dq\,dp}{h^{3N}N!}\,\delta(H - E) \equiv \Omega(E) ,$$

but instead by

$$\Omega_{1,2}(E) = \int d\Gamma_1\,d\Gamma_2\,\delta(H - E) \equiv \int\frac{dq_1\,dp_1}{N_1!\,h^{3N_1}}\,\frac{dq_2\,dp_2}{N_2!\,h^{3N_2}}\,\delta(H - E)$$

$$= \int dE_1\int d\Gamma_1\,d\Gamma_2\,\delta(H - E)\,\delta(H_1 - E_1) = \int dE_1\int d\Gamma_1\,d\Gamma_2\,\delta(H_2 - E + E_1)\,\delta(H_1 - E_1) = \int dE_1\,\Omega_1(E_1)\,\Omega_2(E - E_1) . \tag{2.4.2b}$$

(ii) Quantum mechanically, one obtains the same result for (2.4.2a):

$$\omega(E_1) = \langle\delta(H_1 - E_1)\rangle \equiv \mathrm{Tr}\left(\frac{1}{\Omega_{1,2}(E)}\,\delta(H_1 + H_2 - E)\,\delta(H_1 - E_1)\right) = \mathrm{Tr}_1\,\mathrm{Tr}_2\left(\frac{1}{\Omega_{1,2}(E)}\,\delta\big(H_2 - (E - E_1)\big)\,\delta(H_1 - E_1)\right) = \frac{\Omega_1(E_1)\,\Omega_2(E - E_1)}{\Omega_{1,2}(E)}$$

and

$$\Omega_{1,2}(E) = \mathrm{Tr}\,\delta(H_1 + H_2 - E) \equiv \int dE_1\,\mathrm{Tr}\big(\delta(H_1 + H_2 - E)\,\delta(H_1 - E_1)\big) = \int dE_1\,\mathrm{Tr}\big(\delta(H_2 - E + E_1)\,\delta(H_1 - E_1)\big) = \int dE_1\,\Omega_1(E_1)\,\Omega_2(E - E_1) .$$

Here, we have used the fact that for the non-overlapping subsystems 1 and 2, the traces $\mathrm{Tr}_1$ and $\mathrm{Tr}_2$ taken over parts 1 and 2 are independent, and the states must be symmetrized (or antisymmetrized) only within the subsystems.

(iii) We recall that for quantum-mechanical particles which are in non-overlapping states (wavefunctions), the symmetrization (or antisymmetrization) has no effect on expectation values, and that therefore, in this situation, the symmetrization does not need to be carried out at all.⁷ More precisely: if one considers the matrix elements of operators which act only on subsystem 1, their values are the same independently of whether one takes the existence of subsystem 2 into account, or bases the calculation on the (anti-)symmetrized state of the overall system.

⁷ See e.g. G. Baym, Lectures on Quantum Mechanics (W.A. Benjamin, New York, Amsterdam 1969), p. 393


2.4.2 On the Widths of the Distribution Functions of Macroscopic Quantities

2.4.2.1 The Ideal Gas

For the ideal gas, from (2.2.18) one finds the following expression for the probability density of the energy $E_1$, Eq. (2.4.2a):

$$\omega(E_1) \propto (E_1/N_1)^{3N_1/2}\,(E_2/N_2)^{3N_2/2}. \tag{2.4.6}$$

In equilibrium, from the equality of the temperatures [Eq. (2.4.3)], i.e. from $\frac{\partial S(\tilde E_1)}{\partial E_1} = \frac{\partial S(\tilde E_2)}{\partial E_2}$, we obtain the condition $\frac{N_1}{\tilde E_1} = \frac{N_2}{E - \tilde E_1}$ and thus

$$\tilde E_1 = E\,\frac{N_1}{N_1 + N_2}. \tag{2.4.7}$$

If we expand the distribution function $\omega(E_1)$ around the most probable energy value $\tilde E_1$, using $\frac{d\omega(E_1)}{dE_1}\big|_{\tilde E_1} = 0$ and terminating the expansion after the quadratic term, we find

$$\log\omega(E_1) = \log\omega(\tilde E_1) + \frac{1}{2}\left(-\frac{3}{2}\frac{N_1}{\tilde E_1^2} - \frac{3}{2}\frac{N_2}{\tilde E_2^2}\right)(E_1 - \tilde E_1)^2,$$

and therefore

$$\omega(E_1) = \omega(\tilde E_1)\,e^{-\frac{3}{4}\frac{N_1 + N_2}{\tilde E_1 \tilde E_2}(E_1 - \tilde E_1)^2} = \omega(\tilde E_1)\,e^{-\frac{3}{4}\frac{N}{N_1 N_2\,\bar e^2}(E_1 - \tilde E_1)^2}, \tag{2.4.8}$$

where

$$\frac{N_1}{\tilde E_1^2} + \frac{N_2}{\tilde E_2^2} = \frac{N_2}{\tilde E_1 \tilde E_2} + \frac{N_1}{\tilde E_1 \tilde E_2} = \frac{N}{\tilde E_1 \tilde E_2}$$

and $\bar e = E/N$ were used. Here, $\log\omega(E_1)$ rather than $\omega(E_1)$ was expanded, because of the occurrence of the powers of the particle numbers $N_1$ and $N_2$ in Eq. (2.4.6). This is also preferable since it permits the coefficients of the Taylor expansion to be expressed in terms of derivatives of the entropy. From (2.4.8), we obtain the relative mean square deviation:

$$\frac{\langle(E_1 - \tilde E_1)^2\rangle}{\tilde E_1^2} = \frac{2}{3}\,\frac{\tilde E_1 \tilde E_2}{N\,\tilde E_1^2} = \frac{2}{3}\,\frac{1}{N_1}\,\frac{N_2}{N_1 + N_2} \approx 10^{-20} \tag{2.4.9}$$

and the relative width of the distribution, with $N_2 \approx N_1$,

$$\frac{\Delta E_1}{\tilde E_1} \sim \frac{1}{\sqrt{N}}. \tag{2.4.10}$$

For macroscopic systems, the distribution is very sharp. The most probable state occurs with a stupendously high probability. The sharpness of the distribution function becomes even more apparent if one expresses it in terms of the energy per particle, $e_1 = E_1/N_1$, including the normalization factor:

$$\omega_{e_1}(e_1) = \sqrt{\frac{3 N N_1}{4\pi N_2}}\;\bar e^{-1}\,e^{-\frac{3 N N_1}{4 N_2}\frac{(e_1 - \tilde e_1)^2}{\bar e^2}}.$$
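The quality of the Gaussian approximation (2.4.8) can be checked numerically. The following is a minimal sketch, not part of the original text: it evaluates the logarithm of the unnormalized distribution (2.4.6), the Gaussian exponent from (2.4.8), and the relative mean square deviation (2.4.9). The function names and the particle numbers in the demonstration are illustrative assumptions.

```python
import math


def relative_variance(n1: float, n2: float) -> float:
    """Relative mean square deviation of E1 as given in Eq. (2.4.9)."""
    return (2.0 / 3.0) * n2 / (n1 * (n1 + n2))


def log_omega(e1: float, e_total: float, n1: float, n2: float) -> float:
    """Logarithm of the unnormalized ideal-gas distribution, Eq. (2.4.6)."""
    e2 = e_total - e1
    return 1.5 * n1 * math.log(e1 / n1) + 1.5 * n2 * math.log(e2 / n2)


def gaussian_exponent(e1: float, e_total: float, n1: float, n2: float) -> float:
    """Exponent of the Gaussian approximation, Eq. (2.4.8)."""
    n = n1 + n2
    e_bar = e_total / n          # energy per particle
    e1_tilde = e_total * n1 / n  # most probable energy, Eq. (2.4.7)
    return -0.75 * n / (n1 * n2 * e_bar ** 2) * (e1 - e1_tilde) ** 2


if __name__ == "__main__":
    # Illustrative values only: two equal subsystems, energy in units of e_bar.
    n1 = n2 = 1000.0
    e_total = 2000.0
    e1_tilde = e_total * n1 / (n1 + n2)
    for e1 in (1000.0, 1010.0, 1050.0):
        exact = log_omega(e1, e_total, n1, n2) - log_omega(e1_tilde, e_total, n1, n2)
        print(e1, exact, gaussian_exponent(e1, e_total, n1, n2))
    print(relative_variance(1e20, 1e20))
```

Working with the logarithm mirrors the choice made in the text: the powers $3N_1/2$ and $3N_2/2$ would overflow floating-point numbers long before $N$ becomes macroscopic, while the logarithm stays of order $N$.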


2.4.2.2 A General Interacting System

For interacting systems, the following holds quite generally. Consider an arbitrary quantity $A$ which can be written as a volume integral over a density $A(x)$:

$$A = \int_V d^3x\,A(x). \tag{2.4.11}$$

Its average value depends on the volume as

$$\langle A\rangle = \int_V d^3x\,\langle A(x)\rangle \sim V. \tag{2.4.12}$$

The mean square deviation is given by

$$(\Delta A)^2 = \big\langle(A - \langle A\rangle)(A - \langle A\rangle)\big\rangle = \int_V d^3x \int_V d^3x'\,\big\langle\big(A(x) - \langle A(x)\rangle\big)\big(A(x') - \langle A(x')\rangle\big)\big\rangle \propto V\,l^3. \tag{2.4.13}$$

Both the integrals in (2.4.13) are to be taken over the volume $V$. The correlation function in the integral however vanishes for $|x - x'| > l$, where $l$ is the range of the interactions (the correlation length). The latter is finite and thus the mean square deviation is likewise only of the order of $V$ and not, as one might perhaps naively expect, quadratic in $V$. The relative deviation of $A$ is therefore given by

$$\frac{\Delta A}{\langle A\rangle} \sim \frac{1}{V^{1/2}}. \tag{2.4.14}$$

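2.4.3 External Parameters: Pressure

Let the Hamiltonian of a system depend upon an external parameter $a$: $H = H(a)$. This external parameter can for example be the volume $V$ of the system. Using the volume in phase space, $\bar\Omega$, we can derive an expression for the total differential of the entropy $dS$. Starting from the phase-space volume

$$\bar\Omega(E, a) = \int d\Gamma\,\Theta\big(E - H(a)\big), \tag{2.4.15}$$

we take its total differential

$$d\bar\Omega(E, a) = \int d\Gamma\,\delta\big(E - H(a)\big)\left(dE - \frac{\partial H}{\partial a}\,da\right) = \Omega(E, a)\left(dE - \left\langle\frac{\partial H}{\partial a}\right\rangle da\right), \tag{2.4.16}$$

or

$$d\log\bar\Omega = \frac{\Omega}{\bar\Omega}\left(dE - \left\langle\frac{\partial H}{\partial a}\right\rangle da\right). \tag{2.4.17}$$

We now insert $S(E, a) = k\log\bar\Omega(E, a)$ and (2.4.4), obtaining

$$dS = \frac{1}{T}\left(dE - \left\langle\frac{\partial H}{\partial a}\right\rangle da\right). \tag{2.4.18}$$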

From (2.4.18), we can read off the partial derivatives of the entropy in terms of $E$ and $a$:$^8$

$$\left(\frac{\partial S}{\partial E}\right)_a = \frac{1}{T}\,;\qquad \left(\frac{\partial S}{\partial a}\right)_E = -\frac{1}{T}\left\langle\frac{\partial H}{\partial a}\right\rangle. \tag{2.4.19}$$

Introduction of the pressure (special case: $a = V$): After the preceding considerations, we can turn to the derivation of pressure within the framework of statistical mechanics. We refer to Fig. 2.6 as a guide to this procedure. A movable piston at a distance $L$ from the origin of the coordinate system permits variations in the volume $V = LA$, where $A$ is the cross-sectional area of the piston. The influence of the walls of the container is represented by a wall potential. Let the spatial coordinate of the $i$th particle in the direction perpendicular to the piston be $x_i$. Then the total wall potential is given by

$$V_{\rm wall} = \sum_{i=1}^{N} v(x_i - L). \tag{2.4.20}$$

Fig. 2.6. The deﬁnition of pressure

Here, $v(x_i - L)$ is equal to zero for $x_i < L$ and is very large for $x_i \ge L$, so that penetration of the wall by the gas particles is prevented. We then obtain for the force on the molecules

$$F = \sum_i F_i = \sum_i\left(-\frac{\partial v}{\partial x_i}\right) = \frac{\partial}{\partial L}\sum_i v(x_i - L) = \frac{\partial H}{\partial L}. \tag{2.4.21}$$

$^8$ The symbol $\left(\frac{\partial S}{\partial E}\right)_a$ denotes the partial derivative of $S$ with respect to the energy $E$, holding $a$ constant, etc.

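The pressure is defined as the average force per unit area which the molecules exert upon the wall, from which we find using (2.4.21) that

$$P \equiv -\frac{\langle F\rangle}{A} = -\left\langle\frac{\partial H}{\partial V}\right\rangle. \tag{2.4.22}$$

In this case, the general relations (2.4.18) and (2.4.19) become

$$dS = \frac{1}{T}\,(dE + P\,dV) \tag{2.4.23}$$

and

$$\frac{1}{T} = \left(\frac{\partial S}{\partial E}\right)_V,\qquad P = T\left(\frac{\partial S}{\partial V}\right)_E. \tag{2.4.24}$$

Solving (2.4.23) for $dE$, we obtain

$$dE = T\,dS - P\,dV, \tag{2.4.25}$$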

a relation which we will later identify as the First Law of Thermodynamics [for a constant particle number; see Eqs. (3.1.3) and (3.1.3′)]. Comparison with phenomenological thermodynamics gives an additional justification for the identification of $T$ with the temperature. As a result of

$$-P\,dV = \frac{\langle F\rangle}{A}\,dV = \langle F\rangle\,dL \equiv \delta W,$$

the last term in (2.4.25) denotes the work $\delta W$ which is performed on the system causing the change in volume.

We are now interested in the pressure distribution in two subsystems, which are separated from each other by a movable partition, keeping the particle numbers in each subsystem constant (Fig. 2.6′). The energies and volumes are additive:

$$E = E_1 + E_2,\qquad V = V_1 + V_2. \tag{2.4.26}$$

The probability that subsystem 1 has the energy $E_1$ and the volume $V_1$ is given by

$$\begin{aligned}
\omega(E_1, V_1) &= \int d\Gamma_1\,d\Gamma_2\,\frac{\delta(H_1 + H_2 - E)}{\Omega_{1,2}(E, V)}\,\delta(H_1 - E_1)\,\Theta(q_1 \in V_1)\,\Theta(q_2 \in V_2)\\
&= \frac{\Omega_1(E_1, V_1)\,\Omega_2(E_2, V_2)}{\Omega_{1,2}(E, V)}.
\end{aligned} \tag{2.4.27a}$$

Fig. 2.6′. Two systems which are isolated from the external environment, separated by a movable wall which permits the exchange of energy.

In (2.4.27a), the function $\Theta(q_1 \in V_1)$ means that all the spatial coordinates of the sub-phase space 1 are limited to the volume $V_1$, and correspondingly for $\Theta(q_2 \in V_2)$. Here, both $E_1$ and $V_1$ are statistical variables, while in (2.4.2b), $V_1$ was a fixed parameter. Therefore, the normalization factor is given here by

$$\Omega_{1,2}(E, V) = \int dE_1 \int dV_1\,\Omega_1(E_1, V_1)\,\Omega_2(E - E_1, V - V_1). \tag{2.4.27b}$$

In analogy to (2.4.3), the most probable state of the two systems is found by the condition of vanishing derivatives of (2.4.27a):

$$\frac{\partial\omega(E_1, V_1)}{\partial E_1} = 0 \quad\text{and}\quad \frac{\partial\omega(E_1, V_1)}{\partial V_1} = 0.$$

From this, it follows that

$$\begin{aligned}
\frac{\partial}{\partial E_1}\log\Omega_1(E_1, V_1) &= \frac{\partial}{\partial E_2}\log\Omega_2(E_2, V_2) \;\Rightarrow\; T_1 = T_2\\
\text{and}\quad \frac{\partial}{\partial V_1}\log\Omega_1(E_1, V_1) &= \frac{\partial}{\partial V_2}\log\Omega_2(E_2, V_2) \;\Rightarrow\; P_1 = P_2.
\end{aligned} \tag{2.4.28}$$

In systems which are separated by a movable wall and can exchange energy, the equilibrium temperatures and pressures are equal. The microcanonical density matrix evidently depends on the energy $E$ and on the volume $V$, as well as on the particle number $N$. If we regard these parameters likewise as variables, then the overall variation of $S$ must be replaced by

$$dS = \frac{1}{T}\,dE + \frac{P}{T}\,dV - \frac{\mu}{T}\,dN. \tag{2.4.29}$$

Here, we have defined the chemical potential $\mu$ by

$$\frac{\mu}{T} = -k\,\frac{\partial}{\partial N}\log\Omega(E, V, N), \tag{2.4.30}$$

with the sign fixed by consistency with (2.4.29), i.e. $\mu/T = -(\partial S/\partial N)_{E,V}$. The chemical potential is related to the fractional change in the number of accessible states with respect to the change in the number of particles. Physically, its meaning is the change in energy per particle added to the system, as can be seen from (2.4.29) by solving that expression for $dE$.


2.5 Thermodynamic Properties of Some Non-interacting Systems

Now that we have introduced the thermodynamic concepts of temperature and pressure, we are in a position to discuss further the examples of a classical ideal gas, quantum-mechanical oscillators, and non-interacting spins treated in Sect. 2.2.2. In the following, we will derive the thermodynamic consequences of the phase-space surface or number of states $\Omega(E)$ which we calculated there for those examples.

2.5.1 The Ideal Gas

We first calculate the thermodynamic quantities introduced in the preceding sections for the case of an ideal gas. In (2.2.16), we found the phase-space volume in the limit of a large number of particles:

$$\bar\Omega(E) \equiv \int d\Gamma\,\Theta\big(E - H(q, p)\big) = \left(\frac{V}{N}\right)^{N}\left(\frac{4\pi m E}{3 N h^2}\right)^{\frac{3N}{2}} e^{\frac{5N}{2}}. \tag{2.2.16}$$

If we insert (2.2.16) into (2.3.6), we obtain the entropy as a function of the energy and the volume:

$$S(E, V) = kN\log\left[\frac{V}{N}\left(\frac{4\pi m E}{3 N h^2}\right)^{\frac{3}{2}} e^{\frac{5}{2}}\right]. \tag{2.5.1}$$

Eq. (2.5.1) is called the Sackur–Tetrode equation. It represents the starting point for the calculation of the temperature and the pressure. The temperature is, from (2.4.4), defined as the reciprocal of the partial energy derivative of the entropy, $T^{-1} = \left(\frac{\partial S}{\partial E}\right)_V = kN\,\frac{3}{2}\,E^{-1}$, from which the caloric equation of state of the ideal gas follows immediately:

$$E = \frac{3}{2}\,N k T. \tag{2.5.2}$$

With (2.5.2), we can also find the entropy (2.5.1) as a function of $T$ and $V$:

$$S(T, V) = kN\log\left[\frac{V}{N}\left(\frac{2\pi m k T}{h^2}\right)^{\frac{3}{2}} e^{\frac{5}{2}}\right]. \tag{2.5.3}$$

The pressure is obtained from (2.4.24) by taking the volume derivative of (2.5.1):

$$P = T\left(\frac{\partial S}{\partial V}\right)_E = \frac{kTN}{V}. \tag{2.5.4}$$

This is the thermal equation of state of the ideal gas, which is often written in the form

$$PV = NkT. \tag{2.5.4′}$$
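The two equations of state can be recovered numerically from the Sackur–Tetrode entropy by finite differences. The sketch below is an illustration under stated assumptions, not the author's procedure: it implements (2.5.1) in SI units and forms $T$ and $P$ from (2.4.24) with central differences. The function names and the step size `rel_step` are hypothetical choices.

```python
import math

K_B = 1.380649e-23         # Boltzmann constant in J/K (exact SI value)
H_PLANCK = 6.62607015e-34  # Planck constant in J s (exact SI value)


def entropy(e: float, v: float, n: float, m: float) -> float:
    """Sackur-Tetrode entropy S(E, V) of Eq. (2.5.1), written via logarithms."""
    log_arg = (math.log(v / n)
               + 1.5 * math.log(4.0 * math.pi * m * e / (3.0 * n * H_PLANCK ** 2)))
    return K_B * n * (log_arg + 2.5)


def temperature(e: float, v: float, n: float, m: float, rel_step: float = 1e-6) -> float:
    """T from 1/T = (dS/dE)_V, Eq. (2.4.24), using a central difference."""
    de = e * rel_step
    ds_de = (entropy(e + de, v, n, m) - entropy(e - de, v, n, m)) / (2.0 * de)
    return 1.0 / ds_de


def pressure(e: float, v: float, n: float, m: float, rel_step: float = 1e-6) -> float:
    """P = T (dS/dV)_E, Eq. (2.4.24), using a central difference."""
    dv = v * rel_step
    ds_dv = (entropy(e, v + dv, n, m) - entropy(e, v - dv, n, m)) / (2.0 * dv)
    return temperature(e, v, n, m) * ds_dv


if __name__ == "__main__":
    # Illustrative input: roughly one mole of helium atoms at 300 K.
    n, m, v = 6.022e23, 6.646e-27, 0.0246
    e = 1.5 * n * K_B * 300.0
    print(temperature(e, v, n, m), pressure(e, v, n, m) * v / (n * K_B))
```

Both printed numbers should come out close to 300 K if (2.5.2) and (2.5.4′) hold, which is what the finite-difference check is meant to confirm.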

The implications of the thermal equation of state are summarized in the diagrams of Fig. 2.7: Fig. 2.7a shows the P V T surface or surface of the equation of state, i.e. the pressure as a function of V and T . Figs. 2.7b,c,d are projections onto the P V -, the T V - and the P T -planes. In these diagrams, the isotherms (T = const), the isobars (P = const), and the isochores (V = const) are illustrated. These curves are also drawn in on the P V T surface (Fig. 2.7a). Remarks: (i) It can be seen from (2.5.2) that the temperature increases with the energy content of the ideal gas, in accord with the usual concept of temperature. (ii) The equation of state (2.5.4) also provides us with the possibility of measuring the temperature. The determination of the temperature of an ideal gas can be achieved by measuring its volume and its pressure.

Fig. 2.7. The equation of state of the ideal gas: (a) surface of the equation of state, (b) P -V diagram, (c) T -V diagram, (d) P -T diagram


The temperature of any given body can be determined by bringing it into thermal contact with an ideal gas and making use of the fact that the two temperatures will equalize [Eq. (2.4.5)]. The relative sizes of the two systems (body and thermometer) must of course be chosen so that contact with the ideal gas changes the temperature of the body being investigated by only a negligible amount.

∗2.5.2 Non-interacting Quantum Mechanical Harmonic Oscillators and Spins

2.5.2.1 Harmonic Oscillators

From (2.2.31) and (2.3.6), it follows for the entropy of non-coupled harmonic oscillators with $e = E/N$ that

$$S(E) = kN\left[\frac{e + \frac{1}{2}\hbar\omega}{\hbar\omega}\log\frac{e + \frac{1}{2}\hbar\omega}{\hbar\omega} - \frac{e - \frac{1}{2}\hbar\omega}{\hbar\omega}\log\frac{e - \frac{1}{2}\hbar\omega}{\hbar\omega}\right], \tag{2.5.5}$$

where a logarithmic term has been neglected. From Eq. (2.4.4), we obtain for the temperature

$$T = \left(\frac{\partial S}{\partial E}\right)^{-1} = \frac{\hbar\omega}{k}\left(\log\frac{e + \frac{1}{2}\hbar\omega}{e - \frac{1}{2}\hbar\omega}\right)^{-1}. \tag{2.5.6}$$

From this, it follows via $\frac{E + \frac{1}{2}N\hbar\omega}{E - \frac{1}{2}N\hbar\omega} = e^{\hbar\omega/kT}$ that the energy as a function of the temperature is given by

$$E = N\hbar\omega\left\{\frac{1}{2} + \frac{1}{e^{\hbar\omega/kT} - 1}\right\}. \tag{2.5.7}$$

The energy increases monotonically with the temperature (Fig. 2.8). Limiting cases: For $E \to N\hbar\omega/2$ (the minimal energy), we find

$$T \to \frac{1}{\log\infty} = 0, \tag{2.5.8a}$$

and for $E \to \infty$

$$T \to \frac{1}{\log 1} = \infty. \tag{2.5.8b}$$

We can also see that for $T \to 0$, the heat capacity tends to zero: $C_V = \left(\frac{\partial E}{\partial T}\right)_V \to 0$; this is in agreement with the Third Law of Thermodynamics.
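The relations (2.5.6) and (2.5.7) are inverses of each other, and the vanishing of the heat capacity at low temperature can be seen numerically. The following sketch is illustrative only; it assumes units in which $\hbar\omega = 1$ and $k = 1$, and the function names are not from the text.

```python
import math


def energy_per_oscillator(t: float) -> float:
    """E/N from Eq. (2.5.7) in units hbar*omega = 1, k = 1.

    1/(exp(x) - 1) is rewritten as exp(-x)/(1 - exp(-x)) so that very
    low temperatures (large x = 1/t) do not overflow.
    """
    x = 1.0 / t
    return 0.5 + math.exp(-x) / (-math.expm1(-x))


def temperature(e: float) -> float:
    """T from Eq. (2.5.6) for an energy per oscillator e > 1/2."""
    return 1.0 / math.log((e + 0.5) / (e - 0.5))


def heat_capacity_per_oscillator(t: float) -> float:
    """C_V/(N k): analytic temperature derivative of Eq. (2.5.7)."""
    x = 1.0 / t
    return x * x * math.exp(-x) / math.expm1(-x) ** 2


if __name__ == "__main__":
    for t in (0.05, 0.5, 1.0, 5.0, 100.0):
        e = energy_per_oscillator(t)
        print(t, e, temperature(e), heat_capacity_per_oscillator(t))
```

At high temperature the heat capacity per oscillator approaches $k$, the classical value expected from the equipartition theorem for one kinetic and one potential degree of freedom.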

Fig. 2.8. Non-coupled harmonic oscillators: the energy as a function of the temperature.

2.5.2.2 A Paramagnetic Spin-1/2 System

Finally, we consider a system of $N$ magnetic moments with spin $\frac{1}{2}$ which do not interact with each other; or, more generally, a system of non-interacting two-level systems. We refer here to Sect. 2.2.3.2. From (2.2.35), the entropy of such a system is given by

$$S(E) = \frac{kN}{2}\left\{-(1 + e)\log\frac{1 + e}{2} - (1 - e)\log\frac{1 - e}{2}\right\} \tag{2.5.9}$$

with $e = E/Nh$. From this, we find for the temperature:

$$T = \left(\frac{\partial S}{\partial E}\right)^{-1} = \frac{2h}{k}\left(\log\frac{1 - e}{1 + e}\right)^{-1}. \tag{2.5.10}$$

The entropy is shown as a function of the energy in Fig. 2.9, and the temperature as a function of the energy in Fig. 2.10. The ground-state energy is $E_0 = -Nh$. For $E \to -Nh$, we find from (2.5.10)

$$\lim_{E \to -Nh} T = 0. \tag{2.5.11}$$

The temperature increases monotonically with increasing energy, beginning at $E_0 = -Nh$, until $E = 0$ is reached; this is the state in which the magnetic moments are completely disordered, i.e. there are just as many oriented parallel as antiparallel to the applied magnetic field $h$. The region $E > 0$, in which the temperature is negative (!), will be discussed later in Sect. 6.7.2.
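The sign change of the temperature at $E = 0$ follows directly from (2.5.9) and (2.5.10). A minimal numerical sketch, assuming units in which the entropy is measured in $Nk$ and the temperature in $h/k$ (function names are illustrative):

```python
import math


def entropy_per_spin(e: float) -> float:
    """S/(N k) from Eq. (2.5.9), with e = E/(N h) and -1 < e < 1."""
    return -0.5 * ((1.0 + e) * math.log((1.0 + e) / 2.0)
                   + (1.0 - e) * math.log((1.0 - e) / 2.0))


def temperature(e: float) -> float:
    """k T / h from Eq. (2.5.10); not defined at e = 0, where 1/T = 0."""
    return 2.0 / math.log((1.0 - e) / (1.0 + e))


if __name__ == "__main__":
    for e in (-0.999, -0.5, -0.1, 0.1, 0.5, 0.999):
        print(e, entropy_per_spin(e), temperature(e))
```

The entropy is symmetric in $e$ with its maximum $k\log 2$ per spin at $e = 0$; its slope, and with it $1/T$, is positive for $e < 0$ and negative for $e > 0$.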

Fig. 2.9. The entropy as a function of the energy for a two-level system (spin-1/2 paramagnet)

Fig. 2.10. The temperature as a function of the energy for a two-level system (spin-1/2 paramagnet)


2.6 The Canonical Ensemble

In this section, the properties of a small subsystem 1 which is embedded in a large system 2, the heat bath,$^9$ will be investigated (Fig. 2.11). We first need to construct the density matrix, which we will derive from quantum mechanics in the following section. The overall system is taken to be isolated, so that it is described by a microcanonical ensemble.

Fig. 2.11. A canonical ensemble. Subsystem 1 is in contact with the heat bath 2. The overall system is isolated.

$^9$ A heat bath (or thermal reservoir) is a system which is so large that adding or subtracting a finite amount of energy to it does not change its temperature.

2.6.1 The Density Matrix

The Hamiltonian of the total system

$$H = H_1 + H_2 + W \approx H_1 + H_2 \tag{2.6.1}$$

is the sum of the Hamiltonians $H_1$ and $H_2$ for systems 1 and 2 and the interaction term $W$. The latter is in fact necessary so that the two subsystems can come to equilibrium with each other; however, $W$ is negligibly small compared to $H_1$ and $H_2$. Our goal is the derivation of the density matrix for subsystem 1 alone. We will give two derivations here, of which the second is shorter, but the first is more useful for the introduction of the grand canonical ensemble in the next section.

(i) Let $P_{E_{1n}}$ be the probability that subsystem 1 is in state $n$ with an energy eigenvalue $E_{1n}$. Then for $P_{E_{1n}}$, using the microcanonical distribution for the total system, we find

$$P_{E_{1n}} = \sum{}' \frac{1}{\Omega_{1,2}(E)\,\Delta} = \frac{\Omega_2(E - E_{1n})}{\Omega_{1,2}(E)}. \tag{2.6.2}$$

The sum runs over all the states of subsystem 2 whose energy $E_{2n}$ lies in the interval $E - E_{1n} \le E_{2n} \le E + \Delta - E_{1n}$. In the case that subsystem 1 is very much smaller than subsystem 2, we can expand the logarithm of $\Omega_2(E - E_{1n})$ in $E_{1n}$:

$$P_{E_{1n}} = \frac{\Omega_2(E - \tilde E_1 + \tilde E_1 - E_{1n})}{\Omega_{1,2}(E)} \approx \frac{\Omega_2(E - \tilde E_1)}{\Omega_{1,2}(E)}\,e^{(\tilde E_1 - E_{1n})/kT} = Z^{-1}\,e^{-E_{1n}/kT}. \tag{2.6.3}$$

This expression contains $T = \left(k\,\frac{\partial}{\partial E}\log\Omega_2(E - \tilde E_1)\right)^{-1}$, the temperature of the heat bath. The normalization factor $Z$, from (2.6.3), is given by

$$Z = \frac{\Omega_{1,2}(E)}{\Omega_2(\tilde E_2)}\,e^{-\tilde E_1/kT}. \tag{2.6.4}$$

However, it is important that $Z$ can be calculated directly from the properties of subsystem 1. The condition that the sum over all the $P_{E_{1n}}$ must be equal to 1 implies that

$$Z = \sum_n e^{-E_{1n}/kT} = \mathrm{Tr}_1\,e^{-H_1/kT}. \tag{2.6.5}$$

$Z$ is termed the partition function. The canonical density matrix is then given by the following equivalent representations:

$$\rho_C = \sum_n P_{E_{1n}}\,|n\rangle\langle n| = Z^{-1}\sum_n e^{-E_{1n}/kT}\,|n\rangle\langle n| = Z^{-1}\,e^{-H_1/kT}. \tag{2.6.6}$$

(ii) The second derivation starts with the fact that the density matrix $\rho$ for subsystem 1 can be obtained from the microcanonical density matrix by taking the trace over the degrees of freedom of system 2:

$$\begin{aligned}
\rho_C = \mathrm{Tr}_2\,\rho_{MC} &= \mathrm{Tr}_2\,\frac{\delta(H_1 + H_2 - E)}{\Omega_{1,2}(E)} = \frac{\Omega_2(E - H_1)}{\Omega_{1,2}(E)}\\
&\equiv \frac{\Omega_2(E - \tilde E_1 + \tilde E_1 - H_1)}{\Omega_{1,2}(E)} \approx \frac{\Omega_2(E - \tilde E_1)}{\Omega_{1,2}(E)}\,e^{(\tilde E_1 - H_1)/kT}.
\end{aligned} \tag{2.6.7}$$

This derivation is valid both in classical physics and in quantum mechanics, as is shown specifically in (2.6.9). Thus we have also demonstrated the validity of (2.6.6) with the definition (2.6.5) by this second route. Expectation values of observables $A$ which act only on the states of subsystem 1 are given by

$$\langle A\rangle = \mathrm{Tr}_1\,\mathrm{Tr}_2\,\rho_{MC}\,A = \mathrm{Tr}_1\,\rho_C\,A. \tag{2.6.8}$$
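For a subsystem with a finite set of energy eigenvalues, the probabilities (2.6.3), the partition function (2.6.5) and the averages (2.6.8) for observables diagonal in the energy basis reduce to simple sums. The following sketch illustrates this under the assumption of a hypothetical discrete level list; it is not taken from the text.

```python
import math


def canonical_probabilities(energies, kT):
    """Occupation probabilities P_n = exp(-E_n/kT) / Z, Eqs. (2.6.3) and (2.6.5).

    The lowest level is subtracted before exponentiating; this common
    factor cancels between numerator and Z and only avoids overflow.
    """
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / kT) for e in energies]
    z_shifted = sum(weights)
    return [w / z_shifted for w in weights]


def expectation(values, probabilities):
    """Average of an observable that is diagonal in the energy eigenbasis."""
    return sum(a * p for a, p in zip(values, probabilities))


if __name__ == "__main__":
    levels = [0.0, 1.0, 2.0]  # illustrative energy eigenvalues
    probs = canonical_probabilities(levels, kT=1.0)
    print(probs, sum(probs), expectation(levels, probs))
```

Only the temperature of the heat bath enters the calculation, in line with the remark that $Z$ can be computed from the properties of subsystem 1 alone.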

Remarks:

(i) The classical distribution function: The classical distribution function of subsystem 1 is obtained by integration of $\rho_{MC}$ over $\Gamma_2$:

$$\begin{aligned}
\rho_C(q_1, p_1) &= \int d\Gamma_2\,\rho_{MC} = \frac{1}{\Omega_{1,2}(E)}\int d\Gamma_2\,\delta\big(E - H_1(q_1, p_1) - H_2(q_2, p_2)\big)\\
&= \frac{\Omega_2\big(E - H_1(q_1, p_1)\big)}{\Omega_{1,2}(E)}.
\end{aligned} \tag{2.6.9}$$

If we expand the logarithm of this expression with respect to $H_1$, we obtain

$$\rho_C(q_1, p_1) = Z^{-1}\,e^{-H_1(q_1, p_1)/kT}, \tag{2.6.10a}$$

$$Z = \int d\Gamma_1\,e^{-H_1(q_1, p_1)/kT}. \tag{2.6.10b}$$

Here, $Z$ is called the partition function. Mean values of observables $A(q_1, p_1)$ which refer only to subsystem 1 are calculated in the classical case by means of

$$\langle A\rangle = \int d\Gamma_1\,\rho_C(q_1, p_1)\,A(q_1, p_1), \tag{2.6.10c}$$

as one finds analogously to (2.6.8).

(ii) The energy distribution: The energy distribution $\omega(E_1)$ introduced in Sect. 2.4.1 can also be calculated classically and quantum mechanically within the framework of the canonical ensemble (see problem 2.7):

$$\begin{aligned}
\omega(E_1) &= \frac{1}{\Delta_1}\int_{E_1}^{E_1 + \Delta_1} dE_1' \sum_n \delta(E_1' - E_{1n})\,P_{E_{1n}}\\
&\approx \frac{\Omega_2(E - E_1)}{\Omega_{1,2}(E)}\,\frac{1}{\Delta_1}\sum_n{}' 1 = \frac{\Omega_2(E - E_1)\,\Omega_1(E_1)}{\Omega_{1,2}(E)}.
\end{aligned} \tag{2.6.11}$$

This expression agrees with (2.4.2a).

(iii) The partition function (2.6.5) can also be written as follows:

$$\begin{aligned}
Z &= \int dE_1\,\mathrm{Tr}_1\,e^{-H_1/kT}\,\delta(H_1 - E_1) = \int dE_1\,\mathrm{Tr}_1\,e^{-E_1/kT}\,\delta(H_1 - E_1)\\
&= \int dE_1\,e^{-E_1/kT}\,\Omega_1(E_1).
\end{aligned} \tag{2.6.12}$$

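(iv) In the derivation of the canonical density matrix, Eq. (2.6.7), we expanded the logarithm of $\Omega_2(E - H_1)$. We show that it was justified to terminate this expansion after the first term of the Taylor series:

$$\begin{aligned}
\Omega_2(E - H_1) &= \Omega_2\big(E - \tilde E_1 - (H_1 - \tilde E_1)\big)\\
&= \Omega_2(E - \tilde E_1)\,\exp\left[-\frac{1}{kT}(H_1 - \tilde E_1) + \frac{1}{2k}\,\frac{\partial(1/T)}{\partial\tilde E_2}\,(H_1 - \tilde E_1)^2 + \dots\right]\\
&= \Omega_2(E - \tilde E_1)\,\exp\left[-\frac{1}{kT}(H_1 - \tilde E_1) - \frac{1}{2kT^2}\,\frac{\partial T}{\partial\tilde E_2}\,(H_1 - \tilde E_1)^2 + \dots\right]\\
&= \Omega_2(E - \tilde E_1)\,\exp\left[-\frac{1}{kT}(H_1 - \tilde E_1)\left(1 + \frac{1}{2TC}(H_1 - \tilde E_1) + \dots\right)\right],
\end{aligned}$$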

where $C$ is the heat capacity of the thermal bath. Since, owing to the large size of the thermal bath, $(H_1 - \tilde E_1) \ll TC$ holds (to be regarded as an inequality for the eigenvalues), it is in fact justified to ignore the higher-order corrections in the Taylor expansion.

(v) In later sections, we will be interested only in the (canonical) subsystem 1. The heat bath 2 enters merely through its temperature. We shall then leave off the index '1' from the relations derived in this section.

2.6.2 Examples: the Maxwell Distribution and the Barometric Pressure Formula

Suppose the subsystem to consist of one particle. The probability that its position and its momentum take on the values $x$ and $p$ is given by:

$$w(x, p)\,d^3x\,d^3p = C\,e^{-\beta\left(\frac{p^2}{2m} + V(x)\right)}\,d^3x\,d^3p. \tag{2.6.13}$$

Here, $\beta = \frac{1}{kT}$ and $V(x)$ refers to the potential energy, while $C = C'C''$ is a normalization factor$^{10}$. Integration over spatial coordinates gives the momentum distribution

$$w(p)\,d^3p = C'\,e^{-\beta\frac{p^2}{2m}}\,d^3p. \tag{2.6.14}$$

If we do not require the direction of the momentum, i.e. integrating over all angles, we obtain

$$w(p)\,dp = 4\pi C'\,e^{-\beta\frac{p^2}{2m}}\,p^2\,dp\,; \tag{2.6.15}$$

this is the Maxwell velocity distribution. Integration of (2.6.13) over the momentum gives the spatial distribution:

$$w(x)\,d^3x = C''\,e^{-\beta V(x)}\,d^3x. \tag{2.6.16}$$

If we now set the potential $V(x)$ equal to the gravitational field $V(x) = mgz$ and use the fact that the particle-number density is proportional to $w(x)$, we obtain [employing the equation of state for the ideal gas, (2.5.4′), which relates the pressure to the particle-number density] an expression for the altitude dependence of the pressure, the barometric pressure formula:

$$P(z) = P_0\,e^{-mgz/kT} \tag{2.6.17}$$

(cf. also problem 2.15).

$^{10}$ $C' = \left(\frac{\beta}{2\pi m}\right)^{3/2}$ and $C'' = \left(\int d^3x\,e^{-\beta V(x)}\right)^{-1}$
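Both examples lend themselves to a quick numerical check. The sketch below is illustrative and rests on stated assumptions: SI units, a hypothetical molecular mass close to that of N2, a default $g = 9.81\ \mathrm{m/s^2}$, and a simple midpoint rule for the integrals. It evaluates the barometric formula (2.6.17) and integrates the Maxwell distribution (2.6.15) to test its normalization and the mean kinetic energy.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def scale_height(mass: float, temperature: float, g: float = 9.81) -> float:
    """Height kT/(m g) over which P(z) of Eq. (2.6.17) drops by a factor e."""
    return K_B * temperature / (mass * g)


def barometric_pressure(z: float, p0: float, mass: float, temperature: float,
                        g: float = 9.81) -> float:
    """Barometric pressure formula, Eq. (2.6.17)."""
    return p0 * math.exp(-mass * g * z / (K_B * temperature))


def maxwell_density(p: float, mass: float, temperature: float) -> float:
    """w(p) of Eq. (2.6.15), with C' = (beta / (2 pi m))**(3/2) from footnote 10."""
    beta = 1.0 / (K_B * temperature)
    c_prime = (beta / (2.0 * math.pi * mass)) ** 1.5
    return 4.0 * math.pi * c_prime * p * p * math.exp(-beta * p * p / (2.0 * mass))


def integrate(f, a: float, b: float, steps: int = 20000) -> float:
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


if __name__ == "__main__":
    m, t = 4.65e-26, 300.0            # assumed values, roughly an N2 molecule
    p_th = math.sqrt(m * K_B * t)     # thermal momentum scale
    norm = integrate(lambda p: maxwell_density(p, m, t), 0.0, 12.0 * p_th)
    e_kin = integrate(lambda p: maxwell_density(p, m, t) * p * p / (2.0 * m),
                      0.0, 12.0 * p_th)
    print(scale_height(m, t), norm, e_kin / (K_B * t))
```

The ratio of the mean kinetic energy to $kT$ should come out close to 3/2, anticipating the equipartition theorem of Sect. 2.6.4.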


2.6.3 The Entropy of the Canonical Ensemble and Its Extremal Values

From Eq. (2.6.6), we find for the entropy of the canonical ensemble

$$S_C = -k\,\langle\log\rho_C\rangle = \frac{1}{T}\,\bar E + k\log Z \tag{2.6.18}$$

with

$$\bar E = \langle H\rangle. \tag{2.6.18′}$$

Now let $\rho$ correspond to a different distribution with the same average energy $\langle H\rangle = \bar E$; then the inequality

$$\begin{aligned}
S[\rho] = -k\,\mathrm{Tr}\,(\rho\log\rho) &\le -k\,\mathrm{Tr}\,(\rho\log\rho_C)\\
&= -k\,\mathrm{Tr}\left(\rho\left(-\frac{H}{kT} - \log Z\right)\right) = \frac{1}{T}\,\langle H\rangle + k\log Z = S_C
\end{aligned} \tag{2.6.19}$$

results. Here, the inequality in (2.3.3) was used along with $\rho_1 = \rho_C$. The canonical ensemble has the greatest entropy of all ensembles with the same average energy.

2.6.4 The Virial Theorem and the Equipartition Theorem

2.6.4.1 The Classical Virial Theorem and the Equipartition Theorem

Now, we consider a classical system and combine its momenta and spatial coordinates into $x_i = p_i, q_i$. For the average value of the quantity $x_i\frac{\partial H}{\partial x_j}$ we find the following relation:

$$\left\langle x_i\frac{\partial H}{\partial x_j}\right\rangle = Z^{-1}\int d\Gamma\,x_i\,\frac{\partial H}{\partial x_j}\,e^{-H/kT} = Z^{-1}\int d\Gamma\,x_i\,(-kT)\,\frac{\partial e^{-H/kT}}{\partial x_j} = kT\,\delta_{ij}, \tag{2.6.20}$$

where we have carried out an integration by parts. We have assumed that $\exp(-H(p, q)/kT)$ drops off rapidly enough for large $p$ and $q$ so that no boundary terms occur. This is the case for the kinetic energy and potentials such as those of harmonic oscillators. In the general case, one would have to take the wall potential into account. Eq. (2.6.20) contains the classical virial theorem as a special case, as well as the equipartition theorem. Applying (2.6.20) to the spatial coordinates $q_i$, we obtain the classical virial theorem

$$\left\langle q_i\frac{\partial V}{\partial q_j}\right\rangle = kT\,\delta_{ij}. \tag{2.6.21}$$


We now specialize to the case of harmonic oscillators, i.e.

$$V = \sum_i V_i \equiv \frac{m\omega^2}{2}\sum_i q_i^2. \tag{2.6.22}$$

For this case, it follows from (2.6.21) that

$$\langle V_i\rangle = \frac{kT}{2}. \tag{2.6.23}$$

The potential energy of each degree of freedom has the average value $kT/2$. Applying (2.6.20) to the momenta, we find the equipartition theorem. We take as the kinetic energy the generalized quadratic form

$$E_{\rm kin} = \sum_{i,k} a_{ik}\,p_i\,p_k, \quad\text{with } a_{ik} = a_{ki}. \tag{2.6.24}$$

For this form, we find $\frac{\partial E_{\rm kin}}{\partial p_i} = \sum_k (a_{ik}p_k + a_{ki}p_k) = \sum_k 2a_{ik}p_k$ and therewith, after multiplication by $p_i$ and summation over all $i$,

$$\sum_i p_i\,\frac{\partial E_{\rm kin}}{\partial p_i} = \sum_{i,k} 2a_{ik}\,p_i\,p_k = 2E_{\rm kin}. \tag{2.6.25}$$

Now we take the thermal average and find from (2.6.20)

$$\left\langle\sum_i p_i\,\frac{\partial H}{\partial p_i}\right\rangle = 2\,\langle E_{\rm kin}\rangle = 3NkT\,; \tag{2.6.26}$$

i.e. the equipartition theorem. The average kinetic energy per degree of freedom is equal to $\frac{1}{2}kT$. As previously mentioned, in the potential $V$, the interaction $\frac{1}{2}\sum_{m,n} v(|x_{mn}|)$ (with $x_{mn} = x_m - x_n$) of the particles with each other and in general their interaction with the wall, $V_{\rm wall}$, must be taken into account. Then using (2.6.23) and (2.6.25), we find

$$PV = \frac{2}{3}\,\langle E_{\rm kin}\rangle - \frac{1}{6}\left\langle\sum_{m,n}\frac{\partial v(|x_{mn}|)}{\partial x_{mn}}\cdot x_{mn}\right\rangle. \tag{2.6.27}$$

The term $PV$ results from the wall potential. The second term on the right-hand side is called the 'virial' and can be expanded in powers of $N/V$ (virial expansion, see Sect. 5.3).

∗Proof of (2.6.27): We begin with the Hamiltonian

$$H = \sum_n \frac{p_n^2}{2m} + \frac{1}{2}\sum_{n,m} v(x_n - x_m) + V_{\rm wall}, \tag{2.6.28}$$


Fig. 2.12. Quantities related to the wall potential and the pressure: increasing the volume on displacing a wall by δL1

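and write for the pressure, using (2.4.22):

$$PV = -\left\langle\frac{\partial H}{\partial V}\right\rangle V = -\left\langle\frac{\partial H}{\partial L_1}\right\rangle\frac{V}{L_2 L_3} = -\frac{1}{3}\left\langle L_1\frac{\partial H}{\partial L_1} + L_2\frac{\partial H}{\partial L_2} + L_3\frac{\partial H}{\partial L_3}\right\rangle. \tag{2.6.29}$$

Now, $V_{\rm wall}$ has the form (cf. Fig. 2.12)

$$V_{\rm wall} = V_\infty\sum_i\big\{\Theta(x_{i1} - L_1) + \Theta(x_{i2} - L_2) + \Theta(x_{i3} - L_3)\big\}. \tag{2.6.30}$$

Here, $V_\infty$ characterizes the barrier represented by the wall. The kinetic energy of the particles is much smaller than $V_\infty$. Evidently, $\frac{\partial V_{\rm wall}}{\partial L_1} = -V_\infty\sum_n \delta(x_{n1} - L_1)$ and therefore

$$\begin{aligned}
\left\langle\sum_n x_{n1}\frac{\partial V_{\rm wall}}{\partial x_{n1}}\right\rangle &= \left\langle\sum_n x_{n1}\,V_\infty\,\delta(x_{n1} - L_1)\right\rangle = L_1\left\langle\sum_n V_\infty\,\delta(x_{n1} - L_1)\right\rangle\\
&= -L_1\left\langle\frac{\partial V_{\rm wall}}{\partial L_1}\right\rangle = -L_1\left\langle\frac{\partial H}{\partial L_1}\right\rangle.
\end{aligned} \tag{2.6.31}$$

With this, (2.6.29) can be put into the form

$$\begin{aligned}
PV &= \frac{1}{3}\left\langle\sum_{n,\alpha} x_{n\alpha}\frac{\partial}{\partial x_{n\alpha}}V_{\rm wall}\right\rangle = kTN - \frac{1}{3}\left\langle\sum_{n,\alpha} x_{n\alpha}\frac{\partial}{\partial x_{n\alpha}}v\right\rangle\\
&= \frac{2}{3}\,\langle E_{\rm kin}\rangle - \frac{1}{6}\left\langle\sum_\alpha\sum_{n \ne m}(x_{n\alpha} - x_{m\alpha})\,\frac{\partial v}{\partial(x_{n\alpha} - x_{m\alpha})}\right\rangle.
\end{aligned} \tag{2.6.32}$$

In the first line, the virial theorem (2.6.21) was used, and we have abbreviated the sum of the pair potentials as $v$. In the second line, $kT$ was substituted by (2.6.26) and the derivative of the pair potentials was written out explicitly, whereby for example

$$\left(x_1\frac{\partial}{\partial x_1} + x_2\frac{\partial}{\partial x_2}\right)v(x_1 - x_2) = (x_1 - x_2)\,\frac{\partial v(x_1 - x_2)}{\partial(x_1 - x_2)}$$

was used, and $x_1$ ($x_2$) refers to the $x$ component of particle 1 (2). With (2.6.32), we have proven (2.6.27).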

∗2.6.4.2 The Quantum-Statistical Virial Theorem

Starting from the Hamiltonian

$$H = \sum_n \frac{p_n^2}{2m} + \sum_n V(x_n - x_{\rm wall}) + \frac{1}{2}\sum_{n,m} v(x_n - x_m), \tag{2.6.33}$$

it follows that$^{11}$

$$[H, x_n\cdot p_n] = -i\hbar\left(\frac{p_n^2}{m} - x_n\cdot\nabla_n V(x_n - x_{\rm wall}) - \sum_{m \ne n} x_n\cdot\nabla_n v(x_n - x_m)\right). \tag{2.6.34}$$

Now, $\langle\psi|\,[H, \sum_n x_n\cdot p_n]\,|\psi\rangle = 0$ for energy eigenstates. We assume the density matrix to be diagonal in the basis of the energy eigenstates; from this, it follows that

$$2\,\langle E_{\rm kin}\rangle - \left\langle\sum_n x_n\cdot\nabla_n V(x_n - x_{\rm wall})\right\rangle - \left\langle\sum_n\sum_{m \ne n} x_n\cdot\nabla_n v(x_n - x_m)\right\rangle = 0. \tag{2.6.35}$$

With (2.6.31), we again obtain the virial theorem immediately:

$$2\,\langle E_{\rm kin}\rangle - 3PV - \frac{1}{2}\left\langle\sum_n\sum_m (x_n - x_m)\cdot\nabla v(x_n - x_m)\right\rangle = 0. \tag{2.6.27}$$

Eq. (2.6.27) is called the virial theorem of quantum statistics. It holds both classically and quantum mechanically, while (2.6.21) and (2.6.26) are valid only classically. From the virial theorem (2.6.27), we find for ideal gases:

$$PV = \frac{2}{3}\,\langle E_{\rm kin}\rangle = \frac{2}{3}\left\langle\sum_n \frac{m}{2}\,v_n^2\right\rangle = \frac{1}{3}\,mN\,\langle v^2\rangle. \tag{2.6.36}$$

For non-interacting classical particles, the mean squared velocity per particle, $\langle v^2\rangle$, can be computed using the Maxwell velocity distribution; then from (2.6.36), one again obtains the well-known equation of state of the classical ideal gas.

$^{11}$ See e.g. QM I, p. 218.


2.6.5 Thermodynamic Quantities in the Canonical Ensemble

2.6.5.1 A Macroscopic System: The Equivalence of the Canonical and the Microcanonical Ensemble

We assume that the smaller subsystem is also a macroscopic system. Then it follows from the preceding considerations on the width of the energy distribution function $\omega(E_1)$ that the average value of the energy $\bar E_1$ is equal to the most probable value $\tilde E_1$, i.e.

$$\bar E_1 = \tilde E_1. \tag{2.6.37}$$

We now wish to investigate how statements about thermodynamic quantities in the microcanonical and the canonical ensembles are related. To this end, we rewrite the partition function (2.6.4) in the following manner:

$$Z = \frac{\Omega_{1,2}(E)}{\Omega_1(\tilde E_1)\,\Omega_2(E - \tilde E_1)}\,\Omega_1(\tilde E_1)\,e^{-\tilde E_1/kT} = \omega(\tilde E_1)^{-1}\,\Omega_1(\tilde E_1)\,e^{-\tilde E_1/kT}. \tag{2.6.38}$$

According to (2.4.8), the typical $N_1$-dependence of $\omega(E_1)$ is given by

$$\omega(E_1) \sim N_1^{-\frac{1}{2}}\,e^{-\frac{3}{4}(E_1 - \tilde E_1)^2/N_1\bar e^2}, \tag{2.6.39}$$

with the normalization factor determined by the condition $\int dE_1\,\omega(E_1) = 1$. From (2.4.14), the $N_1$-dependence takes the form of Eq. (2.6.39) even for interacting systems. We thus find from (2.6.38) that

$$Z = e^{-\tilde E_1/kT}\,\Omega_1(\tilde E_1)\,\sqrt{N_1}. \tag{2.6.40}$$

Inserting this result into Eq. (2.6.18), we obtain the following expression for the canonical entropy [using (2.6.37) and neglecting terms of the order of $\log N_1$]:

$$S_C = \frac{1}{T}\left(\bar E_1 - \tilde E_1 + kT\log\Omega_1(\tilde E_1)\right) = S_{MC}(\tilde E_1). \tag{2.6.41}$$

From (2.6.41) we can see that the entropy of the canonical ensemble is equal to that of a microcanonical ensemble with the energy $\tilde E_1$ ($= \bar E_1$). In both ensembles, one obtains identical results for the thermodynamic quantities.

2.6.5.2 Thermodynamic Quantities

We summarize here how various thermodynamic quantities can be calculated for the canonical ensemble. Since the heat bath enters only through its temperature $T$, we leave off the index 1 which indicates subsystem 1. Then for the canonical density matrix, we have

$$\rho_C = e^{-\beta H}/Z \tag{2.6.42}$$

with the partition function

$$Z = \mathrm{Tr}\,e^{-\beta H}, \tag{2.6.43}$$

where we have used the definition $\beta = \frac{1}{kT}$. We also define the free energy

$$F = -kT\log Z. \tag{2.6.44}$$

For the entropy, we obtain from (2.6.18)

$$S_C = \frac{1}{T}\left(\bar E + kT\log Z\right). \tag{2.6.45}$$

The average energy is given by

$$\bar E = \langle H\rangle = -\frac{\partial}{\partial\beta}\log Z = kT^2\,\frac{\partial}{\partial T}\log Z. \tag{2.6.46}$$

The pressure takes the form:

$$P = -\left\langle\frac{\partial H}{\partial V}\right\rangle = kT\,\frac{\partial\log Z}{\partial V}. \tag{2.6.47}$$

The derivation from Sect. 2.4.3, which gave $-\left\langle\frac{\partial H}{\partial V}\right\rangle$ for the pressure, is of course still valid for the canonical ensemble. From Eq. (2.6.45), it follows that

$$F = \bar E - T S_C. \tag{2.6.48}$$

Since the canonical density matrix contains $T$ and $V$ as parameters, $F$ is likewise a function of these quantities. Taking the total differential of (2.6.44) by applying (2.6.43), we obtain

$$\begin{aligned}
dF &= -k\,dT\,\log\mathrm{Tr}\,e^{-\beta H} - kT\,\frac{\mathrm{Tr}\left[\left(\frac{dT}{kT^2}\,H - \frac{1}{kT}\,\frac{\partial H}{\partial V}\,dV\right)e^{-\beta H}\right]}{\mathrm{Tr}\,e^{-\beta H}}\\
&= -\frac{1}{T}\left(\bar E + kT\log Z\right)dT + \left\langle\frac{\partial H}{\partial V}\right\rangle dV
\end{aligned}$$

and, with (2.6.45)–(2.6.47),

$$dF(T, V) = -S_C\,dT - P\,dV. \tag{2.6.49}$$

From Eqs. (2.6.48) and (2.6.49) we find

$$d\bar E = T\,dS_C - P\,dV. \tag{2.6.50a}$$

This relation corresponds to (2.4.25) in the microcanonical ensemble. In the limiting case of macroscopic systems, $\bar E = \tilde E = E$ and $S_C = S_{MC}$.
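The chain from $Z$ to $F$, $\bar E$ and $S_C$ in (2.6.43)–(2.6.48) can be traced numerically for a small system. The sketch below is an illustration under assumptions not made in the text: a hypothetical discrete spectrum, units with $k = 1$, and a central difference in $\beta$ for (2.6.46). It also compares (2.6.45) with the direct expression $-k\sum_n P_n\log P_n$.

```python
import math


def log_partition(energies, beta):
    """log Z = log sum_n exp(-beta E_n), Eq. (2.6.43), computed stably."""
    e_min = min(energies)
    return -beta * e_min + math.log(sum(math.exp(-beta * (e - e_min)) for e in energies))


def free_energy(energies, kT):
    """F = -kT log Z, Eq. (2.6.44)."""
    return -kT * log_partition(energies, 1.0 / kT)


def mean_energy(energies, kT, rel_step=1e-5):
    """E = -d(log Z)/d(beta), Eq. (2.6.46), via a central difference."""
    beta = 1.0 / kT
    d = beta * rel_step
    return -(log_partition(energies, beta + d) - log_partition(energies, beta - d)) / (2.0 * d)


def entropy(energies, kT):
    """S_C in units of k, from Eq. (2.6.45): E/(kT) + log Z."""
    return mean_energy(energies, kT) / kT + log_partition(energies, 1.0 / kT)


def gibbs_entropy(energies, kT):
    """-sum_n P_n log P_n with the canonical probabilities P_n."""
    beta = 1.0 / kT
    log_z = log_partition(energies, beta)
    probs = [math.exp(-beta * e - log_z) for e in energies]
    return -sum(p * math.log(p) for p in probs)


if __name__ == "__main__":
    levels = [0.0, 1.0, 3.0]  # illustrative spectrum, energies in units of kT scale
    kT = 0.7
    print(free_energy(levels, kT), mean_energy(levels, kT),
          entropy(levels, kT), gibbs_entropy(levels, kT))
```

Differentiating $\log Z$ rather than $Z$ itself keeps the computation well conditioned, since $\log Z$ is the quantity that is proportional to the system size.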


The First Law of thermodynamics expresses the energy balance. The most general change in the energy of a system with a fixed number of particles is composed of the work $\delta W = -P\,dV$ performed on the system together with the quantity of heat $\delta Q$ transferred to it:

$$dE = \delta Q + \delta W. \tag{2.6.50b}$$

Comparison with (2.6.50a) shows that the heat transferred is given by

$$\delta Q = T\,dS \tag{2.6.50c}$$

(this is the Second Law for transitions between equilibrium states). The temperature and the volume occur in the canonical partition function and in the free energy as natural variables. The partition function is calculated for a Hamiltonian with a fixed number of particles.$^{12}$ As in the case of the microcanonical ensemble, however, one can here also treat the partition function or the free energy, in which the particle number is a parameter, as a function of $N$. Then the total change in $F$ is given by

$$dF = -S_C\,dT - P\,dV + \left(\frac{\partial F}{\partial N}\right)_{T,V} dN, \tag{2.6.51}$$

and it follows from (2.6.48) that

$$d\bar E = T\,dS_C - P\,dV + \left(\frac{\partial F}{\partial N}\right)_{T,V} dN. \tag{2.6.52}$$

In the thermodynamic limit, (2.6.52) and (2.4.29) must agree, so that we find

$$\left(\frac{\partial F}{\partial N}\right)_{T,V} = \mu. \tag{2.6.53}$$

2.6.6 Additional Properties of the Entropy

2.6.6.1 Additivity of the Entropy

We now consider two subsystems in a common heat bath (Fig. 2.13). Assuming that each of these systems contains a large number of particles, the energy is additive. That is, the interaction energy, which acts only at the interfaces, is much smaller than the energy of each of the individual systems. We wish to show that the entropy is also additive. We begin this task with the two density matrices of the subsystems:

$$\rho_1 = \frac{e^{-\beta H_1}}{Z_1},\qquad \rho_2 = \frac{e^{-\beta H_2}}{Z_2}. \tag{2.6.54a,b}$$

$^{12}$ Exceptions are photons and bosonic quasiparticles such as phonons and rotons in superfluid helium, for which the particle number is not fixed (Chap. 4).


Fig. 2.13. Two subsystems 1 and 2 in one heat bath

The density matrix of the two subsystems together is

$$\rho = \rho_1\,\rho_2, \tag{2.6.54c}$$

where once again $W \ll H_1, H_2$ was employed. From

$$\log\rho = \log\rho_1 + \log\rho_2 \tag{2.6.55}$$

it follows that the total entropy $S$ is given by

$$S = S_1 + S_2, \tag{2.6.56}$$

the sum of the entropies of the subsystems. Eq. (2.6.56) expresses the fact that the entropy is additive.

∗2.6.6.2 The Statistical Meaning of Heat

Here, we want to add a few supplementary remarks that concern the statistical and physical meaning of heat transfer to a system. We begin with the average energy

$$\bar E = \langle H\rangle = \mathrm{Tr}\,\rho H \tag{2.6.57a}$$

for an arbitrary density matrix and its total variation with a fixed number of particles

$$d\bar E = \mathrm{Tr}\,\big(d\rho\,H + \rho\,dH\big), \tag{2.6.57b}$$

where $d\rho$ is the variation of the density matrix and $dH$ is the variation of the Hamiltonian (see the end of this section). The variation of the entropy

$$S = -k\,\mathrm{Tr}\,\rho\log\rho \tag{2.6.58}$$

is given by

$$dS = -k\,\mathrm{Tr}\left(d\rho\,\log\rho + \frac{\rho}{\rho}\,d\rho\right). \tag{2.6.59}$$

Now we have

$$\mathrm{Tr}\,d\rho = 0, \tag{2.6.60}$$


since for all density matrices, $\mathrm{Tr}\,\rho = \mathrm{Tr}\,(\rho + d\rho) = 1$, from which it follows that

$$dS = -k\,\mathrm{Tr}\,\bigl(\log\rho\; d\rho\bigr) . \qquad (2.6.61)$$

Let the initial density matrix be the canonical one; then making use of (2.6.60), we have

$$dS = \frac{1}{T}\,\mathrm{Tr}\,(H\, d\rho) . \qquad (2.6.62)$$

If we insert this into (2.6.57b) and take the volume as the only parameter in $H$, i.e. $dH = \frac{\partial H}{\partial V}\,dV$, we again obtain (cf. (2.6.50a))

$$d\bar E = T\,dS + \Bigl\langle \frac{\partial H}{\partial V} \Bigr\rangle\, dV . \qquad (2.6.63)$$

We shall now discuss the physical meaning of the general relation (2.6.57b):

1st term: this represents a change in the density matrix, i.e. a change in the occupation probabilities.

2nd term: the change of the Hamiltonian. This means a change in the energy as a result of influences which change the energy eigenvalues of the system.

Let $\rho$ be diagonal in the energy eigenstates; then

$$\bar E = \sum_i p_i E_i , \qquad (2.6.64)$$

and the variation of the average energy has the form

$$d\bar E = \sum_i dp_i\, E_i + \sum_i p_i\, dE_i . \qquad (2.6.65)$$

Thus, the quantity of heat transferred is given by

$$\delta Q = \sum_i dp_i\, E_i . \qquad (2.6.66)$$
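The identification of heat with a redistribution of occupation probabilities, Eq. (2.6.66), together with $dS = \frac{1}{T}\mathrm{Tr}\,(H\,d\rho)$ from Eq. (2.6.62), can be illustrated numerically. The following is only a minimal sketch: the three energy levels, the temperature, and the units with $k = 1$ are hypothetical choices made for illustration. It raises the temperature of a canonical distribution slightly at fixed energy levels and compares $\sum_i dp_i E_i$ with $T\,dS$.

```python
import math

K_B = 1.0  # Boltzmann constant; units with k = 1 are an assumption of this sketch


def canonical_probabilities(energies, temperature):
    """Occupation probabilities p_i = exp(-E_i / kT) / Z."""
    weights = [math.exp(-e / (K_B * temperature)) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


def entropy(probabilities):
    """S = -k * sum_i p_i log p_i for a diagonal density matrix."""
    return -K_B * sum(p * math.log(p) for p in probabilities if p > 0.0)


def heat_and_entropy_change(energies, temperature, d_temperature):
    """Return (sum_i dp_i E_i, T dS) for a small temperature step at fixed levels."""
    p_old = canonical_probabilities(energies, temperature)
    p_new = canonical_probabilities(energies, temperature + d_temperature)
    delta_q = sum((pn - po) * e for pn, po, e in zip(p_new, p_old, energies))
    t_mid = temperature + 0.5 * d_temperature  # midpoint temperature of the step
    t_ds = t_mid * (entropy(p_new) - entropy(p_old))
    return delta_q, t_ds


levels = [0.0, 1.0, 2.5]  # hypothetical energy eigenvalues E_i
dq, tds = heat_and_entropy_change(levels, temperature=1.2, d_temperature=1e-4)
print(dq, tds)
```

Since the energy eigenvalues are held fixed, the second term of (2.6.65) vanishes in this sketch and the whole energy change is heat; the two printed numbers should agree up to higher-order terms in the temperature step.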

A transfer of heat gives rise to a redistribution of the occupation probabilities of the states $|i\rangle$. Heating (heat input) increases the populations of the states at higher energies. Energy change by an input of work (work performed on the system) produces a change in the energy eigenvalues. In this process, the occupation numbers can change only in such a way as to keep the entropy constant. When only the external parameters are varied, work is performed on the system, but no heat is put into it. In this case, although $d\rho$ may exhibit a change, there is no change in the entropy. This can be shown explicitly as follows: From Eq. (2.6.61), we have $dS = -k\,\mathrm{Tr}\,(\log\rho\, d\rho)$. It then follows from the von Neumann Eq. (1.4.8), $\dot\rho = \frac{i}{\hbar}[\rho, H(V(t))]$, which is valid also for time-dependent Hamiltonians, e.g. one containing the volume $V(t)$:

$$\dot S = -k\,\mathrm{Tr}\,(\log\rho\;\dot\rho) = -\frac{ik}{\hbar}\,\mathrm{Tr}\,\bigl(\log\rho\,[\rho, H]\bigr) = -\frac{ik}{\hbar}\,\mathrm{Tr}\,\bigl(H\,[\log\rho, \rho]\bigr) = 0 . \qquad (2.6.67)$$

The entropy does not change, and no heat is put into the system. An example which demonstrates this situation is the adiabatic reversible expansion of an ideal gas (Sect. 3.5.4.1). There, as a result of the work performed, the volume of the gas changes and with it the Hamilton function; furthermore, the temperature of the gas changes. These effects together lead to a change in the distribution function (density matrix), but not in the entropy.
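The statement that unitary time evolution leaves the entropy unchanged, even when the density matrix itself changes, can be checked on the smallest possible example. The sketch below is an illustration under stated assumptions, not part of the original derivation: it uses a hypothetical two-level system, replaces the time-dependent Hamiltonian by a sequence of small unitary rotations whose axis drifts from step to step, and measures the entropy in units of $k$.

```python
import math


def matmul(a, b):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def entropy(rho):
    """-Tr(rho log rho) in units of k for a 2x2 density matrix with Tr rho = 1."""
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    root = math.sqrt(max(0.0, 1.0 - 4.0 * det))
    eigenvalues = [(1.0 + root) / 2.0, (1.0 - root) / 2.0]
    return -sum(p * math.log(p) for p in eigenvalues if p > 0.0)


def step_unitary(theta, phi):
    """U = exp(-i theta (cos(phi) sigma_x + sin(phi) sigma_z))."""
    c, s = math.cos(theta), math.sin(theta)
    nx, nz = math.cos(phi), math.sin(phi)
    return [[complex(c, -s * nz), complex(0.0, -s * nx)],
            [complex(0.0, -s * nx), complex(c, s * nz)]]


def evolve(rho, steps):
    """Apply rho -> U rho U^dagger repeatedly with a slowly drifting rotation axis."""
    p0 = rho[0][0].real
    max_shift = 0.0
    for n in range(steps):
        u = step_unitary(theta=0.01, phi=0.005 * n)  # hypothetical drive
        rho = matmul(matmul(u, rho), dagger(u))
        max_shift = max(max_shift, abs(rho[0][0].real - p0))
    return rho, max_shift


rho_initial = [[complex(0.8), complex(0.0)], [complex(0.0), complex(0.2)]]
rho_final, max_population_shift = evolve(rho_initial, steps=200)
s_initial = entropy(rho_initial)
s_final = entropy(rho_final)
print(s_initial, s_final, max_population_shift)
```

In this sketch the diagonal elements of the density matrix in the original basis shift noticeably during the evolution, while the entropy, which depends only on the eigenvalues of $\rho$, stays fixed to within rounding error.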

2.7 The Grand Canonical Ensemble

2.7.1 Systems with Particle Exchange

After considering systems in the preceding section which can exchange energy with a heat bath, we now wish to allow in addition the exchange of matter between subsystem 1 on the one hand and the heat bath 2 on the other; this will be a consistent generalization of the canonical ensemble (see Fig. 2.14). The overall system is isolated. The total energy, the total particle number and the overall volume are the sums of these quantities for the subsystems:

$$E = E_1 + E_2 , \qquad N = N_1 + N_2 , \qquad V = V_1 + V_2 . \qquad (2.7.1)$$

Fig. 2.14. Regarding the grand canonical ensemble: two subsystems 1 and 2, between which energy and particle exchange is permitted.

The probability distribution of the state variables $E_1$, $N_1$, and $V_1$ of subsystem 1 is found in complete analogy to Sect. 2.4.3,

$$\omega(E_1, N_1, V_1) = \frac{\Omega_1(E_1, N_1, V_1)\,\Omega_2(E - E_1, N - N_1, V - V_1)}{\Omega(E, N, V)} . \qquad (2.7.2)$$

The attempt to find the maximum of this distribution leads again to equality of the logarithmic derivatives, in this case with respect to $E$, $V$ and $N$. The first two relations were already seen in Eq. (2.4.28) and imply temperature and pressure equalization between the two systems. The third formula can be expressed in terms of the chemical potential which was defined in (2.4.29):

$$\mu = -kT\,\frac{\partial}{\partial N}\log\Omega(E, N, V) = -T\left(\frac{\partial S}{\partial N}\right)_{E,V} , \qquad (2.7.3)$$

and we obtain finally as a condition for the maximum probability the equalization of temperature, pressure, and chemical potential:

$$T_1 = T_2 , \qquad P_1 = P_2 , \qquad \mu_1 = \mu_2 . \qquad (2.7.4)$$

2.7.2 The Grand Canonical Density Matrix

Next, we will derive the density matrix for the subsystem. The probability that in system 1 there are $N_1$ particles which are in the state $|n\rangle$ at the energy $E_{1n}(N_1)$ is given by:

$$p(N_1, E_{1n}(N_1), V_1) = \sum_{E - E_{1n}(N_1) \le E_{2m}(N_2) \le E - E_{1n}(N_1) + \Delta} \frac{1}{\Omega(E, N, V)\,\Delta} = \frac{\Omega_2(E - E_{1n}, N - N_1, V_2)}{\Omega(E, N, V)} . \qquad (2.7.5)$$

In order to eliminate system 2, we carry out an expansion in the variables $E_{1n}$ and $N_1$ with the condition that subsystem 1 is much smaller than subsystem 2, analogously to the case of the canonical ensemble:

$$p(N_1, E_{1n}(N_1), V_1) = Z_G^{-1}\, e^{-(E_{1n} - \mu N_1)/kT} . \qquad (2.7.6)$$

We thus obtain the following expression for the density matrix of the grand canonical ensemble$^{13}$:

$$\rho_G = Z_G^{-1}\, e^{-(H_1 - \mu N_1)/kT} , \qquad (2.7.7)$$

where the grand partition function $Z_G$ (or Gibbs distribution) is found from the normalization of the density matrix to be

$$Z_G = \mathrm{Tr}\, e^{-(H_1 - \mu N_1)/kT} = \sum_{N_1} \mathrm{Tr}\, e^{-H_1/kT + \mu N_1/kT} = \sum_{N_1} Z(N_1)\, e^{\mu N_1/kT} . \qquad (2.7.8)$$

$^{13}$ See also the derivation in second quantization, p. 69.

The two trace operations Tr in Eq. (2.7.8) refer to different spaces. The trace after the second equals sign refers to a summation over all the diagonal matrix elements for a fixed particle number $N_1$, while the Tr after the first equals sign implies in addition the summation over all particle numbers $N_1 = 0, 1, 2, \ldots$. The average value of an operator $A$ in the grand canonical ensemble is

$$\langle A \rangle = \mathrm{Tr}\,(\rho_G A) ,$$

where the trace is here to be understood in the latter sense.

In classical statistics, (2.7.7) remains unchanged for the distribution function, while $\mathrm{Tr} \longrightarrow \sum_{N_1} \int d\Gamma_{N_1}$ must be replaced by the $6N_1$-dimensional operator $d\Gamma_{N_1} = \frac{dq\, dp}{h^{3N_1} N_1!}$.

From (2.7.5), $Z_G^{-1}$ can also be given in terms of

$$Z_G^{-1} = \frac{\Omega_2(E, N, V - V_1)}{\Omega(E, N, V)} = e^{-P V_1/kT} \qquad (2.7.9)$$

for $V_1 \ll V$; recall Eqns. (2.4.24) and (2.4.25).

From the density matrix, we find the entropy of the grand canonical ensemble,

$$S_G = -k\,\langle \log \rho_G \rangle = \frac{1}{T}\,(\bar E - \mu \bar N) + k \log Z_G . \qquad (2.7.10)$$

Since the energy and particle reservoir, subsystem 2, enters only via its temperature and chemical potential, we dispense with the index 1 here and in the following sections. The distribution function for the energy and the particle number is extremely narrow for macroscopic subsystems. The relative fluctuations are inversely proportional to the square root of the average number of particles. Therefore, we have $\bar E = \tilde E$ and $\bar N = \tilde N$ for macroscopic subsystems. The grand canonical entropy, also, may be shown (cf. Sect. 2.6.5.1) in the limit of macroscopic subsystems to be identical with the microcanonical entropy, taken at the most probable values (with fixed volume $V_1$)

$$\tilde E_1 = \bar E_1 , \qquad \tilde N_1 = \bar N_1 , \qquad (2.7.11)$$

$$S_G = S_{MC}(\tilde E_1, \tilde N_1) . \qquad (2.7.12)$$

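2.7.3 Thermodynamic Quantities

In analogy to the free energy of the canonical ensemble, the grand potential is defined by

$$\Phi = -kT \log Z_G , \qquad (2.7.13)$$

from which with (2.7.10) we obtain the expression

$$\Phi(T, \mu, V) = \bar E - T S_G - \mu \bar N . \qquad (2.7.14)$$

The total differential of the grand potential is given by

$$d\Phi = \left(\frac{\partial \Phi}{\partial T}\right)_{V,\mu} dT + \left(\frac{\partial \Phi}{\partial V}\right)_{T,\mu} dV + \left(\frac{\partial \Phi}{\partial \mu}\right)_{V,T} d\mu . \qquad (2.7.15)$$

The partial derivatives follow from (2.7.13) and (2.7.8):

$$\left(\frac{\partial \Phi}{\partial T}\right)_{V,\mu} = -k \log Z_G - kT\,\frac{1}{kT^2}\,\langle H - \mu N \rangle = \frac{1}{T}\,(\Phi - \bar E + \mu \bar N) = -S_G ,$$

$$\left(\frac{\partial \Phi}{\partial V}\right)_{T,\mu} = \Bigl\langle \frac{\partial H}{\partial V} \Bigr\rangle = -P , \qquad \left(\frac{\partial \Phi}{\partial \mu}\right)_{T,V} = -kT\,\frac{1}{kT}\,\langle N \rangle = -\bar N . \qquad (2.7.16)$$

If we insert (2.7.16) into (2.7.15), we find

$$d\Phi = -S_G\, dT - P\, dV - \bar N\, d\mu . \qquad (2.7.17)$$

From this, together with (2.7.14), it follows that

$$d\bar E = T\, dS_G - P\, dV + \mu\, d\bar N ; \qquad (2.7.18)$$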

this is again the First Law. As shown above, for macroscopic systems we can use simply E, N and S in (2.7.17) and (2.7.18) instead of the average values of the energy and the particle number and SG ; we shall do this in later chapters. For a constant particle number, (2.7.18) becomes identical with (2.4.25). The physical meaning of the First Law will be discussed in detail in Sect. 3.1. We have considered the ﬂuctuations of physical quantities thus far only in Sect. 2.4.2. Of course, we could also calculate the autocorrelation function for energy and particle number in the grand canonical ensemble. This shows that these quantities are extensive and their relative ﬂuctuations decrease inversely as the square root of the size of the system. We shall postpone these considerations to the chapter on thermodynamics, since there we can relate the correlations to thermodynamic derivatives. We close this section with a tabular summary of the ensembles treated in this chapter. Remark concerning Table 2.1: The thermodynamic functions which are found from the logarithm of the normalization factors are the entropy and the thermodynamic potentials F and Φ (see Chap. 3). The generalization to several diﬀerent types of particles will be carried out in Chap. 5. To this end, one must merely replace N by {Ni } and µ by {µi }.

Table 2.1. The most important ensembles

| Ensemble | microcanonical | canonical | grand canonical |
|---|---|---|---|
| Physical situation | isolated | energy exchange | energy and particle exchange |
| Density matrix | $\frac{1}{\Omega(E,V,N)}\,\delta(H - E)$ | $\frac{1}{Z(T,V,N)}\, e^{-H/kT}$ | $\frac{1}{Z_G(T,V,\mu)}\, e^{-(H - \mu N)/kT}$ |
| Normalization | $\Omega(E,V,N) = \mathrm{Tr}\,\delta(H - E)$ | $Z(T,V,N) = \mathrm{Tr}\, e^{-H/kT}$ | $Z_G(T,V,\mu) = \mathrm{Tr}\, e^{-(H - \mu N)/kT}$ |
| Independent variables | $E, V, N$ | $T, V, N$ | $T, V, \mu$ |
| Thermodynamic functions | $S$ | $F$ | $\Phi$ |

2.7.4 The Grand Partition Function for the Classical Ideal Gas

As an example, we consider the special case of the classical ideal gas.

2.7.4.1 Partition Function

For the partition function for $N$ particles, we obtain

$$Z_N = \frac{1}{N!\, h^{3N}} \int_V dq_1 \ldots dq_{3N} \int dp_1 \ldots dp_{3N}\; e^{-\beta \sum_i p_i^2/2m} = \frac{V^N}{N!} \left(\frac{2 m \pi}{\beta h^2}\right)^{3N/2} = \frac{1}{N!} \left(\frac{V}{\lambda^3}\right)^N \qquad (2.7.19)$$

with the thermal wavelength

$$\lambda = h/\sqrt{2\pi m k T} . \qquad (2.7.20)$$

Its name results from the fact that a particle of mass $m$ and momentum $h/\lambda$ will have a kinetic energy of the order of $kT$.

2.7.4.2 The Grand Partition Function

Inserting (2.7.19) into the grand partition function (2.7.8), we find

$$Z_G = \sum_{N=0}^{\infty} e^{\beta \mu N} Z_N = \sum_{N=0}^{\infty} \frac{1}{N!}\, e^{\beta \mu N} \left(\frac{V}{\lambda^3}\right)^N = e^{zV/\lambda^3} , \qquad (2.7.21)$$

where the fugacity

$$z = e^{\beta \mu} \qquad (2.7.22)$$

has been defined.

2.7.4.3 Thermodynamic Quantities

From (2.7.13) and (2.7.21), the grand potential takes on the simple form

$$\Phi \equiv -kT \log Z_G = -kT\, zV/\lambda^3 . \qquad (2.7.23)$$
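The closed form (2.7.21) can be compared with a direct summation of the series, and the weights $e^{\beta\mu N} Z_N / Z_G$ that appear in (2.7.8) can be used to look at the distribution of the particle number. The following sketch assumes hypothetical values for the fugacity and for $V/\lambda^3$; the function names are chosen here for illustration only.

```python
import math


def grand_partition_series(z, v_over_lambda3, n_max):
    """Partial sum of Z_G = sum_N z^N (V / lambda^3)^N / N!, cf. Eq. (2.7.21)."""
    x = z * v_over_lambda3
    term, total = 1.0, 1.0  # the N = 0 term
    for n in range(1, n_max + 1):
        term *= x / n  # builds x^n / n! step by step, avoiding large factorials
        total += term
    return total


def particle_number_distribution(z, v_over_lambda3, n_max):
    """Weights p_N = z^N Z_N / Z_G for N = 0..n_max with Z_G = exp(z V / lambda^3)."""
    x = z * v_over_lambda3
    return [math.exp(n * math.log(x) - math.lgamma(n + 1) - x)
            for n in range(n_max + 1)]


z, v_over_lambda3 = 0.5, 40.0  # hypothetical fugacity and V / lambda^3
zg_series = grand_partition_series(z, v_over_lambda3, n_max=200)
zg_closed = math.exp(z * v_over_lambda3)

p = particle_number_distribution(z, v_over_lambda3, n_max=200)
mean_n = sum(n * pn for n, pn in enumerate(p))
var_n = sum((n - mean_n) ** 2 * pn for n, pn in enumerate(p))

print(zg_series / zg_closed)         # ratio of series to closed form
print(mean_n, var_n)                 # mean and variance of N
print(math.sqrt(var_n) / mean_n)     # relative fluctuation of N
```

For these weights the mean and the variance of $N$ both come out equal to $zV/\lambda^3$, so that the relative fluctuation is $1/\sqrt{\bar N}$; the cutoff `n_max` only needs to be large compared with $zV/\lambda^3$.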

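From the partial derivatives, we can compute the thermodynamic relations.$^{14}$

Particle number

$$N = -\left(\frac{\partial \Phi}{\partial \mu}\right)_{T,V} = zV/\lambda^3 \qquad (2.7.24)$$

Pressure

$$PV = -V \left(\frac{\partial \Phi}{\partial V}\right)_{T,\mu} = -\Phi = N k T \qquad (2.7.25)$$

This is again the thermal equation of state of the ideal gas, as found in Sect. 2.5. For the chemical potential, we find from (2.7.22), (2.7.24), and (2.7.23)

$$\mu = -kT \log \frac{V/N}{\lambda^3} = -kT \log \frac{kT}{P \lambda^3} = kT \log P - kT \log \frac{kT}{\lambda^3} . \qquad (2.7.26)$$

For the entropy, we find

$$S = -\left(\frac{\partial \Phi}{\partial T}\right)_{V,\mu} = \frac{5}{2}\, k z \frac{V}{\lambda^3} + kT \left(-\frac{\mu}{kT^2}\right) z \frac{V}{\lambda^3} = kN \left(\frac{5}{2} + \log \frac{V/N}{\lambda^3}\right) , \qquad (2.7.27)$$

and for the internal energy, from (2.7.14), we obtain

$$E = \Phi + TS + \mu N = NkT\left(-1 + \frac{5}{2}\right) = \frac{3}{2}\, NkT . \qquad (2.7.28)$$

$^{14}$ For the reasons mentioned at the end of the preceding section, we replace $\bar E$ and $\bar N$ in (2.7.16) and (2.7.17) by $E$ and $N$.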
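The relations (2.7.24), (2.7.25), (2.7.27), and (2.7.28) can be cross-checked by differentiating the grand potential (2.7.23) numerically. The sketch below is a rough illustration under stated assumptions: units with $k = h = 1$, a hypothetical particle mass, a hypothetical state point $(T, V, \mu)$, and simple central differences in place of analytic derivatives.

```python
import math

K_B = 1.0       # Boltzmann constant, units with k = 1 (assumption)
H_PLANCK = 1.0  # Planck constant, units with h = 1 (assumption)
MASS = 1.0      # hypothetical particle mass


def thermal_wavelength(t):
    """lambda = h / sqrt(2 pi m k T), Eq. (2.7.20)."""
    return H_PLANCK / math.sqrt(2.0 * math.pi * MASS * K_B * t)


def grand_potential(t, v, mu):
    """Phi(T, V, mu) = -kT z V / lambda^3 with z = exp(mu / kT), Eq. (2.7.23)."""
    z = math.exp(mu / (K_B * t))
    return -K_B * t * z * v / thermal_wavelength(t) ** 3


def central_difference(f, x, h):
    """Symmetric finite-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


t, v, mu = 1.5, 10.0, -2.0  # hypothetical state point
phi = grand_potential(t, v, mu)
n = -central_difference(lambda m: grand_potential(t, v, m), mu, 1e-5)
p = -central_difference(lambda vol: grand_potential(t, vol, mu), v, 1e-5)
s = -central_difference(lambda temp: grand_potential(temp, v, mu), t, 1e-5)
e = phi + t * s + mu * n  # Eq. (2.7.14) solved for E

print(p * v / (n * K_B * t))                    # P V / (N k T)
print(e / (n * K_B * t))                        # E / (N k T)
print(s / (K_B * n), 2.5 - mu / (K_B * t))      # S / (k N) versus 5/2 - mu / kT
```

The three printed comparisons correspond to the thermal equation of state, the caloric equation of state, and the entropy formula, with $\log[(V/N)/\lambda^3] = -\mu/kT$ taken from (2.7.26).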

∗2.7.5 The Grand Canonical Density Matrix in Second Quantization

The derivation of $\rho_G$ can be carried out most concisely in the formalism of the second quantization. In addition to the Hamiltonian $H$, expressed in terms of the field operators $\psi(x)$ (see Eq. (1.5.6d) in QM II$^{15}$), we require the particle-number operator, Eq. (1.5.10)$^{15}$

$$\hat N = \int_V d^3x\; \psi^\dagger(x)\,\psi(x) . \qquad (2.7.29)$$

The microcanonical density matrix for fixed volume $V$ is

$$\rho_{MC} = \frac{1}{\Omega(E, N, V)}\,\delta(H - E)\,\delta(\hat N - N) . \qquad (2.7.30)$$

Corresponding to the division of the overall volume into two subvolumes, $V = V_1 + V_2$, we have $H = H_1 + H_2$ and $\hat N = \hat N_1 + \hat N_2$ with $\hat N_i = \int_{V_i} d^3x\; \psi^\dagger(x)\psi(x)$, $i = 1, 2$. We find from (2.7.30) the probability that the energy and the particle number in subvolume 1 assume the values $E_1$ and $N_1$:

$$\begin{aligned}
\omega(E_1, V_1, N_1) &= \mathrm{Tr}\,\frac{1}{\Omega(E, N, V)}\,\delta(H - E)\,\delta(\hat N - N)\,\delta(H_1 - E_1)\,\delta(\hat N_1 - N_1) \\
&= \mathrm{Tr}\,\frac{1}{\Omega(E, N, V)}\,\delta\bigl(H_2 - (E - E_1)\bigr)\,\delta\bigl(\hat N_2 - (N - N_1)\bigr)\,\delta(H_1 - E_1)\,\delta(\hat N_1 - N_1) \\
&= \frac{\Omega_1(E_1, N_1, V_1)\,\Omega_2(E - E_1, N - N_1, V - V_1)}{\Omega(E, N, V)} .
\end{aligned} \qquad (2.7.31)$$

The (grand canonical) density matrix for subsystem 1 is found by taking the trace of the density matrix of the overall system over subsystem 2, with respect to both the energy and the particle number:

$$\rho_G = \mathrm{Tr}_2\,\frac{1}{\Omega(E, N, V)}\,\delta(H - E)\,\delta(\hat N - N) = \frac{\Omega_2(E - H_1, N - \hat N_1, V - V_1)}{\Omega(E, N, V)} . \qquad (2.7.32)$$

Expansion of the logarithm of $\rho_G$ in terms of $H_1$ and $\hat N_1$ leads to

$$\rho_G = Z_G^{-1}\, e^{-(H_1 - \mu \hat N_1)/kT} , \qquad Z_G = \mathrm{Tr}\, e^{-(H_1 - \mu \hat N_1)/kT} , \qquad (2.7.33)$$

consistent with Equations (2.7.7) and (2.7.8), which were obtained by considering the probabilities.

$^{15}$ F. Schwabl, Advanced Quantum Mechanics (QM II), 3rd ed., Springer Berlin, Heidelberg, New York 2005. This text will be cited in the rest of this book as QM II.

Problems for Chapter 2

2.1 Calculate $\Omega(E)$ for a spin system which is described by the Hamiltonian

$$\mathcal{H} = \mu_B H \sum_{i=1}^{N} S_i ,$$

where $S_i$ can take on the values $S_i = \pm 1/2$.

$$\Omega(E)\,\Delta = \sum_{E \le E_n \le E + \Delta} 1 .$$

Use a combinatorial method, rather than 2.2.3.2.

2.2 For a one-dimensional classical ideal gas, calculate $\langle p_1^2 \rangle$ and $\langle p_1^4 \rangle$.

Formula:

$$\int_0^{\pi} \sin^m x\, \cos^n x\; dx = \frac{\Gamma\!\left(\frac{m+1}{2}\right) \Gamma\!\left(\frac{n+1}{2}\right)}{\Gamma\!\left(\frac{n+m+2}{2}\right)} .$$

2.3 A particle is moving in one dimension; the distance between the walls of the container is changed by a piston at $L$. Compute the change in the phase-space volume $\bar\Omega = 2Lp$ ($p$ = momentum).
(a) For a slow, continuous motion of the piston.
(b) For a rapid motion of the piston between two reflections of the particle.

2.4 Assume that the entropy $S$ depends on the volume $\bar\Omega(E)$ inside the energy shell: $S = f(\bar\Omega)$. Show that from the additivity of $S$ and the multiplicative character of $\bar\Omega$, it follows that $S = \mathrm{const} \times \log \bar\Omega$.

2.5 (a) For a classical ideal gas which is enclosed within a volume V , calculate the free energy and the entropy, starting with the canonical ensemble. (b) Compare them with the results of Sect. 2.2.

2.6 Using the assertion that the entropy $S = -k\,\mathrm{Tr}\,(\rho \log \rho)$ is maximal, show that with the conditions $\mathrm{Tr}\,\rho = 1$ and $\mathrm{Tr}\,\rho H = \bar E$ for $\rho$, the canonical density matrix results.
Hint: This is a variational problem with constraints, which can be solved using the method of Lagrange multipliers.

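2.7 Show that for the energy distribution in the classical canonical ensemble

$$\omega(E_1) = \int d\Gamma_1\, \rho_K\, \delta(H_1 - E_1) = \Omega_1(E_1)\, \frac{\Omega_2(E - E_1)}{\Omega_{1,2}(E)} \approx \Omega_1(E_1)\, \frac{\Omega_2(\tilde E_2)}{\Omega_{1,2}(E)}\; e^{\tilde E_1/kT}\, e^{-E_1/kT} . \qquad (2.7.34)$$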

2.8 Consider a system of N classical, non-coupled one-dimensional harmonic oscillators and calculate for this system the entropy and the temperature, starting from the microcanonical ensemble.

2.9 Consider again the harmonic oscillators from problem 2.8 and calculate for this system the average value of the energy and the entropy, starting with the canonical ensemble.

2.10 In analogy to the preceding problems, consider $N$ quantum-mechanical non-coupled one-dimensional harmonic oscillators and compute the average value of the energy $\bar E$ and the entropy, beginning with the canonical ensemble. Also investigate $\lim_{\hbar \to 0} \bar E$, $\lim_{\hbar \to 0} S$ and $\lim_{T \to 0} S$, and compare the limiting values you obtain with the results of problem 2.9.

2.11 For the Maxwell distribution, find
(a) the average value of the $n$th power of the velocity $\langle v^n \rangle$,
(b) $\langle v \rangle$,
(c) $\langle (v - \langle v \rangle)^2 \rangle$,
(d) $\left(\frac{m}{2}\right)^2 \langle (v^2 - \langle v^2 \rangle)^2 \rangle$, and
(e) the most probable value of the velocity.

2.12 Determine the number of collisions of a molecule of an ideal gas with the wall of its container per unit area and unit time, when
(a) the angle between the normal to the wall and the direction of the velocity lies between $\Theta$ and $\Theta + d\Theta$;
(b) the magnitude of the velocity lies between $v$ and $v + dv$.

2.13 Calculate the pressure of a Maxwellian gas with the velocity distribution

$$f(\mathbf{v}) = n \left(\frac{m\beta}{2\pi}\right)^{3/2} e^{-\beta m v^2/2} .$$

Suggestions: the pressure is produced by reflections of the particles from the walls of the container; it is therefore the average force on an area $A$ of wall which acts over a time interval $\tau$.

$$P = \frac{1}{\tau A} \int_0^{\tau} dt\; F_x(t) .$$

If a particle is reflected from the wall with the velocity $\mathbf{v}$, its contribution is given from Newton's 2nd axiom in terms of $\int_0^{\tau} dt\, F_x(t)$ by the momentum transferred per collision, $2mv_x$. Then $P = \frac{1}{\tau A} \sum 2mv_x$, whereby the sum extends over all particles which reach the area $A$ within the time $\tau$.
Result: $P = nkT$.

2.14 A simple model for thermalization: Calculate the average kinetic energy of a particle of mass $m_1$ with the velocity $v_1$ due to contact with an ideal gas consisting of particles of mass $m_2$. As a simplification, assume that only elastic and linear collisions occur. The effect on the ideal gas can be neglected. It is helpful to use the abbreviations $M = m_1 + m_2$ and $m = m_1 - m_2$. How many collisions are required until, for $m_1 \ne m_2$, a temperature equal to the $(1 - e^{-1})$-fold temperature of the ideal gas is attained?

2.15 Using the canonical ensemble, calculate the average value of the particle-number density

$$n(\mathbf{x}) = \sum_{i=1}^{N} \delta(\mathbf{x} - \mathbf{x}_i)$$

for an ideal gas which is contained in an infinitely high cylinder of cross-sectional area $A$ in the gravitational field of the Earth. The potential energy of a particle in the gravitational field is $mgh$. Also calculate
(a) the internal energy of this system,
(b) the pressure at the height (altitude) $h$, using the definition

$$P = \int_h^{\infty} \langle n(\mathbf{x}) \rangle\, mg\; dz ,$$

(c) the average distance $\langle z \rangle$ of an oxygen molecule and a helium atom from the surface of the Earth at a temperature of 0 °C, and
(d) the mean square deviation $\Delta z$ for the particles in 2.15c.
At this point, we mention the three different derivations of the barometric pressure formula, each emphasizing different physical aspects, in R. Becker, Theory of Heat, 2nd ed., Sec. 27, Springer, Berlin 1967.

2.16 The potential energy of $N$ non-interacting localized dipoles depends on their orientations relative to an applied magnetic field $\mathbf{H}$:

$$\mathcal{H} = -\mu H_z \sum_{i=1}^{N} \cos \vartheta_i .$$

Calculate the partition function and show that the magnetization along the $z$-direction takes the form

$$M_z = \Bigl\langle \sum_{i=1}^{N} \mu \cos \vartheta_i \Bigr\rangle = N \mu\, L(\beta \mu H_z) ; \qquad L(x) = \text{Langevin function} .$$

Plot the Langevin function. How large is the magnetization at high temperatures? Show that at high temperatures, the Curie law for the magnetic susceptibility holds:

$$\chi = \lim_{H_z \to 0} \left(\frac{\partial M_z}{\partial H_z}\right) \sim \mathrm{const}/T .$$

2.17 Demonstrate the equipartition theorem and the virial theorem making use of the microcanonical distribution.

2.18 In the extreme relativistic case, the Hamilton function for $N$ particles in three-dimensional space is $H = \sum_i |\mathbf{p}_i|\, c$. Compute the expectation value of $H$ with the aid of the virial theorem.

2.19 Starting with the canonical ensemble of classical statistics, calculate the equation of state and the internal energy of a gas composed of $N$ indistinguishable particles with the kinetic energy $\varepsilon(\mathbf{p}) = |\mathbf{p}| \cdot c$.

2.20 Show that for an ideal gas, the probability of finding a subsystem in the grand canonical ensemble with $N$ particles is given by the Poisson distribution:

$$p_N = \frac{1}{N!}\, e^{-\bar N}\, \bar N^N ,$$

where $\bar N$ is the average value of $N$ in the ideal gas.
Suggestions: Start from $p_N = e^{\beta(\Phi + N\mu)} Z_N$. Express $\Phi$, $\mu$, and $Z_N$ in terms of $\bar N$.

2.21 (a) Calculate the grand partition function for a mixture of two ideal gases (2 chemical potentials!).
(b) Show that

$$PV = (N_1 + N_2)\, kT \qquad \text{and} \qquad E = \frac{3}{2}\,(N_1 + N_2)\, kT$$

are valid, where $N_1$, $N_2$ and $E$ are the average particle number and the average energy.

2.22 (a) Express $\bar E$ by taking an appropriate derivative of the grand partition function.
(b) Express $(\Delta E)^2$ in terms of a thermodynamic derivative of $\bar E$.

2.23 Calculate the density matrix in the $x$-representation for a free particle within a three-dimensional cube of edge length $L$:

$$\rho(\mathbf{x}, \mathbf{x}') = c \sum_n e^{-\beta E_n} \langle \mathbf{x} | n \rangle \langle n | \mathbf{x}' \rangle ,$$

where $c$ is a normalization constant. Assume that $L$ is so large that one can go to the limit of a continuous momentum spectrum

$$\sum_n \longrightarrow \frac{L^3}{(2\pi\hbar)^3} \int d^3p \; ; \qquad \langle \mathbf{x} | n \rangle \longrightarrow \langle \mathbf{x} | \mathbf{p} \rangle = \frac{1}{L^{3/2}}\, e^{i\mathbf{p}\mathbf{x}/\hbar} .$$

2.24 Calculate the canonical density matrix for a one-dimensional harmonic oscillator $H = -(\hbar^2/2m)(d^2/dx^2) + \frac{m\omega^2}{2} x^2$ in the $x$-representation at low temperatures:

$$\rho(x, x') = c \sum_n e^{-\beta E_n} \langle x | n \rangle \langle n | x' \rangle ,$$

where $c$ is the normalization constant.

$$\langle x | n \rangle = \bigl(\pi^{1/2}\, 2^n\, n!\, x_0\bigr)^{-1/2}\, e^{-(x/x_0)^2/2}\, H_n\!\left(\frac{x}{x_0}\right) ; \qquad x_0 = \sqrt{\frac{\hbar}{\omega m}} .$$

The Hermite polynomials are defined in problem 2.27.
Suggestion: Consider which state makes the largest contribution.

2.25 Calculate the time average of q 2 for the example of problem 1.7, as well as its average value in the microcanonical ensemble.

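2.26 Show that:

$$\int dq_1 \ldots dq_d\; f(q^2, \mathbf{q} \cdot \mathbf{k}) = (2\pi)^{-1} K_{d-1} \int_0^{\infty} dq\; q^{d-1} \int_0^{\pi} d\Theta\, (\sin \Theta)^{d-2}\, f(q^2, qk \cos \Theta) , \qquad (2.7.35)$$

where $\mathbf{k} \in \mathbb{R}^d$ is a fixed vector and $q = |\mathbf{q}|$, $k = |\mathbf{k}|$, and $K_d = 2^{-d+1}\, \pi^{-d/2} \left(\Gamma\!\left(\frac{d}{2}\right)\right)^{-1}$.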

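2.27 Compute the matrix elements of the canonical density matrix for a one-dimensional harmonic oscillator in the coordinate representation,

$$\rho_{x,x'} = \langle x |\, \rho\, | x' \rangle = \langle x |\, e^{-\beta H}\, | x' \rangle .$$

Hint: Use the completeness relation for the eigenfunctions of the harmonic oscillator and use the fact that the Hermite polynomials have the integral representation

$$H_n(\xi) = (-1)^n\, e^{\xi^2} \left(\frac{d}{d\xi}\right)^n e^{-\xi^2} = \frac{e^{\xi^2}}{\sqrt{\pi}} \int_{-\infty}^{\infty} (-2iu)^n\, e^{-u^2 + 2i\xi u}\; du .$$

Alternatively, the first representation for $H_n(x)$ and the identity from the next example can be used. Result:

$$\rho_{x,x'} = \frac{1}{Z} \left[\frac{m\omega}{2\pi\hbar \sinh \beta\hbar\omega}\right]^{1/2} \exp\left\{ -\frac{m\omega}{4\hbar} \left( (x + x')^2 \tanh \frac{1}{2}\beta\hbar\omega + (x - x')^2\, \mathrm{ctgh}\, \frac{1}{2}\beta\hbar\omega \right) \right\} . \qquad (2.7.36)$$

2.28 Prove the following identity:

$$e^{\frac{\partial}{\partial x} \Pi \frac{\partial}{\partial x}}\; e^{-x \Delta x} = \frac{1}{\sqrt{\mathrm{Det}(1 + 4\Delta\Pi)}}\; e^{-x \frac{\Delta}{1 + 4\Delta\Pi} x} .$$

Here, $\Pi$ and $\Delta$ are two commuting symmetric matrices, e.g. $\frac{\partial}{\partial x} \Pi \frac{\partial}{\partial x} \equiv \frac{\partial}{\partial x_i} \Pi_{ik} \frac{\partial}{\partial x_k}$.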
3. Thermodynamics

3.1 Thermodynamic Potentials and the Laws of Equilibrium Thermodynamics

3.1.1 Definitions

Thermodynamics treats the macroscopic properties of macroscopic systems. The fact that macroscopic systems can be completely characterized by a small number of variables, such as their energy $E$, volume $V$, and particle number $N$, and that all other quantities, e.g. the entropy, are therefore functions of only these variables, has far-reaching consequences. In this section, we consider equilibrium states and transitions from one equilibrium state to another neighboring equilibrium state. In the preceding sections, we have already determined the change in the entropy due to changes in $E$, $V$ and $N$, whereby the system goes from one equilibrium state $E, V, N$ into a new equilibrium state $E + dE, V + dV, N + dN$. Building upon the differential entropy (2.4.29), we will investigate in the following the First Law and the significance of the quantities which occur in it. Beginning with the internal energy, we will then define the most important thermodynamic potentials and discuss their properties. We assume the system we are considering to consist of one single type of particles of particle number $N$. We start with its entropy, which is a function of $E$, $V$, and $N$.

Entropy: $S = S(E, V, N)$

In (2.4.29), we found the differential entropy to be

$$dS = \frac{1}{T}\, dE + \frac{P}{T}\, dV - \frac{\mu}{T}\, dN . \qquad (3.1.1)$$

From this, we can read off the partial derivatives:

$$\left(\frac{\partial S}{\partial E}\right)_{V,N} = \frac{1}{T} , \qquad \left(\frac{\partial S}{\partial V}\right)_{E,N} = \frac{P}{T} , \qquad \left(\frac{\partial S}{\partial N}\right)_{E,V} = -\frac{\mu}{T} , \qquad (3.1.2)$$

which naturally agree with the definitions from equilibrium statistics. We can now imagine the equation $S = S(E, V, N)$ to have been solved for $E$ and thereby obtain the energy $E$, which in thermodynamics is usually termed the internal energy, as a function of $S$, $V$, and $N$.

Internal Energy: $E = E(S, V, N)$

From (3.1.1), we obtain the differential relation

$$dE = T\, dS - P\, dV + \mu\, dN . \qquad (3.1.3)$$

We are now in a position to interpret the individual terms in (3.1.3), keeping in mind all the various possibilities for putting energy into a system. This can be done by performing work, by adding matter (i.e. by increasing the number of particles), and through contact with other bodies, whereby heat is put into the system. The total change in the energy is thus composed of the following contributions:

$$dE = \underbrace{\delta Q}_{\text{heat input}} + \underbrace{\delta W}_{\text{mechanical work}} + \underbrace{\delta E_N}_{\text{energy increase through addition of matter}} . \qquad (3.1.3')$$

The second term in (3.1.3) is the work performed on the system,

$$\delta W = -P\, dV , \qquad (3.1.4a)$$

while the third term gives the change in the energy on increasing the particle number

$$\delta E_N = \mu\, dN . \qquad (3.1.4b)$$

The chemical potential $\mu$ has the physical meaning of the energy increase on adding one particle to the system (at constant entropy and volume). The first term must therefore be the energy change due to heat input $\delta Q$, i.e.

$$\delta Q = T\, dS . \qquad (3.1.5)$$

Relation (3.1.3), the law of conservation of energy in thermodynamics, is called the First Law of Thermodynamics. It expresses the change in energy on going from one equilibrium state to another, nearby state an inﬁnitesimal distance away. Equation (3.1.5) is the Second Law for such transitions. We will formulate the Second Law in a more general way later. In this connection, we will also clarify the question of under what conditions these relations of equilibrium thermodynamics can be applied to real thermodynamic processes which proceed at ﬁnite rates, such as for example the operation of steam engines or of internal combustion engines.

Remark: It is important to keep the following in mind: $\delta W$ and $\delta Q$ do not represent changes of state variables. There are no state functions (functions of $E$, $V$ and $N$) identifiable with $W$ and $Q$. An object cannot be characterized by its 'heat or work content', but instead by its internal energy. Heat (∼ energy transfer into an object through contact with other bodies) and work are ways of transferring energy from one body to another.

It is often expedient to consider other quantities – with the dimensions of energy – in addition to the internal energy itself. As the first of these, we define the free energy:

Free Energy (Helmholtz Free Energy): $F = F(T, V, N)$

The free energy is defined by

$$F = E - TS \quad \bigl(= -kT \log Z(T, V, N)\bigr) ; \qquad (3.1.6)$$

in parentheses, we have given its connection with the canonical partition function (Chap. 2). From (3.1.3), the differential free energy is found to be:

$$dF = -S\, dT - P\, dV + \mu\, dN \qquad (3.1.7)$$

with the partial derivatives

$$\left(\frac{\partial F}{\partial T}\right)_{V,N} = -S , \qquad \left(\frac{\partial F}{\partial V}\right)_{T,N} = -P , \qquad \left(\frac{\partial F}{\partial N}\right)_{T,V} = \mu . \qquad (3.1.8)$$

We can see from (3.1.8) that the internal energy can be written in terms of $F$ in the form

$$E = F - T \left(\frac{\partial F}{\partial T}\right)_{V,N} = -T^2 \left(\frac{\partial}{\partial T} \frac{F}{T}\right)_{V,N} . \qquad (3.1.9)$$

From (3.1.7), it can be seen that the free energy is that portion of the energy which can be set free as work in an isothermal process; here we assume that the particle number $N$ remains constant. In an isothermal volume change, the change of the free energy is given by $(dF)_{T,N} = -P\, dV = \delta W$, while $(dE)_{T,N} \ne \delta W$, since one would have to transfer heat into or out of the system in order to hold the temperature constant.

Enthalpy: $H = H(S, P, N)$

The enthalpy is defined as

$$H = E + PV . \qquad (3.1.10)$$

From (3.1.3), it follows that

$$dH = T\, dS + V\, dP + \mu\, dN \qquad (3.1.11)$$

and from this, its partial derivatives can be obtained:

$$\left(\frac{\partial H}{\partial S}\right)_{P,N} = T , \qquad \left(\frac{\partial H}{\partial P}\right)_{S,N} = V , \qquad \left(\frac{\partial H}{\partial N}\right)_{S,P} = \mu . \qquad (3.1.12)$$

For isobaric processes, (dH)P,N = T dS = δQ = dE + P dV , thus the change in the enthalpy is equal to the change in the internal energy plus the energy change in the device supplying constant pressure (see Fig. 3.1). The weight FG including the piston of area A holds the pressure constant at P = FG /A. The change in the enthalpy is the sum of the change in the internal energy and the change in the potential energy of the weight. For a process at constant pressure, the heat δQ supplied to the system equals the increase in the system’s enthalpy.


Fig. 3.1. The change in the enthalpy in isobaric processes; the weight FG produces the constant pressure P = FG /A, where A is the area of the piston.

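Free Enthalpy (Gibbs Free Energy): $G = G(T, P, N)$

The Gibbs free energy is defined as

$$G = E - TS + PV . \qquad (3.1.13)$$

Its differential follows from (3.1.3):

$$dG = -S\, dT + V\, dP + \mu\, dN . \qquad (3.1.14)$$

From Eq. (3.1.14), we can immediately read off

$$\left(\frac{\partial G}{\partial T}\right)_{P,N} = -S , \qquad \left(\frac{\partial G}{\partial P}\right)_{T,N} = V , \qquad \left(\frac{\partial G}{\partial N}\right)_{T,P} = \mu . \qquad (3.1.15)$$

The Grand Potential: $\Phi = \Phi(T, V, \mu)$

The grand potential is defined as

$$\Phi = E - TS - \mu N \quad \bigl(= -kT \log Z_G(T, V, \mu)\bigr) ; \qquad (3.1.16)$$

in parentheses we give the connection to the grand partition function (Chap. 2). The differential expressions are

$$d\Phi = -S\, dT - P\, dV - N\, d\mu , \qquad (3.1.17)$$

$$\left(\frac{\partial \Phi}{\partial T}\right)_{V,\mu} = -S , \qquad \left(\frac{\partial \Phi}{\partial V}\right)_{T,\mu} = -P , \qquad \left(\frac{\partial \Phi}{\partial \mu}\right)_{T,V} = -N . \qquad (3.1.18)$$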

3.1.2 The Legendre Transformation

The transition from $E$ to the thermodynamic potentials defined in (3.1.6), (3.1.10), (3.1.13), and (3.1.16) was carried out by means of so-called Legendre transformations, whose general structure will now be considered. We begin with a function $Y$ which depends on the variables $x_1, x_2, \ldots$,

$$Y = Y(x_1, x_2, \ldots) . \qquad (3.1.19)$$

The partial derivatives of $Y$ in terms of the $x_i$ are

$$a_i(x_1, x_2, \ldots) = \left(\frac{\partial Y}{\partial x_i}\right)_{\{x_j,\, j \ne i\}} . \qquad (3.1.20a)$$

Our goal is now to replace the independent variable $x_1$ by the partial derivative $\frac{\partial Y}{\partial x_1}$ as independent variable, i.e. for example to change from the independent variable $S$ to $T$. This has a definite practical application, since the temperature is directly and readily measurable, while the entropy is not. The total differential of $Y$ is given by

$$dY = a_1\, dx_1 + a_2\, dx_2 + \ldots \qquad (3.1.20b)$$

From the rearrangement $dY = d(a_1 x_1) - x_1\, da_1 + a_2\, dx_2 + \ldots$, it follows that

$$d(Y - a_1 x_1) = -x_1\, da_1 + a_2\, dx_2 + \ldots . \qquad (3.1.21)$$

It is then expedient to introduce the function

$$Y_1 = Y - a_1 x_1 , \qquad (3.1.22)$$

and to treat it as a function of the variables $a_1, x_2, \ldots$ (natural variables).$^1$ Thus, for example, the natural variables of the (Helmholtz) free energy are $T$, $V$, and $N$. The differential of $Y_1(a_1, x_2, \ldots)$ has the following form in terms of these independent variables:

$$dY_1 = -x_1\, da_1 + a_2\, dx_2 + \ldots \qquad (3.1.21'a)$$

and its partial derivatives are

$$\left(\frac{\partial Y_1}{\partial a_1}\right)_{x_2, \ldots} = -x_1 , \qquad \left(\frac{\partial Y_1}{\partial x_2}\right)_{a_1, \ldots} = a_2 , \; \ldots \qquad (3.1.21'b)$$

In this manner, one can obtain 8 thermodynamic potentials corresponding to the three pairs of variables. Table 3.1 collects the most important of these, i.e. the ones already introduced above.

$^1$ We make an additional remark here about the geometric significance of the Legendre transformation, referring to the case of a single variable: a curve can be represented either as a series of points $Y = Y(x_1)$, or through the family of its envelopes. In the latter representation, the intercepts of the tangential envelope lines on the ordinate as a function of their slopes $a_1$ are required. This geometric meaning of the Legendre transformation is the basis of the construction of $G(T, P)$ from $F(T, V)$ shown in Fig. 3.33. [If one simply eliminated $x_1$ in $Y = Y(x_1)$ in favor of $a_1$, then one would indeed obtain $Y$ as a function of $a_1$, but it would no longer be possible to reconstruct $Y(x_1)$].
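The structure of the Legendre transformation can be made concrete with a single-variable example. The following sketch is purely illustrative: the convex test function $Y(x_1) = x_1^4/4$, the bracketing interval, and the helper names are hypothetical choices. It solves $a_1 = \partial Y/\partial x_1$ for $x_1$ by bisection, forms $Y_1 = Y - a_1 x_1$ as in (3.1.22), and checks numerically that $\partial Y_1/\partial a_1 = -x_1$.

```python
def legendre_transform(y, dy_dx, a, lo, hi, tol=1e-12):
    """Return (Y1, x1) with Y1 = Y(x1) - a * x1, where x1 solves dY/dx = a.

    Assumes dy_dx is strictly increasing on [lo, hi] (convex Y), so that
    bisection finds the unique solution.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if dy_dx(mid) < a:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    x1 = 0.5 * (lo + hi)
    return y(x1) - a * x1, x1


def y_test(x):
    """Hypothetical convex function Y(x1) = x1^4 / 4."""
    return 0.25 * x ** 4


def slope_test(x):
    """Its derivative a1(x1) = x1^3."""
    return x ** 3


a1 = 8.0
y1, x1 = legendre_transform(y_test, slope_test, a1, lo=0.0, hi=10.0)

# Numerical derivative of Y1 with respect to its natural variable a1
h = 1e-5
y1_plus, _ = legendre_transform(y_test, slope_test, a1 + h, lo=0.0, hi=10.0)
y1_minus, _ = legendre_transform(y_test, slope_test, a1 - h, lo=0.0, hi=10.0)
dy1_da1 = (y1_plus - y1_minus) / (2.0 * h)

print(x1, y1, dy1_da1)
```

For this test function the transform can also be written in closed form, $Y_1(a_1) = -\tfrac{3}{4} a_1^{4/3}$, which gives $x_1 = 2$, $Y_1 = -12$ and $\partial Y_1/\partial a_1 = -2$ at $a_1 = 8$.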

Table 3.1. Energy, entropy, and thermodynamic potentials

| State function | Independent variables | Differentials |
|---|---|---|
| Energy $E$ | $S, V, \{N_j\}$ | $dE = T\,dS - P\,dV + \sum_j \mu_j\, dN_j$ |
| Entropy $S$ | $E, V, \{N_j\}$ | $dS = \frac{1}{T}\,dE + \frac{P}{T}\,dV - \sum_j \frac{\mu_j}{T}\, dN_j$ |
| Free Energy $F = E - TS$ | $T, V, \{N_j\}$ | $dF = -S\,dT - P\,dV + \sum_j \mu_j\, dN_j$ |
| Enthalpy $H = E + PV$ | $S, P, \{N_j\}$ | $dH = T\,dS + V\,dP + \sum_j \mu_j\, dN_j$ |
| Gibbs Free Energy $G = E - TS + PV$ | $T, P, \{N_j\}$ | $dG = -S\,dT + V\,dP + \sum_j \mu_j\, dN_j$ |
| Grand Potential $\Phi = E - TS - \sum_j \mu_j N_j$ | $T, V, \{\mu_j\}$ | $d\Phi = -S\,dT - P\,dV - \sum_j N_j\, d\mu_j$ |

This table contains the generalization to systems with several components (see Sect. 3.9). $N_j$ and $\mu_j$ are the particle number and the chemical potential of the $j$-th component. The previous formulas are found as a special case when the index $j$ and $\sum_j$ are omitted.

F, H, G and Φ are called thermodynamic potentials, since taking their derivatives with respect to the natural independent variables leads to the conjugate variables, analogously to the derivation of the components of force from the potential in mechanics. For the entropy, this notation is clearly less useful, since entropy does not have the dimensions of an energy. E, F, H, G and Φ are related to each other through Legendre transformations. The natural variables are also termed canonical variables. In a system consisting of only one chemical substance with a ﬁxed number of particles, the state is completely characterized by specifying two quantities, e.g. T and V or V and P . All the other thermodynamic quantities can be calculated from the thermal and the caloric equations of state. If the state is characterized by T and V , then the pressure is given by the (thermal) equation of state P = P (T, V ) . (The explicit form for a particular substance is found from statistical mechanics.) If we plot P against T and V in a three-dimensional graph, we obtain the surface of the equation of state (or P V T surface); see Fig. 2.7 and below in Sect. 3.8.

3.1 Potentials and Laws of Equilibrium Thermodynamics


3.1.3 The Gibbs–Duhem Relation in Homogeneous Systems

In this section, we will concentrate on the important case of homogeneous thermodynamic systems.² Consider a system of this kind with the energy $E$, the volume $V$, and the particle number $N$. Now we imagine a second system which is completely similar in its properties but is simply larger by a factor $\alpha$. Its energy, volume, and particle number are then $\alpha E$, $\alpha V$, and $\alpha N$. Owing to the additivity of the entropy, it is given by

$$ S(\alpha E, \alpha V, \alpha N) = \alpha S(E, V, N) . \tag{3.1.23} $$

As a result, the entropy $S$ is a homogeneous function of first order in $E$, $V$ and $N$. Correspondingly, $E$ is a homogeneous function of first order in $S$, $V$ and $N$. There are two types of state variables: $E, V, N, S, F, H, G$, and $\Phi$ are called extensive, since they are proportional to $\alpha$ when the system is enlarged as described above. $T$, $P$, and $\mu$ are intensive, since they are independent of $\alpha$; e.g. we find

$$ T^{-1} = \frac{\partial S}{\partial E} = \frac{\partial\, \alpha S}{\partial\, \alpha E} \sim \alpha^0 , $$

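and this independence follows in a similar manner from the definitions of the other intensive variables, also. We wish to investigate the consequences of the homogeneity of $S$ [Eq. (3.1.23)]. To this end, we differentiate (3.1.23) with respect to $\alpha$ and then set $\alpha = 1$:

$$ \left. \left( \frac{\partial S}{\partial\, \alpha E}\, E + \frac{\partial S}{\partial\, \alpha V}\, V + \frac{\partial S}{\partial\, \alpha N}\, N \right) \right|_{\alpha = 1} = S . $$

From this, we find using (3.1.2) that $-S + \frac{1}{T} E + \frac{P}{T} V - \frac{\mu}{T} N = 0$, that is

$$ E = TS - PV + \mu N . \tag{3.1.24} $$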

This is the Gibbs–Duhem relation. Together with $dE = T\,dS - P\,dV + \mu\,dN$, we derive from Eq. (3.1.24)

$$ S\,dT - V\,dP + N\,d\mu = 0 , \tag{3.1.24'} $$

the differential Gibbs–Duhem relation. It states that in a homogeneous system, $T$, $P$ and $\mu$ cannot be varied independently, and it gives the relationship between the variations of these intensive quantities.³ The following expressions can be derived from the Gibbs–Duhem relation:

$$ G(T, P, N) = \mu(T, P)\, N \tag{3.1.25} $$

and

$$ \Phi(T, V, \mu) = -P(T, \mu)\, V . \tag{3.1.26} $$

Justification: from the definition (3.1.13), it follows immediately using (3.1.24) that $G = \mu N$, and from (3.1.15) we find $\mu = \left(\frac{\partial G}{\partial N}\right)_{T,P} = \mu + N \left(\frac{\partial \mu}{\partial N}\right)_{T,P}$; it follows that $\mu$ must be independent of $N$. We have thus demonstrated (3.1.25). Similarly, it follows from (3.1.16) that $\Phi = -PV$, and due to $-P = \left(\frac{\partial \Phi}{\partial V}\right)_{T,\mu}$, $P$ must be independent of $V$.

Further conclusions following from homogeneity (in the canonical ensemble with independent variables $T$, $V$, and $N$) can be obtained starting with

$$ P(T, V, N) = P(T, \alpha V, \alpha N) \quad \text{and} \quad \mu(T, V, N) = \mu(T, \alpha V, \alpha N) \tag{3.1.27a,b} $$

again by taking derivatives with respect to $\alpha$ around the point $\alpha = 1$:

$$ \left(\frac{\partial P}{\partial V}\right)_{T,N} V + \left(\frac{\partial P}{\partial N}\right)_{T,V} N = 0 \quad \text{and} \quad \left(\frac{\partial \mu}{\partial V}\right)_{T,N} V + \left(\frac{\partial \mu}{\partial N}\right)_{T,V} N = 0 . \tag{3.1.28a,b} $$

These two relations merely state that for intensive quantities, a volume increase is equivalent to a decrease in the number of particles.

² Homogeneous systems have the same specific properties in all spatial regions; they may also consist of several types of particles. Examples of inhomogeneous systems are those in a position-dependent potential and systems consisting of several phases which are in equilibrium, although in this case the individual phases can still be homogeneous.

³ The generalization to systems with several components is given in Sect. 3.9, Eq. (3.9.7).
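The homogeneity property (3.1.23) and the relation (3.1.24) can be checked numerically. The following is a minimal sketch, assuming the classical monatomic ideal gas with the entropy (2.7.27) and the caloric equation of state E = (3/2)NkT; the helium-like atomic mass, the state point, and the helper names `entropy` and `partial` are illustrative choices, not part of the text.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34   # Planck constant, J s
MASS = 6.646e-27            # kg; roughly a helium-4 atom (illustrative assumption)


def entropy(E, V, N):
    """Ideal-gas entropy S(E, V, N) = N k [5/2 + log((V/N) / lambda^3)]."""
    T = 2.0 * E / (3.0 * N * K_B)  # from E = (3/2) N k T
    lam = H_PLANCK / math.sqrt(2.0 * math.pi * MASS * K_B * T)  # thermal wavelength
    return N * K_B * (2.5 + math.log((V / N) / lam**3))


def partial(f, args, i, rel=1e-6):
    """Central finite-difference derivative of f with respect to args[i]."""
    step = rel * args[i]
    up = list(args)
    dn = list(args)
    up[i] += step
    dn[i] -= step
    return (f(*up) - f(*dn)) / (2.0 * step)


# One mole at 300 K in 24.8 liters (illustrative state point).
N = 6.022e23
T0 = 300.0
E = 1.5 * N * K_B * T0
V = 0.0248
state = (E, V, N)
S = entropy(*state)

# Homogeneity (3.1.23): S(aE, aV, aN) = a S(E, V, N).
alpha = 3.0
homogeneity_error = abs(entropy(alpha * E, alpha * V, alpha * N) - alpha * S) / S

# Intensive variables from the derivatives of S: 1/T, P/T and -mu/T.
T = 1.0 / partial(entropy, state, 0)
P = T * partial(entropy, state, 1)
mu = -T * partial(entropy, state, 2)

# Relation (3.1.24): E = T S - P V + mu N.
euler_residual = abs(E - (T * S - P * V + mu * N)) / E

print(homogeneity_error, euler_residual)
```

Both residuals should come out at the level of rounding and finite-difference error, which is what (3.1.23) and (3.1.24) require for any homogeneous system.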

3.2 Derivatives of Thermodynamic Quantities

3.2.1 Definitions

In this section, we will define the most important thermodynamic derivatives. In the following definitions, the particle number is always held constant. The heat capacity is defined as

$$ C = \frac{\delta Q}{dT} = T\, \frac{dS}{dT} . \tag{3.2.1} $$

It gives the quantity of heat which is required to raise the temperature of a body by 1 K. We still have to specify which thermodynamic variables are held constant during this heat transfer. The most important cases are that the volume or the pressure is held constant. If the heat is transferred at constant volume, the heat capacity at constant volume is relevant:

$$ C_V = T \left(\frac{\partial S}{\partial T}\right)_{V,N} = \left(\frac{\partial E}{\partial T}\right)_{V,N} . \tag{3.2.2a} $$


In rearranging $(\partial S/\partial T)_{V,N}$, we have used Eq. (3.1.1). If the heat transfer takes place under constant pressure, then the heat capacity at constant pressure from (3.2.1) must be used:

$$ C_P = T \left(\frac{\partial S}{\partial T}\right)_{P,N} = \left(\frac{\partial H}{\partial T}\right)_{P,N} . \tag{3.2.2b} $$

For the rearrangement of the definition, we employed (3.1.11). If we divide the heat capacity by the mass of the substance or body, we obtain the specific heat, in general denoted as $c$, or $c_V$ at constant volume or $c_P$ at constant pressure. The specific heat is measured in units of J kg$^{-1}$ K$^{-1}$. The specific heat may also be referred to 1 g and quoted in the (non-SI) units cal g$^{-1}$ K$^{-1}$. The molar heat capacity (heat capacity per mole) gives the heat capacity of one mole of the substance. It is obtained from the specific heat referred to 1 g, multiplied by the molecular weight of the substance.

Remark: We will later show in general using Eq. (3.2.24) that the specific heat at constant pressure is larger than that at constant volume. The physical origin of this difference can be readily seen by writing the First Law for constant $N$ in the form $\delta Q = dE + P\,dV$ and setting $dE = \left(\frac{\partial E}{\partial T}\right)_V dT + \left(\frac{\partial E}{\partial V}\right)_T dV = C_V\,dT + \left(\frac{\partial E}{\partial V}\right)_T dV$, that is

$$ \delta Q = C_V\,dT + \left[ P + \left(\frac{\partial E}{\partial V}\right)_T \right] dV . $$

In addition to the quantity of heat $C_V\,dT$ necessary for warming at constant volume, when $V$ is increased, more heat is consumed by the work against the pressure, $P\,dV$, and by the change in the internal energy, $(\partial E/\partial V)_T\,dV$. For $C_P = \left(\frac{\delta Q}{dT}\right)_P$, it then follows from the last relation that

$$ C_P = C_V + \left[ P + \left(\frac{\partial E}{\partial V}\right)_T \right] \left(\frac{\partial V}{\partial T}\right)_P . $$

Further important thermodynamic derivatives are the compressibility, the coefficient of thermal expansion, and the thermal pressure coefficient. The compressibility is defined in general by

$$ \kappa = -\frac{1}{V} \frac{dV}{dP} . $$

It is a measure of the relative volume decrease on increasing the pressure. For compression at a constant temperature, the isothermal compressibility, defined by

$$ \kappa_T = -\frac{1}{V} \left(\frac{\partial V}{\partial P}\right)_{T,N} , \tag{3.2.3a} $$

is the relevant quantity. For (reversible) processes in which no heat is transferred, i.e. when the entropy remains constant, the adiabatic (isentropic) compressibility


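$$ \kappa_S = -\frac{1}{V} \left(\frac{\partial V}{\partial P}\right)_{S,N} \tag{3.2.3b} $$

must be introduced. The coefficient of thermal expansion is defined as

$$ \alpha = \frac{1}{V} \left(\frac{\partial V}{\partial T}\right)_{P,N} . \tag{3.2.4} $$

The definition of the thermal pressure coefficient is given by

$$ \beta = \frac{1}{P} \left(\frac{\partial P}{\partial T}\right)_{V,N} . \tag{3.2.5} $$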

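Quantities such as $C$, $\kappa$, and $\alpha$ are examples of so-called susceptibilities. They indicate how strongly an extensive quantity varies on changing (increasing) an intensive quantity.

3.2.2 Integrability and the Maxwell Relations

3.2.2.1 The Maxwell Relations

The Maxwell relations are expressions relating the thermodynamic derivatives; they follow from the integrability conditions. From the total differential of the function $Y = Y(x_1, x_2)$,

$$ dY = a_1\,dx_1 + a_2\,dx_2 , \qquad a_1 = \left(\frac{\partial Y}{\partial x_1}\right)_{x_2} , \quad a_2 = \left(\frac{\partial Y}{\partial x_2}\right)_{x_1} , \tag{3.2.6} $$

we find as a result of the commutativity of the order of the derivatives, $\left(\frac{\partial a_1}{\partial x_2}\right)_{x_1} = \frac{\partial^2 Y}{\partial x_2\, \partial x_1} = \frac{\partial^2 Y}{\partial x_1\, \partial x_2} = \left(\frac{\partial a_2}{\partial x_1}\right)_{x_2}$, the following integrability condition:

$$ \left(\frac{\partial a_1}{\partial x_2}\right)_{x_1} = \left(\frac{\partial a_2}{\partial x_1}\right)_{x_2} . \tag{3.2.7} $$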

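All together, there are 12 different Maxwell relations. The relations for fixed $N$ are:

$$ E: \ \left(\frac{\partial T}{\partial V}\right)_S = -\left(\frac{\partial P}{\partial S}\right)_V , \qquad F: \ \left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial P}{\partial T}\right)_V \tag{3.2.8a,b} $$

$$ H: \ \left(\frac{\partial T}{\partial P}\right)_S = \left(\frac{\partial V}{\partial S}\right)_P \quad \text{or} \quad \left(\frac{\partial S}{\partial V}\right)_P = \left(\frac{\partial P}{\partial T}\right)_S \tag{3.2.9} $$

$$ G: \ \left(\frac{\partial S}{\partial P}\right)_T = -\left(\frac{\partial V}{\partial T}\right)_P = -V\alpha . \tag{3.2.10} $$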
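As an illustration, the Maxwell relations (3.2.8b) and (3.2.10) can be verified by finite differences. The sketch below assumes the classical ideal gas, with the entropy taken from (2.7.27) and the equation of state PV = NkT; the particle mass, the state point, and all function names are hypothetical choices made for this example.

```python
import math

K_B = 1.380649e-23          # J/K
H_PLANCK = 6.62607015e-34   # J s
MASS = 6.646e-27            # kg, helium-like atom (illustrative assumption)
N = 6.022e23                # fixed particle number


def thermal_wavelength(T):
    return H_PLANCK / math.sqrt(2.0 * math.pi * MASS * K_B * T)


def entropy_TV(T, V):
    """S(T, V) = N k [5/2 + log((V/N) / lambda^3)]."""
    return N * K_B * (2.5 + math.log((V / N) / thermal_wavelength(T) ** 3))


def pressure_TV(T, V):
    return N * K_B * T / V


def volume_TP(T, P):
    return N * K_B * T / P


def entropy_TP(T, P):
    return entropy_TV(T, volume_TP(T, P))


def d_first(f, x, y, rel=1e-6):
    """Partial derivative with respect to the first argument (second held fixed)."""
    h = rel * x
    return (f(x + h, y) - f(x - h, y)) / (2.0 * h)


def d_second(f, x, y, rel=1e-6):
    """Partial derivative with respect to the second argument (first held fixed)."""
    h = rel * y
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)


T, V = 300.0, 0.0248
P = pressure_TV(T, V)

# F: (dS/dV)_T = (dP/dT)_V
lhs_F = d_second(entropy_TV, T, V)
rhs_F = d_first(pressure_TV, T, V)

# G: (dS/dP)_T = -(dV/dT)_P
lhs_G = d_second(entropy_TP, T, P)
rhs_G = -d_first(volume_TP, T, P)

print(lhs_F, rhs_F)
print(lhs_G, rhs_G)
```

Here the entropy and the equation of state are specified independently, so the agreement of the two sides is a genuine consistency check rather than an identity built into the code.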


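Here, we have labeled the Maxwell relations with the quantity from whose differential the relation is derived. There are also relations containing $N$ and $\mu$; of these, we shall require the following in this book:

$$ F: \ \left(\frac{\partial \mu}{\partial V}\right)_{T,N} = -\left(\frac{\partial P}{\partial N}\right)_{T,V} . \tag{3.2.11} $$

Applying this relation to homogeneous systems, we find from (3.1.28a) and (3.1.28b):

$$ \left(\frac{\partial \mu}{\partial N}\right)_{T,V} = -\frac{V}{N} \left(\frac{\partial \mu}{\partial V}\right)_{T,N} = \frac{V}{N} \left(\frac{\partial P}{\partial N}\right)_{T,V} = -\frac{V^2}{N^2} \left(\frac{\partial P}{\partial V}\right)_{T,N} = \frac{V}{N^2} \frac{1}{\kappa_T} . \tag{3.2.12} $$

∗3.2.2.2 Integrability Conditions, Exact and Inexact Differentials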

It may be helpful at this point to show the connection between the integrability conditions and the results of vector analysis as they apply to classical mechanics. We consider a vector field $\mathbf{F}(\mathbf{x})$, which is defined within the simply-connected region $G$ (this field could for example be a force field). Then the following statements are equivalent:

(I) $\mathbf{F}(\mathbf{x}) = -\nabla V(\mathbf{x})$ with $V(\mathbf{x}) = -\int_{\mathbf{x}_0}^{\mathbf{x}} d\mathbf{x}' \cdot \mathbf{F}(\mathbf{x}')$, where $\mathbf{x}_0$ is an arbitrary fixed point of origin and the line integral is to be taken along an arbitrary path from $\mathbf{x}_0$ to $\mathbf{x}$. This means that $\mathbf{F}(\mathbf{x})$ can be derived from a potential.

(II) $\operatorname{curl} \mathbf{F} = 0$ at each point in $G$.

(III) $\oint d\mathbf{x} \cdot \mathbf{F}(\mathbf{x}) = 0$ along each closed path in $G$.

(IV) $\int_{\mathbf{x}_1}^{\mathbf{x}_2} d\mathbf{x} \cdot \mathbf{F}(\mathbf{x})$ is independent of the path.

Let us return to thermodynamics. We consider a system characterized by two independent thermodynamic variables $x$ and $y$ and a quantity whose differential variation is given by

$$ dY = A(x, y)\,dx + B(x, y)\,dy . \tag{3.2.13} $$

In the notation of mechanics, $\mathbf{F} = (A(x, y), B(x, y), 0)$. The existence of a state variable $Y$, i.e. a state function $Y(x, y)$ (Statement (I′)) is equivalent to each of the three other statements (II′, III′, and IV′).

(I′) A state function $Y(x, y)$ exists, with $Y(x, y) = Y(x_0, y_0) + \int_{(x_0, y_0)}^{(x, y)} \left[ dx'\, A(x', y') + dy'\, B(x', y') \right]$.

(II′) $\left(\frac{\partial B}{\partial x}\right)_y = \left(\frac{\partial A}{\partial y}\right)_x$.

(III′) $\oint \left[ dx\, A(x, y) + dy\, B(x, y) \right] = 0$.

(IV′) $\int_{P_0}^{P_1} \left[ dx\, A(x, y) + dy\, B(x, y) \right]$ is independent of the path.


Fig. 3.2. Illustrating the path integrals III and IV

The differential (3.2.13) is called an exact differential (or a perfect differential) when the coefficients $A$ and $B$ fulfill the integrability condition (II′).

3.2.2.3 The Non-integrability of δQ and δW

We can now prove that $\delta Q$ and $\delta W$ are not integrable. We first consider $\delta W$ and imagine the independent thermodynamic variables to be $V$ and $T$. Then the relation (3.1.4a) becomes

$$ \delta W = -P\,dV + 0 \cdot dT . \tag{3.2.14} $$

The derivative of the pressure with respect to the temperature at constant volume is nonzero, $\left(\frac{\partial P}{\partial T}\right)_V \neq 0$, while of course the derivative of zero with respect to $V$ gives zero. That is, the integrability condition is not fulfilled. Analogously, we write (3.1.5) in the form

$$ \delta Q = T\,dS + 0 \cdot dV . \tag{3.2.15} $$

Again, we have

$$ \left(\frac{\partial T}{\partial V}\right)_S = -\frac{\left(\frac{\partial S}{\partial V}\right)_T}{\left(\frac{\partial S}{\partial T}\right)_V} = -\frac{\left(\frac{\partial P}{\partial T}\right)_V}{\left(\frac{\partial S}{\partial T}\right)_V} \neq 0 , $$

i.e. the integrability condition is not fulfilled. Therefore, there are no state functions $W(V, T, N)$ and $Q(V, T, N)$ whose differentials are equal to $\delta W$ and $\delta Q$. This is the reason for the different notation used in the differential signs. The expressions relating the heat transferred to the system and the work performed on it to the state variables exist only in differential form. One can, of course, compute the integral $\int_1 \delta Q = \int_1 T\,dS$ along a given path (e.g. path 1 in Fig. 3.2), and similarly for $\delta W$, but the values of these integrals depend not only on their starting and end points, but also on the details of the path which connects those points.
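The path dependence of $\int \delta W$, in contrast to the path independence of $\int dS$, can be made concrete with a small numerical sketch. It assumes one mole of a monatomic ideal gas ($P = NkT/V$, $C_V = \frac{3}{2}Nk$, so that $dS = (C_V/T)\,dT + (Nk/V)\,dV$); the two paths, the state points, and the helper `line_integral` are illustrative assumptions.

```python
import math

NK = 8.314        # N*k for one mole, J/K
CV = 1.5 * NK     # heat capacity of a monatomic ideal gas (assumption)


def line_integral(A, B, path, steps=2000):
    """Integrate A(V,T) dV + B(V,T) dT along straight segments (midpoint rule)."""
    total = 0.0
    for (v0, t0), (v1, t1) in zip(path, path[1:]):
        dv = (v1 - v0) / steps
        dt = (t1 - t0) / steps
        for i in range(steps):
            s = (i + 0.5) / steps
            v = v0 + s * (v1 - v0)
            t = t0 + s * (t1 - t0)
            total += A(v, t) * dv + B(v, t) * dt
    return total


# Work performed on the gas, Eq. (3.2.14): dW = -P dV + 0 dT.
def work_A(v, t):
    return -NK * t / v


def work_B(v, t):
    return 0.0


# Entropy differential of the ideal gas: dS = (Nk/V) dV + (C_V/T) dT.
def entropy_A(v, t):
    return NK / v


def entropy_B(v, t):
    return CV / t


V1, T1 = 0.010, 300.0
V2, T2 = 0.020, 600.0
path_a = [(V1, T1), (V2, T1), (V2, T2)]  # expand isothermally, then heat
path_b = [(V1, T1), (V1, T2), (V2, T2)]  # heat first, then expand isothermally

W_a = line_integral(work_A, work_B, path_a)
W_b = line_integral(work_A, work_B, path_b)
S_a = line_integral(entropy_A, entropy_B, path_a)
S_b = line_integral(entropy_A, entropy_B, path_b)

print("work on gas:", W_a, W_b)      # differs between the two paths
print("entropy change:", S_a, S_b)   # the same for both paths
```

The two work integrals differ by a factor of two here (the isothermal leg is run at $T_1$ on one path and at $T_2$ on the other), while the entropy change is the same along both paths.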

Remark: In the case that a diﬀerential does not fulﬁll the integrability condition, δY = A(x, y)dx + B(x, y)dy ,


but can be converted into an exact differential through multiplication by a factor $g(x, y)$, then $g(x, y)$ is termed an integrating factor. Thus, $\frac{1}{T}$ is an integrating factor for $\delta Q$. In statistical mechanics, it is found quite naturally that the entropy is a state function, i.e. $dS$ is an exact differential. In the historical development of thermodynamics, it was a decisive and nontrivial discovery that multiplication of $\delta Q$ by $\frac{1}{T}$ yields an exact differential.

3.2.3 Jacobians

It is often necessary to transform from one pair of thermodynamic variables to a different pair. For the necessary recalculation of the thermodynamic derivatives, it is expedient to use Jacobians. In the following, we consider functions of two variables: $f(u, v)$ and $g(u, v)$. We define the Jacobian determinant:

$$ \frac{\partial(f, g)}{\partial(u, v)} = \begin{vmatrix} \left(\frac{\partial f}{\partial u}\right)_v & \left(\frac{\partial f}{\partial v}\right)_u \\ \left(\frac{\partial g}{\partial u}\right)_v & \left(\frac{\partial g}{\partial v}\right)_u \end{vmatrix} = \left(\frac{\partial f}{\partial u}\right)_v \left(\frac{\partial g}{\partial v}\right)_u - \left(\frac{\partial f}{\partial v}\right)_u \left(\frac{\partial g}{\partial u}\right)_v . \tag{3.2.16} $$

This Jacobian fulfills a series of important relations. Let $u = u(x, y)$ and $v = v(x, y)$ be functions of $x$ and $y$; then the following chain rule can be proved in an elementary fashion:

$$ \frac{\partial(f, g)}{\partial(x, y)} = \frac{\partial(f, g)}{\partial(u, v)} \frac{\partial(u, v)}{\partial(x, y)} . \tag{3.2.17} $$

This relation is important for the changes of variables which are frequently needed in thermodynamics. Setting $g = v$, the definition (3.2.16) is simplified to

$$ \frac{\partial(f, v)}{\partial(u, v)} = \left(\frac{\partial f}{\partial u}\right)_v . \tag{3.2.18} $$

Since a determinant changes its sign on interchanging two columns, we have

$$ \frac{\partial(f, g)}{\partial(v, u)} = -\frac{\partial(f, g)}{\partial(u, v)} . \tag{3.2.19} $$

If we apply the chain rule (3.2.17) for $x = f$ and $y = g$, we find:

$$ \frac{\partial(f, g)}{\partial(u, v)} \frac{\partial(u, v)}{\partial(f, g)} = 1 . \tag{3.2.20} $$

Setting $g = v$ in (3.2.20), we obtain with (3.2.18)

$$ \left(\frac{\partial f}{\partial u}\right)_v = \frac{1}{\left(\frac{\partial u}{\partial f}\right)_v} . \tag{3.2.20'} $$


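Finally, from (3.2.18) we have

$$ \left(\frac{\partial f}{\partial u}\right)_v = \frac{\partial(f, v)}{\partial(u, v)} = \frac{\partial(f, v)}{\partial(f, u)} \frac{\partial(f, u)}{\partial(u, v)} = -\frac{\left(\frac{\partial f}{\partial v}\right)_u}{\left(\frac{\partial u}{\partial v}\right)_f} . \tag{3.2.21} $$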

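Using this relation, one can thus transform a derivative at constant $v$ into derivatives at constant $u$ and $f$. The relations given here can also be applied to functions of more than two variables, provided the additional variables are held constant.

3.2.4 Examples

(i) We first derive some useful relations between the thermodynamic derivatives. Using Eqns. (3.2.21), (3.2.3a), and (3.2.4), we obtain

$$ \left(\frac{\partial P}{\partial T}\right)_V = -\frac{\left(\frac{\partial V}{\partial T}\right)_P}{\left(\frac{\partial V}{\partial P}\right)_T} = \frac{\alpha}{\kappa_T} . \tag{3.2.22} $$

Thus, the thermal pressure coefficient $\beta = \frac{1}{P} \left(\frac{\partial P}{\partial T}\right)_V$ [Eq. (3.2.5)] is related to the coefficient of thermal expansion $\alpha$ and the isothermal compressibility $\kappa_T$. In problem 3.4, it is shown that

$$ \frac{C_P}{C_V} = \frac{\kappa_T}{\kappa_S} \tag{3.2.23} $$

[cf. (3.2.3a,b)]. Furthermore, we see that

$$ \begin{aligned} C_V &= T\, \frac{\partial(S, V)}{\partial(T, V)} = T\, \frac{\partial(S, V)}{\partial(T, P)} \frac{\partial(T, P)}{\partial(T, V)} \\ &= T \left[ \left(\frac{\partial S}{\partial T}\right)_P \left(\frac{\partial V}{\partial P}\right)_T - \left(\frac{\partial S}{\partial P}\right)_T \left(\frac{\partial V}{\partial T}\right)_P \right] \left(\frac{\partial P}{\partial V}\right)_T \\ &= C_P - T\, \frac{\left(\frac{\partial S}{\partial P}\right)_T \left(\frac{\partial V}{\partial T}\right)_P}{\left(\frac{\partial V}{\partial P}\right)_T} = C_P + T\, \frac{\left(\frac{\partial V}{\partial T}\right)_P^2}{\left(\frac{\partial V}{\partial P}\right)_T} . \end{aligned} $$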

Here, the Maxwell relation (3.2.10) was used. Thus we find for the heat capacities

$$ C_P - C_V = \frac{T V \alpha^2}{\kappa_T} . \tag{3.2.24} $$

With $\kappa_T C_P - \kappa_T C_V = T V \alpha^2$ and $\kappa_T C_V = \kappa_S C_P$, it follows that the compressibilities obey the relation

$$ \kappa_T - \kappa_S = \frac{T V \alpha^2}{C_P} . \tag{3.2.25} $$

It follows from (3.2.24) that the two heat capacities can become equal only when the coefficient of expansion $\alpha$ vanishes or $\kappa_T$ becomes very large. The former occurs in the case of water at 4 °C.


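(ii) We now evaluate the thermodynamic derivatives for the classical ideal gas, based on Sect. 2.7. For the enthalpy $H = E + PV$, it follows from Eqns. (2.7.25) and (2.7.28) that

$$ H = \frac{5}{2} N k T . \tag{3.2.26} $$

Then, for the heat capacities, we find

$$ C_V = \left(\frac{\partial E}{\partial T}\right)_V = \frac{3}{2} N k , \qquad C_P = \left(\frac{\partial H}{\partial T}\right)_P = \frac{5}{2} N k ; \tag{3.2.27} $$

and for the compressibilities,

$$ \kappa_T = -\frac{1}{V} \left(\frac{\partial V}{\partial P}\right)_T = \frac{1}{P} , \qquad \kappa_S = \kappa_T \frac{C_V}{C_P} = \frac{3}{5P} , \tag{3.2.28} $$

finally, for the thermal expansion coefficient and the thermal pressure coefficient, we find

$$ \alpha = \frac{1}{V} \left(\frac{\partial V}{\partial T}\right)_P = \frac{1}{T} \quad \text{and} \quad \beta = \frac{1}{P} \left(\frac{\partial P}{\partial T}\right)_V = \frac{1}{P} \frac{\alpha}{\kappa_T} = \frac{1}{T} . \tag{3.2.29a,b} $$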
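As a quick consistency check, the ideal-gas values (3.2.27)–(3.2.29) can be inserted into the general relations (3.2.22), (3.2.24), and (3.2.25). The following worked lines use only those values and the equation of state $PV = NkT$:

```latex
\begin{align*}
\frac{\alpha}{\kappa_T} &= \frac{1/T}{1/P} = \frac{P}{T} = \frac{Nk}{V}
  = \left(\frac{\partial P}{\partial T}\right)_V , \\
\frac{T V \alpha^2}{\kappa_T} &= \frac{T V P}{T^2} = \frac{PV}{T} = Nk
  = \tfrac{5}{2} Nk - \tfrac{3}{2} Nk = C_P - C_V , \\
\frac{T V \alpha^2}{C_P} &= \frac{V/T}{\tfrac{5}{2} Nk} = \frac{2V}{5NkT} = \frac{2}{5P}
  = \frac{1}{P} - \frac{3}{5P} = \kappa_T - \kappa_S .
\end{align*}
```

All three general relations are thus satisfied by the ideal gas, as they must be.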

3.3 Fluctuations and Thermodynamic Inequalities

This section is concerned with fluctuations of the energy and the particle number, and belongs contextually to the preceding chapter. We are only now treating these phenomena because the final results are expressed in terms of thermodynamic derivatives, whose definitions and properties are only now at our disposal.

3.3.1 Fluctuations

1. We consider a canonical ensemble, characterized by the temperature $T$, the volume $V$, the fixed particle number $N$, and the density matrix

$$ \rho = \frac{e^{-\beta H}}{Z} , \qquad Z = \operatorname{Tr} e^{-\beta H} . $$

The average value of the energy [Eq. (2.6.37)] is given by

$$ \bar{E} = \frac{1}{Z} \operatorname{Tr} e^{-\beta H} H = \frac{1}{Z} \frac{\partial Z}{\partial(-\beta)} . \tag{3.3.1} $$

Taking the temperature derivative of (3.3.1),

$$ \left(\frac{\partial \bar{E}}{\partial T}\right)_V = \frac{1}{kT^2} \frac{\partial \bar{E}}{\partial(-\beta)} = \frac{1}{kT^2} \left( \langle H^2 \rangle - \langle H \rangle^2 \right) = \frac{1}{kT^2} (\Delta E)^2 , $$


we obtain after substitution of (3.2.2a) the following relation between the specific heat at constant volume and the mean square deviation of the internal energy:

$$ C_V = \frac{1}{kT^2} (\Delta E)^2 . \tag{3.3.2} $$

2. Next, we start with the grand canonical ensemble, characterized by $T$, $V$, $\mu$, and the density matrix

$$ \rho_G = Z_G^{-1} e^{-\beta(H - \mu N)} , \qquad Z_G = \operatorname{Tr} e^{-\beta(H - \mu N)} . $$

The average particle number is given by

$$ \bar{N} = \operatorname{Tr} \rho_G N = kT\, Z_G^{-1} \frac{\partial Z_G}{\partial \mu} . \tag{3.3.3} $$

Its derivative with respect to the chemical potential is

$$ \left(\frac{\partial \bar{N}}{\partial \mu}\right)_{T,V} = \beta \left( \langle N^2 \rangle - \bar{N}^2 \right) = \beta (\Delta N)^2 . $$

If we replace the left side by (3.2.12), we obtain the following relation between the isothermal compressibility and the mean square deviation of the particle number:

$$ \kappa_T = -\frac{1}{V} \left(\frac{\partial V}{\partial P}\right)_{T,N} = \frac{V}{N^2} \left(\frac{\partial N}{\partial \mu}\right)_{T,V} = \frac{V}{N^2}\, \beta (\Delta N)^2 . \tag{3.3.4} $$

Eqns. (3.3.2) and (3.3.4) are fundamental examples of relations between susceptibilities (on the left-hand sides) and fluctuations, so-called fluctuation-response theorems.

3.3.2 Inequalities

From the relations derived in 3.3.1, we derive (as a result of the positivity of the fluctuations) the following inequalities:

$$ \kappa_T \geq 0 , \tag{3.3.5} $$

$$ C_P \geq C_V \geq 0 . \tag{3.3.6} $$

In (3.3.6), we have used the fact that according to (3.2.24) and (3.3.5), $C_P$ is larger than $C_V$. On decreasing the volume, the pressure increases. On increasing the energy, the temperature increases. The validity of these inequalities is a precondition for the stability of matter. If, for example, (3.3.5) were not valid, compression of the system would decrease its pressure; it would thus be further compressed and would finally collapse.
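The fluctuation-response relation (3.3.2) can be illustrated with a toy model. The sketch below assumes a system with a handful of discrete energy levels (the level values, the temperature, and the units with k = 1 are hypothetical choices) and compares the heat capacity obtained from the temperature derivative of the mean energy with the one obtained from the energy fluctuations.

```python
import math


def canonical_moments(levels, T, k=1.0):
    """Return the mean energy and the variance (Delta E)^2 in the canonical ensemble."""
    beta = 1.0 / (k * T)
    weights = [math.exp(-beta * e) for e in levels]
    Z = sum(weights)  # partition function
    mean = sum(e * w for e, w in zip(levels, weights)) / Z
    mean_sq = sum(e * e * w for e, w in zip(levels, weights)) / Z
    return mean, mean_sq - mean**2


levels = [0.0, 1.0, 1.0, 2.5]   # illustrative spectrum, arbitrary energy units
T = 0.8
dT = 1e-5

# Heat capacity from the derivative of the mean energy with respect to T.
E_plus, _ = canonical_moments(levels, T + dT)
E_minus, _ = canonical_moments(levels, T - dT)
C_from_derivative = (E_plus - E_minus) / (2.0 * dT)

# Heat capacity from the fluctuations, Eq. (3.3.2) with k = 1.
_, var_E = canonical_moments(levels, T)
C_from_fluctuations = var_E / T**2

print(C_from_derivative, C_from_fluctuations)
```

Since the variance is non-negative by construction, the same calculation also makes the inequality $C_V \geq 0$ of (3.3.6) evident for this model.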


3.4 Absolute Temperature and Empirical Temperatures

The absolute temperature was defined in (2.4.4) as $T^{-1} = \left(\frac{\partial S(E, V, N)}{\partial E}\right)_{V,N}$. Experimentally, one uses a temperature $\vartheta$, which is for example given by the length of a rod or a column of mercury, or the volume or the pressure of a gas thermometer. We assume that the empirical temperature $\vartheta$ increases monotonically with $T$, i.e. that $\vartheta$ also increases when we put heat into the system. We now seek a method of determining the absolute temperature from $\vartheta$, that is, we seek the relation $T = T(\vartheta)$. To this end, we start with the thermodynamic difference quotient $\left(\frac{\delta Q}{dP}\right)_T$:

$$ \left(\frac{\delta Q}{dP}\right)_T = T \left(\frac{\partial S}{\partial P}\right)_T = -T \left(\frac{\partial V}{\partial T}\right)_P = -T \left(\frac{\partial V}{\partial \vartheta}\right)_P \frac{d\vartheta}{dT} . \tag{3.4.1} $$

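Here, we have substituted in turn $\delta Q = T\,dS$, the Maxwell relation (3.2.10), and $T = T(\vartheta)$. It follows that

$$ \frac{1}{T} \frac{dT}{d\vartheta} = -\frac{\left(\frac{\partial V}{\partial \vartheta}\right)_P}{\left(\frac{\delta Q}{dP}\right)_T} = -\left(\frac{\partial V}{\partial \vartheta}\right)_P \left(\frac{dP}{\delta Q}\right)_\vartheta . \tag{3.4.2} $$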

This expression is valid for any substance. The right-hand side can be measured experimentally and yields a function of $\vartheta$. Therefore, (3.4.2) represents an ordinary inhomogeneous differential equation for $T(\vartheta)$, whose integration yields

$$ T = \text{const} \cdot f(\vartheta) . \tag{3.4.3} $$

We thus obtain a unique relation between the empirical temperature $\vartheta$ and the absolute temperature. The constant can be chosen freely due to the arbitrary nature of the empirical temperature scale. The absolute temperature scale is determined by defining the triple point of water to be $T_t = 273.16$ K.

For magnetic thermometers, it follows from $\left(\frac{\delta Q}{dB}\right)_T = T \left(\frac{\partial S}{\partial B}\right)_T = T \left(\frac{\partial M}{\partial T}\right)_B$ (cf. Chap. 6), analogously,

$$ \frac{1}{T} \frac{dT}{d\vartheta} = \left(\frac{\partial M}{\partial \vartheta}\right)_B \left(\frac{dB}{\delta Q}\right)_\vartheta . \tag{3.4.4} $$

The absolute temperature

$$ T = \left(\frac{\partial S}{\partial E}\right)_{V,N}^{-1} \tag{3.4.5} $$

is positive, since the number of accessible states ($\propto \Omega(E)$) is a rapidly increasing function of the energy. The minimum value of the absolute temperature


is $T = 0$ (except for systems which have energetic upper bounds, such as an assembly of paramagnetic spins). This follows from the distribution of energy levels $E$ in the neighborhood of the ground-state energy $E_0$. We can see from the models which we have already evaluated explicitly (quantum-mechanical harmonic oscillators, paramagnetic moments: Sects. 2.5.2.1 and 2.5.2.2) that $\lim_{E \to E_0} S'(E) = \infty$, and thus for these systems, which are generic with respect to their low-lying energy levels,

$$ \lim_{E \to E_0} T = 0 . $$

We return once more to the determination of the temperature scale through Eq. (3.4.3) in terms of $T_t = 273.16$ K. As mentioned in Sect. 2.3, the value of the Boltzmann constant is also fixed by this relation. In order to see this, we consider a system whose equation of state at $T_t$ is known. Molecular hydrogen can be treated as an ideal gas at $T_t$ and $P = 1$ atm. The density of H$_2$ under these conditions is $\rho = 8.989 \times 10^{-2}$ g/liter $= 8.989 \times 10^{-5}$ g/cm$^3$. Its molar volume then has the value

$$ V_M = \frac{2.016\ \text{g}}{8.989 \times 10^{-2}\ \text{g liter}^{-1}} = 22.414\ \text{liters} . $$

One mole is defined as: 1 mole corresponds to a mass equal to the atomic weight in g (e.g. a mole of H$_2$ has a mass of 2.016 g). From this fact, we can determine the Boltzmann constant:

$$ k = \frac{PV}{NT} = \frac{1\ \text{atm}\ V_M}{N_A \times 273.16\ \text{K}} = 1.38066 \times 10^{-16}\ \text{erg/K} = 1.38066 \times 10^{-23}\ \text{J/K} . \tag{3.4.6} $$

Here, Avogadro's number was used:

$$ N_A \equiv \text{number of molecules per mole} = \frac{2.016\ \text{g}}{\text{mass of H}_2} = \frac{2.016\ \text{g}}{2 \times 1.6734 \times 10^{-24}\ \text{g}} = 6.0221 \times 10^{23}\ \text{mol}^{-1} . $$

Further definitions of units and constants, e.g. the gas constant $R$, are given in Appendix I.
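The arithmetic leading to (3.4.6) can be retraced in a few lines. This is a sketch using the rounded values quoted above together with the standard conversion 1 atm = 101325 Pa (an input not stated in the text); the variable names are illustrative. Because the quoted inputs are rounded, recomputing the intermediate quantities from them reproduces the value of k only to about three significant figures.

```python
ATM_IN_PA = 101325.0        # 1 atm in Pa (standard conversion, assumed here)
MOLAR_MASS_H2 = 2.016       # g/mol
DENSITY_H2 = 8.989e-2       # g/liter at T_t and 1 atm
MASS_H_ATOM = 1.6734e-24    # g
T_TRIPLE = 273.16           # K

# Route 1: recompute the molar volume and Avogadro's number from the rounded inputs.
molar_volume_liters = MOLAR_MASS_H2 / DENSITY_H2
avogadro = MOLAR_MASS_H2 / (2.0 * MASS_H_ATOM)
k_recomputed = ATM_IN_PA * (molar_volume_liters * 1e-3) / (avogadro * T_TRIPLE)

# Route 2: use the values of V_M and N_A exactly as quoted in the text.
k_from_quoted = ATM_IN_PA * 22.414e-3 / (6.0221e23 * T_TRIPLE)

print(molar_volume_liters, avogadro)
print(k_recomputed, k_from_quoted)   # both in J/K
```

Both routes give a value close to the quoted $1.38066 \times 10^{-23}$ J/K; the small spread between them only reflects the rounding of the input data.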

3.5 Thermodynamic Processes

In this section, we want to treat thermodynamic processes, i.e. processes which either during the whole course of their time development or at least in their initial or final stages can be sufficiently well described by thermodynamics.


3.5.1 Thermodynamic Concepts

We begin by introducing several concepts of thermodynamics which we will later use repeatedly (cf. Table 3.2). Processes in which the pressure is held constant, i.e. P = const, are called isobaric; those in which the volume remains constant, V = const, are isochoral; those in which the entropy is constant, S = const, are isentropic; and those in which no heat is transferred, i.e. δQ = 0, are termed adiabatic (thermally isolated).

Table 3.2. Some thermodynamic concepts

| Concept | Definition |
|---|---|
| isobaric | P = const. |
| isochoral | V = const. |
| isothermal | T = const. |
| isentropic | S = const. |
| adiabatic | δQ = 0 |
| extensive | proportional to the size of the system |
| intensive | independent of the size of the system |

We mention here another definition of the terms extensive and intensive, which is equivalent to the one given in the section on the Gibbs–Duhem relation. We divide a system that is characterized by the thermodynamic variable Y into two parts, which are themselves characterized by Y1 and Y2. In the case that Y1 + Y2 = Y, Y is called extensive; when Y1 = Y2 = Y, it is termed intensive (see Fig. 3.3).

Fig. 3.3. The deﬁnition of extensive and intensive thermodynamic variables

Extensive variables include: V, N, E, S, the thermodynamic potentials, the electric polarization P, and the magnetization M. Intensive variables include: P, µ, T, the electric field E, and the magnetic field B.

Quasistatic process: a quasistatic process takes place slowly with respect to the characteristic relaxation time of the system, i.e. the time within which


the system passes from a nonequilibrium state to an equilibrium state, so that the system remains in equilibrium at each moment during such a process. Typical relaxation times are of the order of $\tau = 10^{-10}$ to $10^{-9}$ sec.

An irreversible process is one which cannot take place in the reverse direction, e.g. the transition from a nonequilibrium state to an equilibrium state (the initial state could also be derived from an equilibrium state with restrictions by lifting of those restrictions). Experience shows that a system which is not in equilibrium moves towards equilibrium; in this process, its entropy increases. The system then remains in equilibrium and does not return to the nonequilibrium state.

Reversible processes: reversible processes are those which can also occur in the reverse direction. An essential attribute of reversibility is that a process which takes place in a certain direction can be followed by the reverse process in such a manner that no changes in the surroundings remain.

The characterization of a thermodynamic state (with a fixed particle number N) can be accomplished by specifying two quantities, e.g. T and V, or P and V. The remaining quantities can be found from the thermal and the caloric equations of state. A system in which a quasistatic process is occurring, i.e. which is in thermal equilibrium at each moment in time, can be represented by a curve, for example in a P–V diagram (Fig. 2.7b).

A reversible process must in all cases be quasistatic. In non-quasistatic processes, turbulent flows and temperature fluctuations take place, leading to the irreversible production of heat. The intermediate states in a non-quasistatic process can furthermore not be sufficiently characterized by P and V. One requires for their characterization more degrees of freedom, or in other words, a space of higher dimensionality. There are also quasistatic processes which are irreversible (e.g. temperature equalization via a poor heat conductor, 3.6.3.1; or a Gay-Lussac experiment carried out slowly, 3.6.3.6). Even in such processes, equilibrium thermodynamics is valid for the individual components of the system.

Remark: We note that thermodynamics rests on equilibrium statistical mechanics. In reversible processes, the course of events is so slow that the system is in equilibrium at each moment; in irreversible processes, this is true of at least the initial and final states, and thermodynamics can be applied to these states. In the following sections, we will clarify the concepts just introduced on the basis of some typical examples. In particular, we will investigate how the entropy changes during the course of a process.


3.5.2 The Irreversible Expansion of a Gas; the Gay-Lussac Experiment (1807)

The Gay-Lussac experiment⁴ deals with the adiabatic expansion of a gas and is carried out as follows: a container of volume $V$ which is insulated from its surroundings is divided by a partition into two subvolumes, $V_1$ and $V_2$. Initially, the volume $V_1$ contains a gas at a temperature $T$, while $V_2$ is evacuated. The partition is then removed and the gas flows rapidly into $V_2$ (Fig. 3.4).

Fig. 3.4. The Gay-Lussac experiment

After the gas has reached equilibrium in the whole volume $V = V_1 + V_2$, its thermodynamic quantities are determined. We first assume that this experiment is carried out using an ideal gas. The initial state is completely characterized by its volume $V_1$ and the temperature $T$. The entropy and the pressure before the expansion are, from (2.7.27) and (2.7.25), given by

$$ S = N k \left[ \frac{5}{2} + \log \frac{V_1 / N}{\lambda^3} \right] \quad \text{and} \quad P = \frac{N k T}{V_1} , $$

with the thermal wavelength $\lambda$:

$$ \lambda = \frac{h}{\sqrt{2 \pi m k T}} . $$

In the final state, the volume is now $V = V_1 + V_2$. The temperature is still equal to $T$, since the energy remains constant and the caloric equation of state of ideal gases, $E = \frac{3}{2} k T N$, contains no dependence on the volume. The entropy and the pressure after the expansion are:

$$ S' = N k \left[ \frac{5}{2} + \log \frac{V / N}{\lambda^3} \right] , \qquad P' = \frac{N k T}{V} . $$

We can see that in this process, there is an entropy production of

$$ \Delta S = S' - S = N k \log \frac{V}{V_1} > 0 . \tag{3.5.1} $$

⁴ Louis Joseph Gay-Lussac, 1778–1850. The goal of Gay-Lussac's experiments was to determine the volume dependence of the internal energy of gases.


It is intuitively clear that the process is irreversible. Since the entropy increases and no heat is transferred ($\delta Q = 0$), the mathematical criterion for an irreversible process, Eq. (3.6.8) (which remains to be proved), is fulfilled. The initial and final states in the Gay-Lussac experiment are equilibrium states and can be treated with equilibrium thermodynamics. The intermediate states are in general not equilibrium states, and equilibrium thermodynamics can therefore make no statements about them. Only when the expansion is carried out as a quasistatic process can equilibrium thermodynamics be applied at each moment. This would be the case if the expansion were carried out by allowing a piston to move slowly (either by moving a frictionless piston in a series of small steps without performing work, or by slowing the expansion of the gas by means of the friction of the piston and transferring the resulting frictional heat back into the gas).

For an arbitrary isolated gas, the temperature change per unit volume at constant energy is given by

$$ \left(\frac{\partial T}{\partial V}\right)_E = -\frac{\left(\frac{\partial E}{\partial V}\right)_T}{\left(\frac{\partial E}{\partial T}\right)_V} = -\frac{T \left(\frac{\partial S}{\partial V}\right)_T - P}{C_V} = \frac{1}{C_V} \left[ P - T \left(\frac{\partial P}{\partial T}\right)_V \right] , \tag{3.5.2a} $$

where the Maxwell relation $\left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial P}{\partial T}\right)_V$ has been employed. This coefficient has the value 0 for an ideal gas, but for real gases it can have either a positive or a negative sign. The entropy production is, owing to $dE = T\,dS - P\,dV = 0$, given by

$$ \left(\frac{\partial S}{\partial V}\right)_E = \frac{P}{T} > 0 , \tag{3.5.2b} $$

i.e. $dS > 0$. Furthermore, no heat is exchanged with the surroundings, that is, $\delta Q = 0$. Therefore, it follows that the inequality between the change in the entropy and the quantity of heat transferred

$$ T\,dS > \delta Q \tag{3.5.3} $$

holds here. The coefficients calculated from equilibrium thermodynamics (3.5.2a,b) can be applied to the whole course of the Gay-Lussac experiment if the process is carried out in a quasistatic manner. Yet it remains an irreversible process! By integration of (3.5.2a,b), one obtains the differences in temperature and entropy between the final and initial states. The result can by the way also be applied to the non-quasistatic irreversible process, since the two final states are identical. We shall return to the quasistatic, irreversible Gay-Lussac experiment in 3.6.3.6.
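To see how (3.5.2a) is used in practice, the following sketch integrates it for a model real gas. The van der Waals equation of state, the constant heat capacity $C_V = \frac{3}{2}R$, and the parameter values are assumptions made purely for illustration (they are not taken from this section); for this model, $P - T(\partial P/\partial T)_V = -a/V^2$ per mole, so the gas cools on free expansion.

```python
R = 8.314          # gas constant, J/(mol K)
A_VDW = 0.137      # Pa m^6 / mol^2 (illustrative van der Waals parameter)
B_VDW = 3.87e-5    # m^3 / mol (illustrative van der Waals parameter)
CV = 1.5 * R       # constant molar heat capacity (assumption)


def pressure(T, V):
    """Van der Waals pressure for one mole (model assumption)."""
    return R * T / (V - B_VDW) - A_VDW / V**2


def dT_dV_const_E(T, V, rel=1e-6):
    """Eq. (3.5.2a): (dT/dV)_E = [P - T (dP/dT)_V] / C_V, with dP/dT by finite differences."""
    h = rel * T
    dP_dT = (pressure(T + h, V) - pressure(T - h, V)) / (2.0 * h)
    return (pressure(T, V) - T * dP_dT) / CV


def expand(T1, V1, V2, steps=1000):
    """Integrate dT/dV = (dT/dV)_E from V1 to V2 with the classical RK4 scheme."""
    T, V = T1, V1
    h = (V2 - V1) / steps
    for _ in range(steps):
        k1 = dT_dV_const_E(T, V)
        k2 = dT_dV_const_E(T + 0.5 * h * k1, V + 0.5 * h)
        k3 = dT_dV_const_E(T + 0.5 * h * k2, V + 0.5 * h)
        k4 = dT_dV_const_E(T + h * k3, V + h)
        T += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        V += h
    return T


T1, V1, V2 = 300.0, 1.0e-3, 2.0e-3   # doubling the volume of one mole
T2 = expand(T1, V1, V2)

# Closed form for this model: Delta T = (a / C_V) (1/V2 - 1/V1).
delta_T_closed_form = (A_VDW / CV) * (1.0 / V2 - 1.0 / V1)

print(T2 - T1, delta_T_closed_form)
```

Setting `A_VDW = 0` in this sketch gives a vanishing temperature change, in line with the statement that the coefficient (3.5.2a) is zero for an ideal gas.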


3.5.3 The Statistical Foundation of Irreversibility

How irreversible is the Gay-Lussac process? In order to understand why the Gay-Lussac experiment is irreversible, we consider the case that the volume increase $\delta V$ fulfills the inequality $\delta V \ll V$, where $V$ now means the initial volume (see Fig. 3.5).

Fig. 3.5. Illustration of the Gay-Lussac experiment

In the expansion from $V$ to $V + \delta V$, the phase-space surface changes from $\Omega(E, V)$ to $\Omega(E, V + \delta V)$, and therefore the entropy changes from $S(E, V)$ to $S(E, V + \delta V)$. After the gas has carried out this expansion, we ask what the probability would be of finding the system in only the subvolume $V$. Employing (1.3.2), (2.2.4), and (2.3.4), we find this probability to be given by

$$ \begin{aligned} W(E, V) &= \frac{\int_V \frac{dq\, dp}{N!\, h^{3N}}\, \delta(H - E)}{\Omega(E, V + \delta V)} = \frac{\Omega(E, V)}{\Omega(E, V + \delta V)} = e^{-(S(E, V + \delta V) - S(E, V))/k} \\ &= e^{-\left(\frac{\partial S}{\partial V}\right)_E \delta V / k} = e^{-\frac{P}{T} \delta V / k} = e^{-\frac{\delta V}{V} N} \ll 1 , \end{aligned} \tag{3.5.4} $$

where in the last rearrangement, we have assumed an ideal gas. Due to the factor $N \approx 10^{23}$ in the exponent, the probability that the system will return spontaneously to the volume $V$ is vanishingly small. In general, it is found that the probability that a constraint (a restriction $C$) occurs spontaneously is

$$ W(E, C) = e^{-(S(E) - S(E, C))/k} . \tag{3.5.5} $$

We find that $S(E, C) \ll S(E)$, since under the constraint, fewer states are accessible. The difference $S(E) - S(E, C)$ is macroscopic; in the case of the change in volume, it was proportional to $N\, \delta V / V$, and the probability $W(E, C) \sim e^{-N}$ is thus practically zero. The transition from a state with a constraint $C$ to one without this restriction is irreversible, since the probability that the system will spontaneously search out a state with this constraint is vanishingly small.
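The orders of magnitude behind (3.5.4) are easy to make explicit. The sketch below assumes an ideal gas, for which $\Omega \propto V^N$ at fixed energy, so that the return probability is $(V/(V + \delta V))^N$; the relative volume change $10^{-3}$ and the particle numbers are illustrative choices. Because the probability itself underflows any floating-point format for macroscopic N, the code works with its decadic logarithm.

```python
import math


def log10_return_probability(n_particles, dv_over_v):
    """log10 of W = (V / (V + dV))**N for an ideal gas (Omega proportional to V**N)."""
    return -n_particles * math.log1p(dv_over_v) / math.log(10.0)


dv_over_v = 1e-3   # relative volume increase (illustrative)
for n in (10, 1000, 1e6, 1e23):
    print(n, log10_return_probability(n, dv_over_v))

# For small dV/V this agrees with the exponent -N dV/V of Eq. (3.5.4).
approx_log10 = -10 * dv_over_v / math.log(10.0)
exact_log10 = log10_return_probability(10, dv_over_v)

# A direct evaluation underflows to zero for macroscopic particle numbers.
direct = math.exp(-1e23 * dv_over_v)
```

For ten particles the return probability is still close to one, whereas for $N \approx 10^{23}$ its logarithm is of the order of $-10^{19}$, which is the quantitative content of the statement that the transition is irreversible.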


3.5.4 Reversible Processes

In the first subsection, we consider the reversible isothermal and adiabatic expansion of ideal gases, which illustrate the concept of reversibility and are important in their own right as elements of thermodynamic processes.

3.5.4.1 Typical Examples: the Reversible Expansion of a Gas

In the reversible expansion of an ideal gas, work is performed on a spring by the expanding gas and energy is stored in the spring (Fig. 3.6). This energy can later be used to compress the gas again; the process is thus reversible. It can be seen as a reversible variation of the Gay-Lussac experiment. Such a process can be carried out isothermally or adiabatically.

Fig. 3.6. The reversible isothermal expansion of a gas, where the work performed is stored by a spring. The work performed by the gas is equal to the area below the isotherm in the P − V diagram.

a) Isothermal Expansion of a Gas, T = const. We ﬁrst consider the isothermal expansion. Here, the gas container is in a heat bath at a temperature T . On expansion from the initial volume V1 to the ﬁnal volume V , the gas performs the work:5 V W=

V P dV =

V1

dV

V N kT = N kT log . V V1

(3.5.6)

V1

This work can be visualized as the area below the isotherm in the P–V diagram (Fig. 3.6). Since the temperature remains constant, the energy of the ideal gas is also unchanged. Therefore, the heat bath must transfer a quantity of heat

\[ Q = \mathcal{W} \tag{3.5.7} \]

to the system.

5 We distinguish the work performed by the system ($\mathcal{W}$) from the work performed on the system ($W$) by using different symbols, implying opposite signs: $\mathcal{W} = -W$.

The change in the entropy during this isothermal expansion is given according to (2.7.27) by:

\[ \Delta S = Nk \log\frac{V}{V_1} . \tag{3.5.8} \]

Comparison of (3.5.6) with (3.5.8) shows us that the entropy increase and the quantity of heat taken up by the system here obey the following relation:

\[ \Delta S = \frac{Q}{T} . \tag{3.5.9} \]
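The relations (3.5.6) to (3.5.9) can be checked numerically with a short sketch. The choice of one mole of gas doubling its volume at 300 K is an assumption made only for illustration, as are the function and variable names.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K

def isothermal_work(n_particles, temperature, v_initial, v_final):
    """Work performed BY the ideal gas, Eq. (3.5.6): N k T log(V/V1)."""
    return n_particles * K_B * temperature * math.log(v_final / v_initial)

def isothermal_entropy_change(n_particles, v_initial, v_final):
    """Entropy change of the gas, Eq. (3.5.8): N k log(V/V1)."""
    return n_particles * K_B * math.log(v_final / v_initial)

# Illustrative values (assumptions): one mole doubling its volume at 300 K.
N = 6.02214076e23
T = 300.0
work = isothermal_work(N, T, 1.0e-3, 2.0e-3)
heat = work  # Eq. (3.5.7): the energy is unchanged, so Q equals the work
delta_s = isothermal_entropy_change(N, 1.0e-3, 2.0e-3)
print(f"W = Q = {work:.1f} J, dS = {delta_s:.3f} J/K, Q/T = {heat / T:.3f} J/K")
```

The last two printed numbers agree, which is the content of (3.5.9) for this particular process.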

This process is reversible, since using the energy stored in the spring, one could compress the gas back to its original volume. In this compression, the gas would release the quantity of heat Q to the heat bath. The final state of the system and its surroundings would then again be identical to their original state. In order for the process to occur in a quasistatic way, the strength of the spring must be varied during the expansion or compression in such a way that it exactly compensates the gas pressure P (see the discussion in Sect. 3.5.4.2). One could imagine the storage and release of the energy from the work of compression or expansion in an idealized thought experiment to be carried out by the horizontal displacement of small weights, which would cost no energy.

We return again to the example of the irreversible expansion (Sect. 3.5.2). Clearly, by performing work in this case we could also compress the gas after its expansion back to its original volume, but then we would increase its energy in the process. The work required for this compression is finite and its magnitude is proportional to the change in volume; in contrast to the case of reversible processes, it cannot in principle be made equal to zero.

b) Adiabatic Expansion of a Gas, ∆Q = 0

We now turn to the adiabatic reversible expansion. In contrast to Fig. 3.6, the gas container is now insulated from its surroundings, and the curves in the P–V diagram are steeper. In every step of the process, δQ = 0, and since work is here also performed by the gas on its surroundings, it cools on expansion. It then follows from the First Law that dE = −P dV. If we insert the caloric and the thermal equations of state into this equation, we find:

\[ \frac{dT}{T} = -\frac{2}{3}\,\frac{dV}{V} . \tag{3.5.10} \]

Integration of the last equation leads to the two forms of the adiabatic equation for an ideal gas:

\[ T = T_1 \left( V_1/V \right)^{2/3} \quad\text{and}\quad P = NkT_1 V_1^{2/3}\, V^{-5/3} , \tag{3.5.11a,b} \]

where the equation of state was again used to obtain b. We now once more determine the work W(V ) performed on expansion from V1 to V . It is clearly less than in the case of the isothermal expansion, since no heat is transferred from the surroundings. Correspondingly, the area beneath the adiabats is smaller than that beneath the isotherms (cf. Fig. 3.7). Inserting Eq. (3.5.11b) yields for the work:

Fig. 3.7. An isotherm and an adiabat passing through the initial point (P1 , V1 ), with P1 = N kT1 /V1

\[ \mathcal{W}(V) = \int_{V_1}^{V} dV\,P = \frac{3}{2}\,NkT_1 \left[ 1 - \left( \frac{V}{V_1} \right)^{-2/3} \right] ; \tag{3.5.12} \]

geometrically, this is the area beneath the adiabat in Fig. 3.7. The change in the entropy is given by

\[ \Delta S = Nk \log\frac{V \lambda_1^3}{V_1 \lambda^3} = 0 , \tag{3.5.13} \]

and it is equal to zero. We are dealing here with a reversible process in an isolated system (∆Q = 0), and find ∆S = 0, i.e. the entropy remains unchanged. This is not surprising, since for each infinitesimal step in the process,

\[ T\,dS = \delta Q = 0 \tag{3.5.14} \]

holds.
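The statements about the adiabatic expansion can be verified with a few lines of Python. This is a sketch for a monatomic ideal gas in units where N k = 1; the expansion from V1 = 1 to V = 2 at T1 = 300 and all function names are illustrative assumptions.

```python
import math

def adiabatic_temperature(t1, v1, v):
    """Eq. (3.5.11a): T = T1 (V1/V)^(2/3) for a monatomic ideal gas."""
    return t1 * (v1 / v) ** (2.0 / 3.0)

def adiabatic_work(nk, t1, v1, v):
    """Eq. (3.5.12): work performed by the gas, (3/2) N k T1 [1 - (V/V1)^(-2/3)]."""
    return 1.5 * nk * t1 * (1.0 - (v / v1) ** (-2.0 / 3.0))

def isothermal_work(nk, t1, v1, v):
    """Eq. (3.5.6): work performed by the gas at constant temperature T1."""
    return nk * t1 * math.log(v / v1)

def entropy_change(nk, t1, v1, t, v):
    """Eq. (3.5.13), using lambda proportional to T^(-1/2)."""
    return nk * math.log((v / v1) * (t / t1) ** 1.5)

# Illustrative values (assumptions): N k = 1, doubling of the volume.
t_final = adiabatic_temperature(300.0, 1.0, 2.0)
w_ad = adiabatic_work(1.0, 300.0, 1.0, 2.0)
w_iso = isothermal_work(1.0, 300.0, 1.0, 2.0)
ds = entropy_change(1.0, 300.0, 1.0, t_final, 2.0)
print(t_final, w_ad, w_iso, ds)
```

The adiabatic work comes out smaller than the isothermal work, it equals the loss of internal energy (3/2) N k (T1 − T), and the entropy change vanishes up to rounding error.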

∗3.5.4.2 General Considerations of Real, Reversible Processes

We wish to consider to what extent the situation of a reversible process can indeed be realized in practice. If the process can occur in both directions, what decides in which direction it in fact proceeds? To answer this question, in Fig. 3.8 we consider a process which takes place between the points 1 and 2.


Fig. 3.8. A reversible process. P is the internal pressure of the system (solid line). Pa is the external pressure produced by the spring (dashed line).

The solid curve can be an isotherm or a polytrope (i.e. an equilibrium curve which lies between isotherms and adiabats). Along the path from 1 to 2, the working substance expands, and from 2 to 1, it is compressed again, back to its initial state 1 without leaving any change in the surroundings. At each moment, the pressure within the working substance is precisely compensated by the external pressure (produced here by a spring). This quasistatic reversible process is, of course, an idealization. In order for the expansion to occur at all, the external pressure $P_a^{\mathrm{Ex}}$ must be somewhat lower than P during the expansion phase of the process. The external pressure is indicated in Fig. 3.8 by the dashed curve. This curve, which is supposed to characterize the real course of the process, is drawn as a dashed line to indicate that a curve in the P–V diagram cannot fully characterize the system. In the expansion phase with $P_a < P$, the gas near the piston is somewhat rarefied. This effectively reduces its pressure, and the work performed by the gas is slightly less than would correspond to its actual pressure. Density gradients occur, i.e. there is a non-equilibrium state. The work obtained (which is stored as potential energy in the spring), $\int_1^2 dV\,P_a^{\mathrm{Ex}}$, then obeys the inequality

\[ \int_1^2 dV\,P_a^{\mathrm{Ex}} \;\ldots \]

… 1, and therefore for every substance, the slope of the adiabats, P = P(V, S = const.), is steeper than that of the isotherms, P = P(V, T = const.).

For a classical ideal gas, we find κ = const.6 and

\[ \left( \frac{\partial P}{\partial V} \right)_T = -\frac{NkT}{V^2} = -\frac{P}{V} . \]

It thus follows from (3.5.18)

\[ \left( \frac{\partial P}{\partial V} \right)_S = -\kappa\,\frac{P}{V} . \tag{3.5.20} \]

The solution of this differential equation is

\[ P\,V^{\kappa} = \text{const} , \]

and with the aid of the equation of state, we then find

\[ T\,V^{\kappa - 1} = \text{const} . \tag{3.5.21} \]

For a monatomic ideal gas, we have $\kappa = \frac{\frac{3}{2} + 1}{\frac{3}{2}} = \frac{5}{3}$, where we have made use of (3.2.27).
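The integration step leading from (3.5.20) to (3.5.21) can be written out as follows. This is only a worked sketch under the stated assumption that κ is constant; C denotes an arbitrary integration constant introduced here.

```latex
\begin{align*}
\left(\frac{\partial P}{\partial V}\right)_S = -\kappa\,\frac{P}{V}
  \;&\Longrightarrow\; \frac{dP}{P} = -\kappa\,\frac{dV}{V} \quad (S = \text{const.}) \\
  &\Longrightarrow\; \log P = -\kappa \log V + C
  \;\Longrightarrow\; P\,V^{\kappa} = \text{const.} \\
P = \frac{NkT}{V}
  \;&\Longrightarrow\; NkT\,V^{\kappa-1} = \text{const.}
  \;\Longrightarrow\; T\,V^{\kappa-1} = \text{const.}
\end{align*}
```

For κ = 5/3 these are the adiabatic equations (3.5.11a,b) of the monatomic ideal gas.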

3.6 The First and Second Laws of Thermodynamics

3.6.1 The First and the Second Law for Reversible and Irreversible Processes

3.6.1.1 Quasistatic and in Particular Reversible Processes

We recall the formulation of the First and Second Laws of Thermodynamics in Eqns. (3.1.3) and (3.1.5). In the case of reversible transitions between an equilibrium state and a neighboring, infinitesimally close equilibrium state, we have

\[ dE = \delta Q - P\,dV + \mu\,dN \tag{3.6.1} \]

with

\[ \delta Q = T\,dS . \tag{3.6.2} \]

6 This is evident for a monatomic classical ideal gas from (3.2.27). For a molecular ideal gas as treated in Chap. 5, the specific heats are temperature independent only in those temperature regions where particular internal degrees of freedom are completely excited or not excited at all.

Equations (3.6.1) and (3.6.2) are the mathematical formulations of the First and Second Laws. The Second Law in the form of Eq. (3.6.2) holds for reversible (and thus necessarily quasistatic) processes. It is also valid for quasistatic irreversible processes within those subsystems which are in equilibrium at every instant in time and in which only quasistatic transitions from an equilibrium state to a neighboring equilibrium state take place. (An example of this is the thermal equilibration of two bodies via a poor heat conductor (see Sect. 3.6.3.1). The overall system is not in equilibrium, and the process is irreversible. However, the equilibration takes place so slowly that the two bodies within themselves are in equilibrium states at every moment in time.)

3.6.1.2 Irreversible Processes

For arbitrary processes, the First Law holds in the form given in Eq. (3.1.3′):

\[ dE = \delta Q + \delta W + \delta E_N , \tag{3.6.1'} \]

where $\delta Q$, $\delta W$, and $\delta E_N$ are the quantity of heat transferred, the work performed on the system, and the increase in energy through addition of matter. In order to formulate the Second Law with complete generality, we recall the relation (2.3.4) for the entropy of the microcanonical ensemble and consider the following situation: we start with two systems 1 and 2 which are initially separated and are thus not in equilibrium with each other; their entropies are $S_1$ and $S_2$. We now bring these two systems into contact. The entropy of this nonequilibrium state is

\[ S_{\mathrm{initial}} = S_1 + S_2 . \tag{3.6.3} \]

Suppose the two systems to be insulated from their environment and their total energy, volume, and particle number to be given by E, V and N. Now the overall system passes into the microcanonical equilibrium state corresponding to these macroscopic values. Owing to the additivity of entropy, the total entropy after equilibrium has been reached is given by

\[ S_{1+2}(E, V, N) = S_1(\tilde{E}_1, \tilde{V}_1, \tilde{N}_1) + S_2(\tilde{E}_2, \tilde{V}_2, \tilde{N}_2) , \tag{3.6.4} \]

where $\tilde{E}_1, \tilde{V}_1, \tilde{N}_1$ ($\tilde{E}_2, \tilde{V}_2, \tilde{N}_2$) are the most probable values of these quantities in the subsystem 1 (2). Since the equilibrium entropy is a maximum (Eq. 2.3.5), the following inequality holds:

\[ S_1 + S_2 = S_{\mathrm{initial}} \le S_{1+2}(E, V, N) = S_1(\tilde{E}_1, \tilde{V}_1, \tilde{N}_1) + S_2(\tilde{E}_2, \tilde{V}_2, \tilde{N}_2) . \tag{3.6.5} \]

Whenever the initial density matrix of the combined systems 1+2 is not already equal to the microcanonical density matrix, the inequality sign holds.


We now apply the inequality (3.6.5) to various physical situations.

(A) Let an isolated system be in a non-equilibrium state. We can decompose it into subsystems which are in equilibrium within themselves and apply the inequality (3.6.5). Then we find for the change ∆S in the total entropy

\[ \Delta S > 0 . \tag{3.6.6} \]

This inequality expresses the fact that the entropy of an isolated system can only increase and is also termed the Clausius principle.

(B) We consider two systems 1 and 2 which are in equilibrium within themselves but are not in equilibrium with each other. Let their entropy changes be denoted by $\Delta S_1$ and $\Delta S_2$. From the inequality (3.6.5), it follows that

\[ \Delta S_1 + \Delta S_2 > 0 . \tag{3.6.7} \]

We now assume that system 2 is a heat bath, which is large compared to system 1 and which remains at the temperature T throughout the process. The quantity of heat transferred to system 1 is denoted by $\Delta Q_1$. For system 2, the process occurs quasistatically, so that its entropy change $\Delta S_2$ is related to the heat transferred, $-\Delta Q_1$, by

\[ \Delta S_2 = -\frac{1}{T}\,\Delta Q_1 . \]

Inserting this into Eq. (3.6.7), we find

\[ \Delta S_1 > \frac{1}{T}\,\Delta Q_1 . \tag{3.6.8} \]

In all the preceding relations, the quantities ∆S and ∆Q are by no means required to be small, but instead represent simply the change in the entropy and the quantity of heat transferred. In the preceding discussion, we have considered an initial state and, as the final state, a state of overall equilibrium. In fact, these inequalities hold also for portions of the relaxation process. Each intermediate step can be represented in terms of equilibrium states with constraints, whereby the limitations imposed by the constraints decrease in the course of time. At the same time, the entropy increases. Thus, for each infinitesimal step in time, the change in entropy of the isolated overall system is given by

\[ dS \ge 0 . \tag{3.6.6'} \]

For the physical situation described under B, we have

\[ dS_1 \ge \frac{1}{T}\,\delta Q_1 . \tag{3.6.8'} \]


We now summarize the content of the First and Second Laws.

The First Law:

\[ dE = \delta Q + \delta W + \delta E_N \tag{3.6.9} \]

Change of energy = heat transferred + work performed + energy change due to transfer of matter; E is a state function.

The Second Law:

\[ \delta Q \le T\,dS \tag{3.6.10} \]

and S is a state function.
a) For reversible changes: δQ = T dS.
b) For irreversible changes: δQ < T dS.

Notes:
(i) The equals sign in Eq. (3.6.10) holds also for irreversible quasistatic processes in those subregions which are in equilibrium in each step of the process (see Sect. 3.6.3.1).
(ii) In (3.6.10), we have combined (3.6.6′) and (3.6.8′). The situation of the isolated system (3.6.6) is included in (3.6.10), since in this case δQ = 0 (see the example 3.6.3.1).
(iii) In many processes, the particle number remains constant (dN = 0). Therefore, we often employ (3.6.9) considering only δQ and δW, without mentioning this expressly each time.

We now wish to apply the Second Law to a process which leads from a state A to a state B as indicated in Fig. 3.10. If we integrate (3.6.10), we obtain

\[ \int_A^B dS \ge \int_A^B \frac{\delta Q}{T} \]

and from this,

\[ S_B - S_A \ge \int_A^B \frac{\delta Q}{T} . \tag{3.6.11} \]

For reversible processes, the equals sign holds; for irreversible ones, the inequality. In a reversible process, the state of the system can be completely characterized at each moment in time by a point in the P –V -diagram. In an irreversible process leading from one equilibrium state (possibly with constraints) A to another equilibrium state B, this is not in general the case. This is indicated by the dashed line in Fig. 3.10.


Fig. 3.10. The path of a process connecting two thermodynamic states A and B


Fig. 3.11. A cyclic process, represented by a closed curve in the P − V diagram, which leads back to the starting point (B = A), whereby at least to some extent irreversible changes of state occur.

We consider the following special cases:

(i) An adiabatic process: For an adiabatic process (δQ = 0), it follows from (3.6.11) that

\[ S_B \ge S_A \quad\text{or}\quad \Delta S \ge 0 . \tag{3.6.11'} \]

The entropy of a thermally isolated system cannot decrease. This statement is more general than Eq. (3.6.6), where completely isolated systems were assumed.

(ii) Cyclic processes: For a cyclic process, the final state is identical with the initial state, B = A (Fig. 3.11). Then we have $S_B = S_A$, and it follows from Eq. (3.6.11) for a cyclic process that the inequality

\[ 0 \ge \oint \frac{\delta Q}{T} \tag{3.6.12} \]

holds, where the line integral is calculated along the closed curve of Fig. 3.11, corresponding to the actual direction of the process.

∗3.6.2 Historical Formulations of the Laws of Thermodynamics and other Remarks

The First Law

There exists no perpetual motion machine of the first kind. (A perpetual motion machine of the first kind refers to a machine which operates periodically and functions only as a source of energy.) Energy is conserved and heat is only a particular form of energy, or more precisely, energy transfer. The recognition of the fact that heat is only a form of energy and not a unique material which can penetrate all material bodies was the accomplishment of Julius Robert Mayer (a physician, 1814–1878) in 1842.


James Prescott Joule (a brewer of beer) carried out experiments in the years 1843–1849 which demonstrated the equivalence of heat energy and the energy of work:

\[ 1\ \text{cal} = 4.1840 \times 10^{7}\ \text{erg} = 4.1840\ \text{Joule} . \]

The First Law was mathematically formulated by Clausius:

\[ \delta Q = dE + P\,dV . \]

The historical formulation quoted above follows from the First Law, which contains the conservation of energy and the statement that E is a state variable. Thus, if a machine has returned to its initial state, its energy must be the same as before and it can therefore not have given up any energy to its environment.

Second Law

Rudolf Clausius (1822–1888) in 1850: Heat can never pass on its own from a cooler reservoir to a warmer one.

William Thomson (Lord Kelvin, 1824–1907) in 1851: The impossibility of a perpetual motion machine of the second kind. (A perpetual motion machine of the second kind refers to a periodically operating machine, which only extracts heat from a single reservoir and performs work.)

These formulations are equivalent to one another and to the mathematical formulation.

Equivalent formulations of the Second Law. The existence of a perpetual motion machine of the second kind could be used to remove heat from a reservoir at the temperature $T_1$. The resulting work could then be used to heat a second reservoir at the higher temperature $T_2$. Overall, heat would then have passed from the cooler to the warmer reservoir without any other change. The correctness of Clausius' statement thus implies the correctness of Kelvin's statement.

If heat could flow from a colder bath to a warmer one, then one could use this heat in a Carnot cycle (see Sect. 3.7.2) to perform work, whereby part of the heat would once again be taken up by the cooler bath. In this overall process, only heat would be extracted from the cooler bath and work would be performed. One would thus have a perpetual motion machine of the second kind. The correctness of Kelvin's statement thus implies the correctness of Clausius' statement.

The two verbal formulations of the Second Law, that of Clausius and that of Kelvin, are thus equivalent. It remains to be demonstrated that Clausius' statement is equivalent to the differential form of the Second Law (Eq. 3.6.10). To this end, we note that it will be shown in Sect. 3.6.3.1 from (3.6.10) that heat passes from a warmer reservoir to a cooler one. Clausius' statement thus follows from (3.6.10). Now we must only demonstrate that the relation (3.6.10) follows from Clausius' statement. This can be seen as follows: if instead of (3.6.10), conversely T dS < δQ were to hold, then it would follow from the consideration of the quasistatic temperature equilibration that heat would be transported from a cooler to a warmer bath, i.e. that Clausius' statement is false. The correctness of Clausius' statement thus implies the correctness of the mathematical formulation of the Second Law (3.6.10).


All the formulations of the Second Law are equivalent. We have included these historical considerations here because precisely their verbal formulations show the connection to everyday consequences of the Second Law and because this type of reasoning is typical of thermodynamics.

The Zeroth Law

When two systems are in thermal equilibrium with a third system, then they are in equilibrium with one another.

Proof within statistical mechanics: Consider systems 1, 2, and 3. Equilibrium of 1 with 3 implies that $T_1 = T_3$, and that of 2 with 3 that $T_2 = T_3$; it follows from this that $T_1 = T_2$, i.e. 1 and 2 are also in equilibrium with one another. The considerations for the pressure and the chemical potential are exactly analogous.

This fact is of course very important in practice, since it makes it possible to determine with the aid of thermometers and manometers whether two bodies are at the same temperature and pressure and will remain in equilibrium or not if they are brought into contact.

The Third Law

The Third Law (also called Nernst's theorem) makes statements about the temperature dependence of thermodynamic quantities in the limit T → 0; it is discussed in the Appendix A.1. Its consequences are not as far-reaching as those of the First and Second Laws. The vanishing of specific heats as T → 0 is a direct result of quantum mechanics. In this sense, its postulation in the era of classical physics can be regarded as visionary.

3.6.3 Examples and Supplements to the Second Law

We now give a series of examples which clarify the preceding concepts and general results, and which also have practical significance.

3.6.3.1 Quasistatic Temperature Equilibration

We consider two bodies at the temperatures $T_1$ and $T_2$ and with entropies $S_1$ and $S_2$. These two bodies are connected by a poor thermal conductor and are insulated from their environment (Fig. 3.12). The two temperatures are different: $T_1 \ne T_2$; thus, the two bodies are not in equilibrium with each other. Since the thermal conductor has a poor conductivity, all energy transfers occur slowly and each subsystem is in thermal equilibrium at each moment in time. Therefore, for a heat input δQ to body 1 and thus the equal but opposite heat input −δQ to body 2, the Second Law applies to both subsystems in the form

\[ dS_1 = \frac{\delta Q}{T_1} , \qquad dS_2 = -\frac{\delta Q}{T_2} . \tag{3.6.13} \]

Fig. 3.12. Quasistatic temperature equilibration of two bodies connected by a poor conductor of heat

For the overall system, we have

\[ dS_1 + dS_2 > 0 , \tag{3.6.14} \]

since the total entropy increases during the transition to the equilibrium state. If we insert (3.6.13) into (3.6.14), we obtain

\[ \delta Q \left( \frac{1}{T_1} - \frac{1}{T_2} \right) > 0 . \tag{3.6.15} \]

We take $T_2 > T_1$; then it follows from (3.6.15) that δQ > 0, i.e. heat is transferred from the warmer to the cooler container. We consider here the differential substeps, since the temperatures change in the course of the process. The transfer of heat continues until the two temperatures have equalized; the total amount of heat transferred from 2 to 1, $\int \delta Q$, is positive.

Also in the case of a non-quasistatic temperature equilibration, heat is transferred from the warmer to the cooler body: if the two bodies mentioned above are brought into contact (again, of course, isolated from their environment, but without the barrier of a poor heat conductor), the final state is the same as in the case of the quasistatic process. Thus also in the non-quasistatic temperature equilibration, heat has passed from the warmer to the cooler body.

3.6.3.2 The Joule–Thomson Process

The Joule–Thomson process consists of the controlled expansion of a gas (cf. Fig. 3.13). Here, the stream of expanding gas is limited by a throttle valve. The gas volume is bounded to the left and the right of the throttle by the two sliding pistons $S_1$ and $S_2$, which produce the pressures $P_1$ and $P_2$ in the left and right chambers, with $P_1 > P_2$. The process is assumed to occur adiabatically, i.e. δQ = 0 during the entire process.

Fig. 3.13. A Joule–Thomson process, showing the sliding pistons S1 and S2 and the throttle valve T

In the initial state (1), the gas in the left-hand chamber has the volume $V_1$ and the energy $E_1$. In the final state, the gas is entirely in the right-hand chamber and has a volume $V_2$ and energy $E_2$. The left piston performs work on the gas, while the gas performs work on the right piston and thus on the environment. The difference of the internal energies is equal to the total work performed on the system:

\[ E_2 - E_1 = \int_1^2 dE = \int_1^2 \delta W = \int_{V_1}^{0} dV_1\,(-P_1) + \int_0^{V_2} dV_2\,(-P_2) = P_1 V_1 - P_2 V_2 . \]

From this it follows that the enthalpy remains constant in the course of this process:

\[ H_2 = H_1 , \tag{3.6.16} \]

where the definition $H_i = E_i + P_i V_i$ was used. For cryogenic engineering it is important to know whether the gas is cooled by the controlled expansion. This is determined by the Joule–Thomson coefficient:

\[ \left( \frac{\partial T}{\partial P} \right)_H = -\frac{\left( \frac{\partial H}{\partial P} \right)_T}{\left( \frac{\partial H}{\partial T} \right)_P} = -\frac{T \left( \frac{\partial S}{\partial P} \right)_T + V}{T \left( \frac{\partial S}{\partial T} \right)_P} = \frac{T \left( \frac{\partial V}{\partial T} \right)_P - V}{C_P} . \]

In the rearrangement, we have used (3.2.21), dH = T dS + V dP, and the Maxwell relation (3.2.10). Inserting the thermal expansion coefficient α, we find the following expression for the Joule–Thomson coefficient:

\[ \left( \frac{\partial T}{\partial P} \right)_H = \frac{V}{C_P}\,(T\alpha - 1) . \tag{3.6.17} \]

For an ideal gas, α = 1/T; in this case, there is no change in the temperature on expansion. For a real gas, either cooling or warming can occur. When α > 1/T, the expansion leads to a cooling of the gas (positive Joule–Thomson effect). When α < 1/T, then the expansion gives rise to a warming (negative Joule–Thomson effect). The limit between these two effects is defined by the inversion curve, which is given by

\[ \alpha = \frac{1}{T} . \tag{3.6.18} \]


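We shall now calculate the inversion curve for a van der Waals gas, beginning with the van der Waals equation of state (Chap. 5)

\[ P = \frac{kT}{v - b} - \frac{a}{v^2} , \qquad v = \frac{V}{N} . \tag{3.6.19} \]

We differentiate the equation of state with respect to temperature at constant pressure

\[ 0 = \frac{k}{v - b} - \frac{kT}{(v - b)^2} \left( \frac{\partial v}{\partial T} \right)_P + \frac{2a}{v^3} \left( \frac{\partial v}{\partial T} \right)_P . \]

In this expression, we insert the condition (3.6.18)

\[ \alpha \equiv \frac{1}{v} \left( \frac{\partial v}{\partial T} \right)_P = \frac{1}{T} \]

for $\left( \partial v / \partial T \right)_P$ and thereby obtain $0 = k - \frac{kv}{v - b} + \frac{2a}{v^3}\,\frac{v}{T}\,(v - b)$. Using the van der Waals equation again, we finally find for the inversion curve

\[ 0 = -\frac{b}{v} \left( P + \frac{a}{v^2} \right) + \frac{2a}{v^3}\,(v - b) , \]

that is

\[ P = \frac{2a}{bv} - \frac{3a}{v^2} . \tag{3.6.20} \]

In the limit of low density, we can neglect the second term in (3.6.20) and the inversion curve is then given by

\[ P = \frac{2a}{bv} = \frac{kT_{\mathrm{inv}}}{v} , \qquad T_{\mathrm{inv}} = \frac{2a}{bk} = 6.75\,T_c . \tag{3.6.21} \]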

Here, $T_c$ is the critical temperature which follows from the van der Waals equation (5.4.13). For temperatures which are higher than the inversion temperature $T_{\mathrm{inv}}$, the Joule–Thomson effect is always negative. The inversion temperature and other data for some gases are listed in Table I.4 in the Appendix. The change in entropy in the Joule–Thomson process is determined by

\[ \left( \frac{\partial S}{\partial P} \right)_H = -\frac{V}{T} , \tag{3.6.22} \]

as can be seen using dH = T dS + V dP = 0. Since the pressure decreases, we obtain for the entropy change dS > 0, although δQ = 0. The Joule–Thomson process is irreversible, since its initial state with differing pressures in the two chambers is clearly not an equilibrium state. The complete inversion curve from the van der Waals theory is shown in Fig. 3.14a,b. Within the inversion curve, the expansion leads to cooling of the gas.
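The inversion curve of the van der Waals gas can be explored numerically. The sketch below works in reduced units a = b = k = 1, which is an assumption made only for convenience, and all function names are hypothetical. It uses the standard van der Waals critical temperature $T_c = 8a/(27bk)$, which the text cites via (5.4.13) without writing it out.

```python
def inversion_pressure(v, a, b):
    """Inversion curve, Eq. (3.6.20): P = 2a/(b v) - 3a/v^2."""
    return 2.0 * a / (b * v) - 3.0 * a / v**2

def inversion_temperature(v, a, b, k=1.0):
    """Temperature on the inversion curve, from Eq. (3.6.19) with P from (3.6.20)."""
    p = inversion_pressure(v, a, b)
    return (p + a / v**2) * (v - b) / k

def critical_temperature(a, b, k=1.0):
    """Standard van der Waals result T_c = 8a/(27 b k)."""
    return 8.0 * a / (27.0 * b * k)

def t_alpha_minus_one(t, v, a, b, k=1.0):
    """T*alpha - 1, with (dv/dT)_P from differentiating Eq. (3.6.19) at constant P."""
    dv_dt = (k / (v - b)) / (k * t / (v - b) ** 2 - 2.0 * a / v**3)
    return t * dv_dt / v - 1.0

# Low-density limit: T_inv approaches 2a/(b k) = 6.75 T_c, Eq. (3.6.21).
ratio = inversion_temperature(1.0e6, 1.0, 1.0) / critical_temperature(1.0, 1.0)
print(ratio)
# Sign of the Joule-Thomson coefficient at v = 5 on, below and above the curve.
t_on_curve = inversion_temperature(5.0, 1.0, 1.0)
print(t_alpha_minus_one(t_on_curve, 5.0, 1.0, 1.0))
print(t_alpha_minus_one(1.0, 5.0, 1.0, 1.0), t_alpha_minus_one(2.0, 5.0, 1.0, 1.0))
```

According to (3.6.17), a positive value of Tα − 1 means cooling on expansion and a negative value means warming; on the inversion curve the expression vanishes.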

Fig. 3.14. The inversion curve for the Joule–Thomson effect. (a) The inversion curve for the Joule–Thomson effect (upper solid curve). The isotherm is for T = 6.75 Tc (dot-dashed curve). The shaded region is excluded, since in this region, the vapor and liquid phases are always both present. (b) The inversion curve in the P-T diagram.

3.6.3.3 Temperature Equilibration of Ideal Gases

We will now investigate the thermal equilibration of two monatomic ideal gases (a and b). Suppose the two gases to be separated by a sliding piston and insulated from their environment (Fig. 3.15).

Fig. 3.15. The thermal equilibration of two ideal gases

The pressure of the two gases is taken to be equal, $P_a = P_b = P$, while their temperatures are different in the initial state, $T_a \ne T_b$. Their volumes and particle numbers are given by $V_a$, $V_b$ and $N_a$, $N_b$, so that the total volume and total particle number are $V = V_a + V_b$ and $N = N_a + N_b$. The entropy of the initial state is given by

\[ S = S_a + S_b = k \left\{ N_a \left[ \frac{5}{2} + \log\frac{V_a}{N_a \lambda_a^3} \right] + N_b \left[ \frac{5}{2} + \log\frac{V_b}{N_b \lambda_b^3} \right] \right\} . \tag{3.6.23} \]


The temperature after the establishment of equilibrium, when the temperatures of the two systems must approach the same value according to Chap. 2, will be denoted by T. Owing to the conservation of energy, we have $\frac{3}{2} NkT = \frac{3}{2} N_a kT_a + \frac{3}{2} N_b kT_b$, from which it follows that

\[ T = \frac{N_a T_a + N_b T_b}{N_a + N_b} = c_a T_a + c_b T_b , \tag{3.6.24} \]

where we have introduced the ratio of the particle numbers, $c_{a,b} = N_{a,b}/N$. We recall the definition of the thermal wavelengths

\[ \lambda_{a,b} = \frac{h}{\sqrt{2\pi m_{a,b} kT_{a,b}}} , \qquad \lambda'_{a,b} = \frac{h}{\sqrt{2\pi m_{a,b} kT}} . \]

The entropy after the establishment of equilibrium is

\[ S' = kN_a \left[ \frac{5}{2} + \log\frac{V'_a}{N_a \lambda_a'^{3}} \right] + kN_b \left[ \frac{5}{2} + \log\frac{V'_b}{N_b \lambda_b'^{3}} \right] , \]

so that for the entropy increase, we find

\[ S' - S = kN_a \log\frac{V'_a \lambda_a^3}{V_a \lambda_a'^{3}} + kN_b \log\frac{V'_b \lambda_b^3}{V_b \lambda_b'^{3}} . \tag{3.6.25} \]

We shall also show that the pressure remains unchanged. To this end, we add the two equations of state of the subsystems before the establishment of thermal equilibrium

\[ V_a P = N_a kT_a , \qquad V_b P = N_b kT_b \tag{3.6.26a} \]

and obtain using (3.6.24) the expression

\[ (V_a + V_b) P = (N_a + N_b) kT . \tag{3.6.26b} \]

From the equations of state of the two subsystems after the establishment of equilibrium

\[ V'_{a,b} P' = N_{a,b} kT \tag{3.6.26a'} \]

with $V'_a + V'_b = V$, it follows that

\[ V P' = (N_a + N_b) kT , \tag{3.6.26b'} \]

i.e. P′ = P. Incidentally, in (3.6.24) and (3.6.26b′), the fact is used that the two monatomic gases have the same specific heat. Comparing (3.6.26b) and (3.6.26b′), we find the volume ratios

\[ \frac{V'_{a,b}}{V_{a,b}} = \frac{T}{T_{a,b}} . \]


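From this we obtain

\[ S' - S = \frac{5}{2}\,k \log\frac{T^{N_a + N_b}}{T_a^{N_a} T_b^{N_b}} , \]

which finally yields

\[ S' - S = \frac{5}{2}\,kN \log\frac{T}{T_a^{c_a} T_b^{c_b}} = \frac{5}{2}\,kN \log\frac{c_a T_a + c_b T_b}{T_a^{c_a} T_b^{c_b}} . \tag{3.6.27} \]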

Due to the convexity of the exponential function, we have

\[ T_a^{c_a} T_b^{c_b} = \exp(c_a \log T_a + c_b \log T_b) \le c_a \exp\log T_a + c_b \exp\log T_b = c_a T_a + c_b T_b = T , \]

and thus it follows from (3.6.27) that S′ − S ≥ 0, i.e. the entropy increases on thermal equilibration.

Note: Following the equalization of temperatures, in which heat flows from the warmer to the cooler parts of the system, the volumes are given by:

\[ V'_a = \frac{N_a}{N_a + N_b}\,V , \qquad V'_b = \frac{N_b}{N_a + N_b}\,V . \]

Together with Eq. (3.6.26b), this gives $V'_a/V_a = T/T_a$ and $V'_b/V_b = T/T_b$. The energy which is put into subsystem a is $\Delta E_a = \frac{3}{2} N_a k (T - T_a)$. The enthalpy increase in subsystem a is given by $\Delta H_a = \frac{5}{2} N_a k (T - T_a)$. Since the process is isobaric, we have $\Delta Q_a = \Delta H_a$. The work performed on subsystem a is therefore equal to

\[ \Delta W_a = \Delta E_a - \Delta Q_a = -N_a k (T - T_a) . \]

The warmer subsystem gives up heat. Since it would then be too rarefied for the pressure P, it will be compressed, i.e. it takes on energy through the work performed in this compression.
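The final temperature (3.6.24) and the entropy increase (3.6.27) are easy to evaluate for concrete numbers. The following sketch uses units with k = 1; the particle numbers and temperatures are illustrative assumptions, and the second function simply evaluates (3.6.25) term by term as a cross-check.

```python
import math

def equilibrate(n_a, t_a, n_b, t_b, k=1.0):
    """Return the final temperature, Eq. (3.6.24), and S' - S, Eq. (3.6.27)."""
    n = n_a + n_b
    c_a, c_b = n_a / n, n_b / n
    t_final = c_a * t_a + c_b * t_b
    delta_s = 2.5 * k * n * math.log(t_final / (t_a**c_a * t_b**c_b))
    return t_final, delta_s

def entropy_increase_termwise(n_a, t_a, n_b, t_b, k=1.0):
    """Eq. (3.6.25) with V'/V = T/T_a,b and (lambda/lambda')^3 = (T/T_a,b)^(3/2)."""
    t_final, _ = equilibrate(n_a, t_a, n_b, t_b, k)
    return 2.5 * k * (n_a * math.log(t_final / t_a) + n_b * math.log(t_final / t_b))

# Illustrative values (assumptions): N_a : N_b = 1 : 3, T_a = 300, T_b = 400.
t_mix, ds_mix = equilibrate(1.0, 300.0, 3.0, 400.0)
ds_check = entropy_increase_termwise(1.0, 300.0, 3.0, 400.0)
print(t_mix, ds_mix, ds_check)
```

Both expressions give the same positive entropy increase, and the increase vanishes when the two initial temperatures coincide.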

3.6.3.4 Entropy of Mixing

We now consider the process of mixing of two different ideal gases with the masses $m_a$ and $m_b$. The temperatures and pressures of the gases are taken to be the same,

\[ T_a = T_b = T , \qquad P_a = P_b = P . \]

From the equations of state,

\[ V_a P = N_a kT , \qquad V_b P = N_b kT \]

it follows that

\[ \frac{N_a}{V_a} = \frac{N_b}{V_b} = \frac{N_a + N_b}{V_a + V_b} . \]

Fig. 3.16. The mixing of two gases

Using the thermal wavelength $\lambda_{a,b} = h/\sqrt{2\pi m_{a,b} kT}$, the entropy when the gases are separated by a partition is given by

\[ S = S_a + S_b = k \left\{ N_a \left[ \frac{5}{2} + \log\frac{V_a}{N_a \lambda_a^3} \right] + N_b \left[ \frac{5}{2} + \log\frac{V_b}{N_b \lambda_b^3} \right] \right\} . \tag{3.6.28} \]

After removal of the partition and mixing of the gases, the value of the entropy is

\[ S' = k \left\{ N_a \left[ \frac{5}{2} + \log\frac{V_a + V_b}{N_a \lambda_a^3} \right] + N_b \left[ \frac{5}{2} + \log\frac{V_a + V_b}{N_b \lambda_b^3} \right] \right\} . \tag{3.6.29} \]

From Eqns. (3.6.28) and (3.6.29), we obtain the difference in the entropies:

\[ S' - S = k \log\frac{(N_a + N_b)^{N_a + N_b}}{N_a^{N_a} N_b^{N_b}} = k (N_a + N_b) \log\frac{1}{c_a^{c_a} c_b^{c_b}} > 0 , \]

where we have used the relative particle numbers $c_{a,b} = N_{a,b}/(N_a + N_b)$. Since the argument of the logarithm is greater than 1, we find that the entropy of mixing is positive, e.g.

\[ N_a = N_b , \qquad S' - S = 2kN_a \log 2 . \]

The entropy of mixing always occurs when different gases interdiffuse, even when they consist of different isotopes of the same element. When, in contrast, the gases a and b are identical, the value of the entropy on removing the partition is

\[ S'_{\mathrm{id}} = k (N_a + N_b) \left[ \frac{5}{2} + \log\frac{V_a + V_b}{(N_a + N_b) \lambda^3} \right] \tag{3.6.29'} \]

and $\lambda = \lambda_a = \lambda_b$. We then have

\[ S'_{\mathrm{id}} - S = k \log\frac{(V_a + V_b)^{N_a + N_b}\, N_a^{N_a} N_b^{N_b}}{(N_a + N_b)^{N_a + N_b}\, V_a^{N_a} V_b^{N_b}} = 0 \]

making use of the equation of state; therefore, no entropy of mixing occurs. This is due to the factor 1/N! in the basic phase-space volume element in Eqns. (2.2.2) and (2.2.3), which results from the indistinguishability of the particles. Without this factor, Gibbs' paradox would occur, i.e. we would find a positive entropy of mixing for identical gases, as mentioned following Eq. (2.2.3).
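The difference between mixing distinct and identical gases can be made concrete with a small numerical check. The sketch below uses units with k = 1; the particle numbers, volumes and thermal wavelengths are arbitrary illustrative assumptions, chosen so that both gases have the same density as required by equal T and P.

```python
import math

def ideal_gas_entropy(n, v, lam, k=1.0):
    """Entropy in the form used in (3.6.28): k N [5/2 + log(V / (N lambda^3))]."""
    return k * n * (2.5 + math.log(v / (n * lam**3)))

def mixing_entropy(n_a, n_b, k=1.0):
    """S' - S for two different ideal gases at equal temperature and pressure."""
    n = n_a + n_b
    return k * (n_a * math.log(n / n_a) + n_b * math.log(n / n_b))

# Illustrative values (assumptions): equal densities N_a/V_a = N_b/V_b.
n_a, n_b, v_a, v_b = 2.0, 3.0, 2.0, 3.0
lam_a, lam_b = 0.10, 0.07

s_separate = ideal_gas_entropy(n_a, v_a, lam_a) + ideal_gas_entropy(n_b, v_b, lam_b)
s_mixed = ideal_gas_entropy(n_a, v_a + v_b, lam_a) + ideal_gas_entropy(n_b, v_a + v_b, lam_b)
print(s_mixed - s_separate, mixing_entropy(n_a, n_b))

# Identical gases: a single wavelength, and Eq. (3.6.29') after removing the partition.
s_sep_id = ideal_gas_entropy(n_a, v_a, lam_a) + ideal_gas_entropy(n_b, v_b, lam_a)
s_id = ideal_gas_entropy(n_a + n_b, v_a + v_b, lam_a)
print(s_id - s_sep_id)
```

For distinct gases the entropy difference agrees with the closed formula and is positive; for identical gases it vanishes up to rounding error.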

∗3.6.3.5 Heating a Room

Finally, we consider an example, based on one given by Sommerfeld.7 A room is to be heated from 0 °C to 20 °C. What quantity of heat is required? How does the energy content of the room change in the process? If air can leave the room through leaks around the windows, for example, then the process is isobaric, but the number of air molecules in the room will decrease in the course of the heating process. The quantity of heat required depends on the increase in temperature through the relation

\[ \delta Q = C_P\,dT , \tag{3.6.30} \]

where $C_P$ is the heat capacity at constant pressure. In the temperature range that we are considering, the rotational degrees of freedom of oxygen, O2, and nitrogen, N2, are excited (see Chap. 5), so that under the assumption that air is an ideal gas, we have

\[ C_P = \frac{7}{2}\,Nk , \tag{3.6.31} \]

where N is the overall number of particles. The total amount of heat required is found by integrating (3.6.30) between the initial and final temperatures, $T_1$ and $T_2$:

\[ Q = \int_{T_1}^{T_2} dT\,C_P . \tag{3.6.32} \]

If we initially neglect the temperature dependence of the particle number, and thus of the heat capacity (3.6.31), we find

\[ Q = C_P (T_2 - T_1) = \frac{7}{2}\,N_1 k (T_2 - T_1) . \tag{3.6.32'} \]

7 A. Sommerfeld, Thermodynamics and Statistical Mechanics: Lectures on Theoretical Physics, Vol. V (Academic Press, New York, 1956)


Here, we have denoted the particle number at $T_1$ as $N_1$ and taken it to be constant. Equation (3.6.32′) will be a good approximation, as long as $T_2 \approx T_1$. If we wish to take into account the variation of the particle number within the room (volume V), we have to replace N in Eq. (3.6.31) by N from the equation of state, N = PV/kT, and it follows that

\[ Q = \int_{T_1}^{T_2} dT\,\frac{7}{2}\,\frac{PV}{T} = \frac{7}{2}\,PV \log\frac{T_2}{T_1} = \frac{7}{2}\,N_1 kT_1 \log\frac{T_2}{T_1} . \tag{3.6.33} \]

With $\log\frac{T_2}{T_1} = \frac{T_2}{T_1} - 1 + O\!\left(\left(\frac{T_2}{T_1} - 1\right)^2\right)$, we obtain from (3.6.33) for small temperature differences the approximate formula (3.6.32'):

$$Q = \frac{7}{2} P V \frac{T_2 - T_1}{T_1} = 3.5 \times 10^{6}\, \frac{\mathrm{dyn}}{\mathrm{cm}^2} \times 10^{6}\, (V\,\mathrm{m}^3)\, \frac{20}{273} = \frac{3.5 \times 2}{2.73}\, 10^{11}\ \mathrm{erg}\, (V\,\mathrm{m}^3) = 6\ \mathrm{kcal}\, (V\,\mathrm{m}^3) ,$$

where $(V\,\mathrm{m}^3)$ stands for the volume of the room expressed in cubic meters.

It is instructive to compute the change in the energy content of the room on heating, taking into account the fact that the rotational degrees of freedom are fully excited, $T \gg \Theta_r$ (see Chap. 5). Then the internal energy before and after the heating procedure is

$$E_i = \frac{5}{2} N_i k T_i - \frac{1}{6} N_i k \Theta_r + N_i \varepsilon_{el} ,$$

$$E_2 - E_1 = \frac{5}{2} k (N_2 T_2 - N_1 T_1) - \frac{1}{6} P V \Theta_r \left( \frac{1}{T_2} - \frac{1}{T_1} \right) + P V \frac{\varepsilon_{el}}{k} \left( \frac{1}{T_2} - \frac{1}{T_1} \right) . \tag{3.6.34}$$

The first term is exactly zero, since $N_1 T_1 = N_2 T_2 = PV/k$ at fixed pressure and volume, and the second one is positive; the third, dominant term is negative. The internal energy of the room actually decreases upon heating. The heat input is given up to the outside world, in order to increase the temperature in the room and thus the average kinetic energy of the remaining gas molecules.

Heating with a fixed particle number (a hermetically sealed room) requires a quantity of heat $Q = C_V (T_2 - T_1) \equiv \frac{5}{2} N_1 k (T_2 - T_1)$. For small temperature differences $T_2 - T_1$, it is then more favorable first to heat the room to the final temperature $T_2$ and then to allow the pressure to decrease. The point of intersection of the two curves, $(V, N)$ constant and $P$ constant with $N$ variable (Fig. 3.17), at $T_2^0$ is determined by

$$\frac{T_2^0 - T_1}{T_1 \log\frac{T_2^0}{T_1}} = \frac{C_P}{C_V} .$$

A numerical estimate yields $T_2^0 = 1.9\, T_1$ for the point of intersection in Fig. 3.17, i.e. at $T_1 = 273$ K, $T_2^0 = 519$ K. For any process of space heating, isolated heating is more favorable. The difference in the quantities of heat required is

$$\Delta Q \approx (C_P - C_V)(T_2 - T_1) = \frac{1}{3.5}\, 6\ \mathrm{kcal}\, (V\,\mathrm{m}^3) = 1.7\ \mathrm{kcal}\, (V\,\mathrm{m}^3) .$$
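The numbers quoted in this example can be checked with a few lines of Python. The following is only a sketch under stated assumptions: the pressure is rounded to 10^5 Pa (the text uses 10^6 dyn/cm^2), the kilocalorie is taken as 4184 J, and the function names are invented for illustration. It evaluates (3.6.33), (3.6.32') and the sealed-room formula for 1 m^3 of air, and solves the intersection condition for $T_2^0/T_1$ by bisection.

```python
import math

KCAL = 4184.0  # joules per kilocalorie (assumed conversion factor)

def q_isobaric_open(p, v, t1, t2):
    """Eq. (3.6.33): isobaric heating while air escapes, Q = (7/2) P V ln(T2/T1)."""
    return 3.5 * p * v * math.log(t2 / t1)

def q_isobaric_fixed_n(p, v, t1, t2):
    """Eq. (3.6.32'): particle number frozen at N1, using N1 k = P V / T1."""
    return 3.5 * p * v * (t2 - t1) / t1

def q_sealed(p, v, t1, t2):
    """Sealed room: Q = C_V (T2 - T1) with C_V = (5/2) N1 k."""
    return 2.5 * p * v * (t2 - t1) / t1

def crossover_ratio(cp_over_cv=1.4, lo=1.001, hi=10.0):
    """Bisection for x = T2^0 / T1 solving (x - 1) / ln(x) = C_P / C_V."""
    f = lambda x: (x - 1.0) / math.log(x) - cp_over_cv
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

P, V, T1, T2 = 1.0e5, 1.0, 273.0, 293.0  # Pa, m^3, K, K
q_open = q_isobaric_open(P, V, T1, T2) / KCAL
q_fixed = q_isobaric_fixed_n(P, V, T1, T2) / KCAL
q_closed = q_sealed(P, V, T1, T2) / KCAL
x0 = crossover_ratio()

print(f"isobaric, N variable : {q_open:.2f} kcal per m^3")
print(f"isobaric, N = N1     : {q_fixed:.2f} kcal per m^3")
print(f"sealed room          : {q_closed:.2f} kcal per m^3")
print(f"T2^0 / T1            : {x0:.3f}")
```

With these inputs the linearized isobaric value comes out close to the 6 kcal per cubic meter quoted above, the sealed room needs roughly 1.7 kcal less, and the crossover ratio is close to 1.9.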

3.6 The First and Second Laws of Thermodynamics


Fig. 3.17. The quantity of heat required for space heating: as an isobaric process (solid curve), isochore (dashed curve), or isobaric neglecting the decrease in particle number (dot-dashed curve).

All of the above considerations have neglected the heat capacity of the walls. They are applicable to a rapid heating of the air. The change in pressure on heating a fixed amount of air by 20°C is, however,

$$\frac{\delta P}{P} = \frac{\delta T}{T} \sim \frac{20}{273} \sim 0.07 , \quad \text{i.e.} \quad \delta P \sim 0.07\ \mathrm{bar} \sim 0.07\ \mathrm{kg/cm^2} \sim 700\ \mathrm{kg/m^2}\,!$$

∗3.6.3.6 The Irreversible, Quasistatic Gay-Lussac Experiment

We recall the diﬀerent versions of the Gay-Lussac experiment. In the irreversible form, we have ∆Q = 0 and ∆S > 0 (3.5.1). In the reversible case (isothermal or adiabatic), using (3.5.9) and (3.5.14), the corresponding relation for reversible processes is fulﬁlled. It is instructive to carry out the Gay-Lussac experiment in a quasistatic, irreversible fashion. One can imagine that the expansion does not take place suddenly, but instead is slowed by friction of the piston to the point that the gas always remains in equilibrium. The frictional heat can then either be returned to the gas or given up to the environment. We begin by treating the ﬁrst possibility. Since the frictional heat from the piston is returned to the gas, there is no change in the environment after each step in the process. The ﬁnal result corresponds to the situation of the usual Gay-Lussac experiment. For the moment, we denote the gas by an index 1 and the piston, which initially takes up the frictional heat, by 2. Then the work which the gas performs on expansion by the volume change dV is given by δW1→2 = P dV . This quantity of energy is passed by the piston to 1: δQ2→1 = δW1→2 . The energy change of the gas is dE = δQ2→1 − δW1→2 = 0. Since the gas is always in equilibrium at each instant, the relation dE = T dS − P dV also holds and thus we have for the entropy increase of the gas: T dS = δQ2→1 > 0 . The overall system of gas + piston transfers no heat to the environment and also performs no work on the environment, i.e. δQ = 0 and δW = 0. Since the entropy


of the piston remains the same (for simplicity, we consider an ideal gas, whose temperature does not change), it follows that T dS > δQ. Now we consider the situation that the frictional heat is passed to the outside world. This means that δQ2→1 = 0 and thus T dS = 0, also dS = 0. The total amount of heat given oﬀ to the environment (heat loss δQL ) is δQL = δW1→2 > 0 . Here, again, the inequality −δQL < T dS is fulﬁlled, characteristic of the irreversible process. The ﬁnal state of the gas corresponds to that found for the reversible adiabatic process. There, we found ∆S = 0, Q = 0, and W > 0. Now, ∆S = 0, while QL > 0 and is equal to the W of the adiabatic, reversible process, from Eq. (3.5.12).
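The two variants of the quasistatic experiment can be compared numerically. The sketch below assumes an ideal gas with f quadratic degrees of freedom per particle (f = 3, a monatomic gas, is an assumption made only for this illustration) and uses dimensionless units; the function names are hypothetical. In the first variant the temperature stays constant and $T\,dS = P\,dV$ integrates to $\Delta S = Nk \log(V_2/V_1)$; in the second the gas follows the adiabat $T V^{2/f} = \text{const}$ and the heat loss equals the adiabatic work.

```python
import math

def heat_returned_to_gas(v1, v2):
    """Frictional heat returned to the gas: dE = 0, T constant, T dS = P dV.

    Returns the entropy increase of the gas in units of N k.
    """
    return math.log(v2 / v1)

def heat_given_to_environment(v1, v2, f=3):
    """Frictional heat passed outside: dS = 0 for the gas (adiabat).

    Returns (T2/T1, Q_L / (N k T1)), where Q_L = C_V (T1 - T2)
    and C_V = (f/2) N k for an ideal gas with f degrees of freedom.
    """
    t_ratio = (v1 / v2) ** (2.0 / f)
    q_loss = 0.5 * f * (1.0 - t_ratio)
    return t_ratio, q_loss

ds = heat_returned_to_gas(1.0, 2.0)
t_ratio, q_loss = heat_given_to_environment(1.0, 2.0)
print(f"variant 1: Delta S / (N k)  = {ds:.4f}")
print(f"variant 2: T2 / T1          = {t_ratio:.4f}")
print(f"variant 2: Q_L / (N k T1)   = {q_loss:.4f}")
```

For a doubling of the volume, the first variant reproduces the entropy increase of the ordinary Gay-Lussac experiment, while in the second the gas cools and its entropy is unchanged.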

3.6.4 Extremal Properties

In this section, we derive the extremal properties of the thermodynamic potentials. From these, we shall obtain the equilibrium conditions for multicomponent systems in various phases and then again the inequalities (3.3.5) and (3.3.6). We assume in this section that no particle exchange with the environment occurs, i.e. $dN_i = 0$, apart from chemical reactions within the system. Consider the system in general not yet to be in equilibrium; then, for example in an isolated system, the state is not characterized solely by $E$, $V$, and $N_i$, but instead we need additional quantities $x_\alpha$, which give e.g. the concentrations of the independent components in the different phases or the concentrations of the components between which chemical reactions occur. Another situation not in equilibrium is that of spatial inhomogeneities.8

We now however assume that equilibrium with respect to the temperature and pressure is present, i.e. that the system is characterized by uniform (but variable) $T$ and $P$ values. This assumption may be relaxed somewhat. For the following derivation, it suffices that the system be at the pressure $P$ at the stage when work is being performed by the pressure $P$, and likewise that it be at the temperature $T$ when it is exchanging heat with a reservoir at the temperature $T$. (This permits e.g. inhomogeneous temperature distributions during a chemical reaction in a subsystem.) Under these conditions, the First Law, Eq. (3.6.9), is given by $dE = \delta Q - P\,dV$.

8

As an example, one could imagine a piece of ice and a solution of salt in water at P = 1 atm and −5◦ C. Each component of this system is in equilibrium within itself. If one brings them into contact, then a certain amount of the ice will melt and some of the NaCl will diﬀuse into the ice until the concentrations are such that the ice and the solution are in equilibrium (see the section on eutectics). The initial state described here – a non-equilibrium state – is a typical example of an inhibited equilibrium. As long as barriers impede (inhibit) particle exchange, i.e. so long as only energy and volume changes are possible, this inhomogeneous state can be described in terms of equilibrium thermodynamics.


Our starting point is the Second Law, (3.6.10):

$$dS \ge \frac{\delta Q}{T} . \tag{3.6.35}$$

We insert the First Law into this equation and obtain

$$dS \ge \frac{1}{T} (dE + P\,dV) . \tag{3.6.36a}$$

We have used the principle of energy conservation from equilibrium thermodynamics here, which however also holds in non-equilibrium states. The change in the energy is equal to the heat transferred plus the work performed. The precondition is that during the process a particular, well-defined pressure is present. If $E$, $V$ are held constant, then according to Eq. (3.6.36a), we have

$$dS \ge 0 \quad \text{for } E, V \text{ fixed} ; \tag{3.6.36b}$$

that is, an isolated system tends towards a maximum of the entropy. When a non-equilibrium state is characterized by a parameter $x$, its entropy has the form indicated in Fig. 3.18. It is maximal for the equilibrium value $x_0$. The parameter $x$ could be e.g. the volume or the energy of a subsystem of the isolated system considered. One refers to a process or variation as virtual (that is, possible in principle) if it is permitted by the conditions of a system. An inhomogeneous distribution of the energies of the subsystems with constant total energy would, to be sure, not occur spontaneously, but it is possible. In equilibrium, the entropy is maximal with respect to all virtual processes. We now consider the free enthalpy or Gibbs’ free energy,

$$G = E - TS + PV , \tag{3.6.37}$$

which we define with Eq. (3.6.37) for non-equilibrium states just as for equilibrium states. For the changes in such states, we find from (3.6.36a) that the inequality

$$dG \le -S\,dT + V\,dP \tag{3.6.38a}$$

holds. For the case that $T$ and $P$ are held constant, it follows from (3.6.38a) that

$$dG \le 0 \quad \text{for } T \text{ and } P \text{ fixed,} \tag{3.6.38b}$$

i.e. the Gibbs’ free energy $G$ tends towards a minimum. In the neighborhood of the minimum (Fig. 3.19), we have for a virtual (in thought only) variation

$$\delta G = G(x_0 + \delta x) - G(x_0) = \frac{1}{2} G''(x_0) (\delta x)^2 . \tag{3.6.39}$$


Fig. 3.18. The entropy as a function of a parameter x, with the equilibrium value x0 .

Fig. 3.19. The free enthalpy as a function of a parameter.

The first-order terms vanish; therefore, to first order in $\delta x$ we find

$$\delta G = 0 \quad \text{for } T \text{ and } P \text{ fixed}.^{9} \tag{3.6.38c}$$

One terms this condition stationarity. Since $G$ is minimal at $x_0$, we find

$$G''(x_0) > 0 . \tag{3.6.40}$$

Analogously, one can show for the free energy (Helmholtz free energy) $F = E - TS$ and for the enthalpy $H = E + PV$ that

$$dF \le -S\,dT - P\,dV \tag{3.6.41a}$$

and

$$dH \le T\,dS + V\,dP . \tag{3.6.42a}$$

These potentials also tend towards minimum values at equilibrium under the condition that their natural variables are held constant:

$$dF \le 0 \quad \text{for } T \text{ and } V \text{ fixed} \tag{3.6.41b}$$

and

$$dH \le 0 \quad \text{for } S \text{ and } P \text{ fixed} . \tag{3.6.42b}$$

As conditions for equilibrium, it then follows that

$$\delta F = 0 \quad \text{for } T \text{ and } V \text{ fixed} \tag{3.6.41c}$$

and

$$\delta H = 0 \quad \text{for } S \text{ and } P \text{ fixed} . \tag{3.6.42c}$$

9

This condition plays an important role in physical chemistry, since in chemical processes, the pressure and the temperature are usually ﬁxed.


∗3.6.5 Thermodynamic Inequalities Derived from Maximization of the Entropy

We consider a system whose energy is $E$ and whose volume is $V$. We decompose this system into two equal parts and investigate a virtual change of the energy and the volume of subsystem 1 by $\delta E_1$ and $\delta V_1$. Correspondingly, the values for subsystem 2 change by $-\delta E_1$ and $-\delta V_1$. The overall entropy before the change is

$$S(E, V) = S_1\!\left(\frac{E}{2}, \frac{V}{2}\right) + S_2\!\left(\frac{E}{2}, \frac{V}{2}\right) . \tag{3.6.43}$$

Therefore, the change of the entropy is given by

$$\begin{aligned}
\delta S &= S_1\!\left(\frac{E}{2} + \delta E_1, \frac{V}{2} + \delta V_1\right) + S_2\!\left(\frac{E}{2} - \delta E_1, \frac{V}{2} - \delta V_1\right) - S(E, V) \\
&= \left(\frac{\partial S_1}{\partial E_1} - \frac{\partial S_2}{\partial E_2}\right) \delta E_1 + \left(\frac{\partial S_1}{\partial V_1} - \frac{\partial S_2}{\partial V_2}\right) \delta V_1 \\
&\quad + \frac{1}{2} \left(\frac{\partial^2 S_1}{\partial E_1^2} + \frac{\partial^2 S_2}{\partial E_2^2}\right) (\delta E_1)^2 + \frac{1}{2} \left(\frac{\partial^2 S_1}{\partial V_1^2} + \frac{\partial^2 S_2}{\partial V_2^2}\right) (\delta V_1)^2 \\
&\quad + \left(\frac{\partial^2 S_1}{\partial E_1 \partial V_1} + \frac{\partial^2 S_2}{\partial E_2 \partial V_2}\right) \delta E_1\, \delta V_1 + \ldots
\end{aligned} \tag{3.6.44}$$

From the stationarity of the entropy, $\delta S = 0$, it follows that the terms which are linear in $\delta E_1$ and $\delta V_1$ must vanish. This means that in equilibrium the temperature $T$ and the pressure $P$ of the subsystems must be equal:

$$T_1 = T_2 , \quad P_1 = P_2 ; \tag{3.6.45a}$$

this is a result that is already familiar to us from equilibrium statistics. If we permit also virtual variations of the particle numbers, $\delta N_1$ and $-\delta N_1$, in the subsystems 1 and 2, then an additional term enters the second line of (3.6.44): $\left(\frac{\partial S_1}{\partial N_1} - \frac{\partial S_2}{\partial N_2}\right) \delta N_1$; and one obtains as an additional condition for equilibrium the equality of the chemical potentials:

$$\mu_1 = \mu_2 . \tag{3.6.45b}$$

Here, the two subsystems could also consist of different phases (e.g. solid and liquid). We note that the second derivatives of $S_1$ and $S_2$ in (3.6.44) are both to be taken at the values $E/2$, $V/2$ and they are therefore equal. In the equilibrium state, the entropy is maximal, according to (3.6.36b). From this it follows that the coefficients of the quadratic form (3.6.44) obey the two conditions


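$$\frac{\partial^2 S_1}{\partial E_1^2} = \frac{\partial^2 S_2}{\partial E_2^2} \le 0 \tag{3.6.46a}$$

and

$$\frac{\partial^2 S_1}{\partial E_1^2} \frac{\partial^2 S_1}{\partial V_1^2} - \left(\frac{\partial^2 S_1}{\partial E_1 \partial V_1}\right)^2 \ge 0 . \tag{3.6.46b}$$

We now leave off the index 1 and rearrange the left side of the first condition:

$$\frac{\partial^2 S}{\partial E^2} = \left(\frac{\partial \frac{1}{T}}{\partial E}\right)_V = -\frac{1}{T^2 C_V} . \tag{3.6.47a}$$

The left side of the second condition, Eq. (3.6.46b), can be represented by a Jacobian, and after rearrangement,

$$\frac{\partial\left(\frac{\partial S}{\partial E}, \frac{\partial S}{\partial V}\right)}{\partial (E, V)} = \frac{\partial\left(\frac{1}{T}, \frac{P}{T}\right)}{\partial (E, V)} = \frac{\partial\left(\frac{1}{T}, \frac{P}{T}\right)}{\partial (T, V)} \frac{\partial (T, V)}{\partial (E, V)} = -\frac{1}{T^3} \left(\frac{\partial P}{\partial V}\right)_T \frac{1}{C_V} = \frac{1}{T^3 V \kappa_T C_V} . \tag{3.6.47b}$$

If we insert the expressions (3.6.47a,b) into the inequalities (3.6.46a) and (3.6.46b), we obtain

$$C_V \ge 0 , \quad \kappa_T \ge 0 , \tag{3.6.48a,b}$$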

which expresses the stability of the system. When heat is given up, the system becomes cooler. On compression, the pressure increases. Stability conditions of the type of (3.6.48a,b) are expressions of Le Chatelier’s principle: When a system is in a stable equilibrium state, every spontaneous change in its parameter leads to reactions which drive the system back towards equilibrium. The inequalities (3.6.48a,b) were already derived in Sect. 3.3 on the basis of the positivity of the mean square deviations of the particle number and the energy. The preceding derivation relates them within thermodynamics to the stationarity of the entropy. The inequality CV ≥ 0 guarantees thermal stability. If heat is transferred to part of a system, then its temperature increases and it releases heat to its surroundings, thus again decreasing its temperature. If its speciﬁc heat were negative, then the temperature of the subsystem would decrease on input of heat, and more heat would ﬂow in from its surroundings, leading to a further temperature decrease. The least input of heat would set oﬀ an instability. The inequality κT ≥ 0 guarantees mechanical stability. A small expansion of the volume of a region results in a decrease in its pressure, so that the surroundings, at higher pressure, compress the region again. If however κT < 0, then the pressure would increase in the region and the volume element would continue to expand.
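The conditions (3.6.46a,b) can be illustrated numerically for a concrete entropy function. The sketch below assumes the classical monatomic ideal gas, $S = Nk[\log V + \frac{3}{2} \log E] + \text{const}$ at fixed $N$ (this choice of system, the units with $k = 1$, and all function names are assumptions made for the illustration). It evaluates $\delta S$ of (3.6.44) for a few virtual variations and estimates the second derivatives by central differences.

```python
import math

def half_entropy(e, v, n=0.5):
    """Entropy (units of k, additive constant dropped) of one half-system
    of a monatomic ideal gas holding n particles, energy e, volume v."""
    return n * (math.log(v) + 1.5 * math.log(e))

def delta_s(de, dv, e_tot=2.0, v_tot=2.0):
    """Entropy change (3.6.44) when subsystem 1 gains (de, dv) from subsystem 2."""
    e, v = e_tot / 2.0, v_tot / 2.0
    return (half_entropy(e + de, v + dv) + half_entropy(e - de, v - dv)
            - 2.0 * half_entropy(e, v))

def second_derivatives(e, v, h=1e-3):
    """Central-difference estimates of S_EE, S_VV and S_EV for one half-system."""
    s = half_entropy
    s_ee = (s(e + h, v) - 2.0 * s(e, v) + s(e - h, v)) / h**2
    s_vv = (s(e, v + h) - 2.0 * s(e, v) + s(e, v - h)) / h**2
    s_ev = (s(e + h, v + h) - s(e + h, v - h)
            - s(e - h, v + h) + s(e - h, v - h)) / (4.0 * h**2)
    return s_ee, s_vv, s_ev

variations = [(0.1, 0.0), (0.0, 0.1), (0.1, -0.05), (-0.2, 0.3)]
changes = [delta_s(de, dv) for de, dv in variations]
s_ee, s_vv, s_ev = second_derivatives(1.0, 1.0)
determinant = s_ee * s_vv - s_ev**2

print("delta S for the virtual variations:", [round(c, 5) for c in changes])
print(f"S_EE = {s_ee:.4f}, S_VV = {s_vv:.4f}, S_EV = {s_ev:.4f}")
print(f"S_EE * S_VV - S_EV^2 = {determinant:.4f}")
```

Every virtual variation lowers the total entropy, the second derivative with respect to the energy is negative, and the determinant is positive, in line with $C_V \ge 0$ and $\kappa_T \ge 0$ for this model.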


3.7 Cyclic Processes

The analysis of cyclic processes played an important role in the historical development of thermodynamics and in the discovery of the Second Law of thermodynamics. Even today, their understanding is interesting in principle and in addition, it has eminent practical significance. Thermodynamics makes statements concerning the efficiency of cyclic processes (periodically repeating processes) of the most general kind, which are of importance both for heat engines and thus for the energy economy, as well as for the energy balance of biological systems.

3.7.1 General Considerations

In cyclic processes, the working substance, i.e. the system, returns at intervals to its initial state (after each cycle). For practical reasons, in the steam engine, and in the internal combustion engine, the working substance is replenished after each cycle. We assume that the process takes place quasistatically; thus, we can characterize the state of the system by two thermodynamic variables, e.g. $P$ and $V$ or $T$ and $S$. The process can be represented as a closed curve in the $P$-$V$ or the $T$-$S$ plane (Fig. 3.20).

Fig. 3.20. A cyclic process: (a) in the P -V diagram; (b) in the T -S diagram

The work which is performed during one cycle is given by the line integral along the closed curve

$$\mathcal{W} = -W = \oint P\,dV = A , \tag{3.7.1}$$

which is equal to the enclosed area $A$ within the curve representing the cyclic process in the $P$-$V$ diagram. Here $\mathcal{W}$ denotes the work performed by the system on its surroundings and $W$ the work performed on the system. The heat taken up during one cycle is given by

$$Q = \oint T\,dS = A . \tag{3.7.2}$$

Since the system returns to its initial state after a cycle, so that in particular the internal energy of the working substance is unchanged, it follows from the principle of conservation of energy that

$$Q = \mathcal{W} . \tag{3.7.3}$$

The heat taken up is equal to the work performed on the surroundings. The direction of the cyclic path and the area in the $P$-$V$ and $T$-$S$ diagrams are thus the same. When the cyclic process runs in a clockwise direction (right-handed process), then

$$Q = \mathcal{W} > 0 \tag{3.7.4a}$$

and one refers to a work engine. In the case that the process runs counterclockwise (left-handed process), we have $Q = \mathcal{W} < 0$ […] $T_2 > T_1$, and in between it is insulated. The motion of the piston is shown in Fig. 3.22.

Fig. 3.21. A Carnot cycle in (a) the P -V diagram and (b) the T -S diagram

Fig. 3.22. The sequence of the Carnot cycle


1. Isothermal expansion: the system is brought into contact with the warmer heat bath at the temperature $T_2$. The quantity of heat

$$Q_2 = T_2 (S_2 - S_1) \tag{3.7.5a}$$

is taken up from the bath, while at the same time, work is performed on the surroundings.

2. Adiabatic expansion: the system is thermally insulated. Through an adiabatic expansion, work is performed on the outer world and the working substance cools from $T_2$ to the temperature $T_1$.

3. Isothermal compression: the working substance is brought into thermal contact with the heat bath at temperature $T_1$ and through work performed on it by the surroundings, it is compressed. The quantity of heat “taken up” by the working substance

$$Q_1 = T_1 (S_1 - S_2) < 0 \tag{3.7.5b}$$

is negative. That is, the quantity $|Q_1|$ of heat is given up to the heat bath.

4. Adiabatic compression: employing work performed by the outside world, the now once again thermally insulated working substance is compressed and its temperature is thereby increased to $T_2$.

After each cycle, the internal energy remains the same; therefore, the total work $\mathcal{W}$ performed on the surroundings is equal to the quantity of heat taken up by the system, $Q = Q_1 + Q_2$; thus

$$\mathcal{W} = Q = (T_2 - T_1)(S_2 - S_1) . \tag{3.7.5c}$$

The thermal efficiency (= work performed/heat taken up from the warmer heat bath) is defined as

$$\eta = \frac{\mathcal{W}}{Q_2} . \tag{3.7.6a}$$

For the Carnot machine, we obtain

$$\eta_C = 1 - \frac{T_1}{T_2} , \tag{3.7.6b}$$

where the index C stands for Carnot. We see that $\eta_C < 1$. The general validity of (3.7.6b) cannot be too strongly emphasized; it holds for any kind of working substance. Later, we shall show that there is no cyclic process whose efficiency is greater than that of the Carnot cycle.

The Inverse Carnot Cycle

Now, we consider the inverse Carnot cycle, in which the direction of the operations is counter-clockwise (Fig. 3.23). In this case, for the quantities of heat taken up from baths 2 and 1, we find


Fig. 3.23. The inverse Carnot cycle

$$Q_2 = T_2 (S_1 - S_2) < 0 , \quad Q_1 = T_1 (S_2 - S_1) > 0 . \tag{3.7.7a,b}$$

The overall quantity of heat taken up by the system, $Q$, and the work performed on the system, $W$, are then given by

$$Q = (T_1 - T_2)(S_2 - S_1) = -W < 0 . \tag{3.7.8}$$

Work is performed by the outside world on the system. The warmer reservoir is heated further, and the cooler one is cooled. Depending on whether the purpose of the machine is to heat the warmer reservoir or to cool the colder one, one defines the heating efficiency or the cooling efficiency. The heating efficiency (= the heat transferred to bath 2/work performed) is

$$\eta_C^H = \frac{-Q_2}{W} = \frac{T_2}{T_2 - T_1} > 1 . \tag{3.7.9}$$

Since $\eta_C^H > 1$, this represents a more efficient method of heating than the direct conversion of electrical energy or other source of work into heat (this type of machine is called a heat pump). The formula however also shows that the use of heat pumps is reasonable only as long as $T_2 \approx T_1$; when the temperature of the heat bath (e.g. the Arctic Ocean) $T_1 \ll T_2$, it follows that $|Q_2| \approx |W|$, i.e. it would be just as effective to convert the work directly into heat. The cooling efficiency (= the quantity of heat removed from the cooler reservoir/work performed) is

$$\eta_C^K = \frac{Q_1}{W} = \frac{T_1}{T_2 - T_1} . \tag{3.7.10}$$
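The three Carnot figures of merit, (3.7.6b), (3.7.9) and (3.7.10), are simple enough to tabulate. The following sketch uses hypothetical function names and, as an illustrative assumption, reservoirs at 273 K and 293 K; it also checks the two identities $\eta_C^H = 1/\eta_C$ and $\eta_C^H - \eta_C^K = 1$, which follow directly from the formulas.

```python
def _check(t_cold, t_hot):
    """Reject unphysical inputs: temperatures in kelvin with 0 < T1 < T2."""
    if not 0.0 < t_cold < t_hot:
        raise ValueError("require 0 < t_cold < t_hot (kelvin)")

def carnot_efficiency(t_cold, t_hot):
    """Eq. (3.7.6b): eta_C = 1 - T1/T2."""
    _check(t_cold, t_hot)
    return 1.0 - t_cold / t_hot

def heating_efficiency(t_cold, t_hot):
    """Eq. (3.7.9): heat delivered to the warm bath per unit work, T2/(T2 - T1)."""
    _check(t_cold, t_hot)
    return t_hot / (t_hot - t_cold)

def cooling_efficiency(t_cold, t_hot):
    """Eq. (3.7.10): heat removed from the cold bath per unit work, T1/(T2 - T1)."""
    _check(t_cold, t_hot)
    return t_cold / (t_hot - t_cold)

T1, T2 = 273.0, 293.0  # illustrative reservoir temperatures in kelvin
eta = carnot_efficiency(T1, T2)
eta_h = heating_efficiency(T1, T2)
eta_k = cooling_efficiency(T1, T2)
print(f"eta_C = {eta:.4f}, eta_C^H = {eta_h:.2f}, eta_C^K = {eta_k:.2f}")
```

For a small temperature difference the heat-pump figure is large, which is the quantitative content of the remark that heat pumps are reasonable only as long as $T_2 \approx T_1$.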

For large-scale technical cooling applications, it is expedient to carry out the cooling process in several steps, i.e. as a cascade.

3.7.3 General Cyclic Processes

We now take up a general cyclic process (Fig. 3.24), in which heat exchange with the surroundings can take place at different temperatures, not necessarily only at the maximum and minimum temperature. We shall show that


Fig. 3.24. The general cyclic process


Fig. 3.25. The idealized (full curve) and real (dashed curve) sequence of the Carnot cycle

the efficiency $\eta$ obeys the inequality

$$\eta \le \eta_C , \tag{3.7.11}$$

where $\eta_C$ is the efficiency of a Carnot cycle operating between the two extreme temperatures. We decompose the process into sections with heat uptake ($\delta Q > 0$) and heat output ($\delta Q < 0$), and also allow irreversible processes to take place:

$$\mathcal{W} = Q = \oint \delta Q = \int_{\delta Q > 0} \delta Q + \int_{\delta Q < 0} \delta Q = Q_2 + Q_1 . \tag{3.7.12}$$

[…] $> 0$. In this connection we recall the process of isobaric heating discussed in Sect. 3.8.1. In the coexistence region, the temperature $T$ remains constant, since the heat put into the system is consumed by the phase transition. From (3.8.10), $Q_L = T \Delta S > 0$, it follows that $\Delta S > 0$. This can also be read off Fig. 3.34b, whose general form results from the concavity of $G$ and $\left(\frac{\partial G}{\partial T}\right)_P = -S < 0$.

3.8.2.2 Example Applications of the Clausius–Clapeyron Equation

We now wish to give some interesting examples of the application of the Clausius–Clapeyron equation.

(i) Liquid → gaseous: since, according to the previous considerations, $\Delta S > 0$ and the specific volume of the gas is larger than that of the liquid, $\Delta V > 0$, it follows that $\frac{dP_0}{dT} > 0$, i.e. the boiling temperature increases with increasing pressure (Table I.5 and Figs. 3.28(b) and 3.29(b)). Table I.6 contains the heats of vaporization of some substances at their boiling points under standard pressure, i.e. 760 Torr. Note the high value for water.

(ii) Solid → liquid: in the transition to the high-temperature phase, we have always $\Delta S > 0$. Usually, $\Delta V > 0$; then it follows that $\frac{dT}{dP} > 0$. In the case of water, $\Delta V < 0$ and thus $\frac{dT}{dP} < 0$. The fact that ice floats on water implies via the Clausius–Clapeyron equation that its melting point decreases on increasing the pressure (Fig. 3.29).

Note: There are a few other substances which expand on freezing, e.g. bismuth and gallium. The large volume increase of water on freezing (9.1%) is related to the open structure of ice, containing voids (the bonding is due to the formation of hydrogen bonds between the oxygen atoms, cf. Fig. 3.30). Therefore, the liquid phase is more dense. Below 4°C, i.e. still above the melting point $T_m$, the density of water begins to decrease on cooling (water anomaly) since local ordering occurs already at temperatures above $T_m$.
While as a rule a solid material sinks within its own liquid phase (melt), ice floats on water, in such a way that about 9/10 of the ice is under the surface of the water. This fact together with the density anomaly of water plays a very important role in Nature and is fundamental for the existence of life on the Earth. The volume change upon melting of ice is $V_L - V_S = (1.00 - 1.091)\ \mathrm{cm^3/g} = -0.091\ \mathrm{cm^3/g}$. The latent heat of melting per g is $Q = 80\ \mathrm{cal/g} = 80 \times 42.7\ \mathrm{atm\,cm^3/g}$. From this, it follows that the slope of the melting curve of ice near 0°C is

$$\frac{dP}{dT} = -\frac{80 \times 42.7}{273 \times 0.091}\ \frac{\mathrm{atm}}{\mathrm{K}} = -138\ \mathrm{atm/K} . \tag{3.8.12}$$
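The estimate (3.8.12) can be repeated in SI units. In the sketch below, the conversion factors (4.184 J per calorie, 101325 Pa per standard atmosphere, 98066.5 Pa per technical atmosphere) and the function name are assumptions introduced for the check; the input data are those of the text. The factor 42.7 used above corresponds to the technical atmosphere (1 kp/cm^2), since 1 cal ≈ 42.7 kp cm, so the result is printed in both pressure units.

```python
CAL = 4.184               # joules per calorie (assumed thermochemical value)
ATM_STANDARD = 101325.0   # pascal per standard atmosphere
ATM_TECHNICAL = 98066.5   # pascal per technical atmosphere (1 kp/cm^2)

def melting_slope(latent_heat, temperature, delta_v):
    """Clausius-Clapeyron slope dP/dT = Q_L / (T * Delta V).

    latent_heat in J/g, temperature in K, delta_v in m^3/g; result in Pa/K.
    """
    return latent_heat / (temperature * delta_v)

latent_heat = 80.0 * CAL            # 80 cal/g, heat of melting of ice
delta_v = (1.00 - 1.091) * 1.0e-6   # V_L - V_S in m^3/g (negative for water)
slope_pa = melting_slope(latent_heat, 273.0, delta_v)
slope_technical = slope_pa / ATM_TECHNICAL
slope_standard = slope_pa / ATM_STANDARD

print(f"dP/dT = {slope_pa:.3e} Pa/K")
print(f"      = {slope_technical:.1f} technical atmospheres per K")
print(f"      = {slope_standard:.1f} standard atmospheres per K")
```

Either way the melting curve is very steep and has a negative slope; the two pressure units differ by about 3%.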

3.8 Phases of Single-Component Systems


Fig. 3.30. The hexagonal structure of ice. The oxygen atoms are shown; they are connected to four neighbors via hydrogen bonds

The melting curve as a function of the temperature is very steep. It requires a pressure increase of 138 atm to lower the melting temperature by 1 K. This “freezing-point depression”, small as it is, enters into a number of phenomena in daily life. If a piece of ice at somewhat below 0°C is placed under increased pressure, it at first begins to melt. The necessary heat of melting is taken from the ice itself, and it therefore cools to a somewhat lower temperature, so that the melting process is interrupted as long as no more heat enters the ice from its surroundings. This is the so-called regelation of ice (= the alternating melting and freezing of ice caused by changes in its temperature and pressure).

Pressing together snow, which consists of ice crystals, to make a snowball causes the snow to melt to a small extent due to the increased pressure. When the pressure is released, it freezes again, and the snow crystals are glued together. The slickness of ice is essentially due to the fact that it melts at places where it is under pressure, so that between a sliding object and the surface of the ice there is a thin layer of liquid water, which acts like a lubricant, explaining e.g. the gliding motion of an ice skater. Part of the plasticity of glacial ice and its slow motion, like that of a viscous liquid, are also due to regelation of the ice. The lower portions of the glacier become movable as a result of the pressure from the weight of the ice above, but they freeze again when the pressure is released.

(iii) $^3$He, liquid → solid: the phase diagram of $^3$He is shown schematically in Fig. 3.31. At low temperatures, there is an interval where the melting curve falls. In this region, in the transition from liquid to solid (see the arrow in Fig. 3.31a), $\frac{dP}{dT} < 0$; furthermore, it is found experimentally that the volume of the solid phase is smaller than that of the liquid (as is the usual case), $\Delta V < 0$. We thus find from the Clausius–Clapeyron equation (3.8.8) $\Delta S > 0$, as expected from the general considerations in Remark (ii).

The Pomeranchuk effect: The fact that within the temperature interval mentioned above, the entropy increases on solidification is called the Pomeranchuk effect. It is employed for the purpose of reaching low temperatures (see Fig. 3.31b). Compression (dashed line) of liquid $^3$He leads to its solidification and, because of $\Delta S > 0$, to the uptake of heat. This causes a decrease in the temperature of the substance. Compression therefore causes the phase transition to proceed along the melting curve (see arrow in Fig. 3.31b).


Fig. 3.31. The phase diagram of $^3$He. (a) Isobaric solidification in the range where $\frac{dP}{dT} < 0$. (b) Pomeranchuk effect

This effect can be used to cool $^3$He; with it, temperatures down to $2 \times 10^{-3}$ K can be attained. The Pomeranchuk effect, however, has nearly no practical significance in low-temperature physics today. The currently most important methods for obtaining low temperatures are $^3$He-$^4$He dilution ($2 \times 10^{-3}$ to $5 \times 10^{-3}$ K) and adiabatic demagnetization of copper nuclei ($1.5 \times 10^{-6}$ to $12 \times 10^{-6}$ K), where the temperatures obtained are shown in parentheses.

(iv) The sublimation curve: We consider a solid (1), which is in equilibrium with a classical, ideal gas (2). For the volumes of the two phases, we have $V_1 \ll V_2$; then it follows from the Clausius–Clapeyron equation (3.8.11) that

$$\frac{dP}{dT} = \frac{Q_L}{T V_2} ,$$

where $Q_L$ represents the latent heat of sublimation. For $V_2$, we insert the ideal gas equation,

$$\frac{dP}{dT} = \frac{Q_L P}{k N T^2} . \tag{3.8.13}$$

This differential equation can be immediately integrated under the assumption that $Q_L$ is independent of temperature:

$$P = P_0\, e^{-q/kT} , \tag{3.8.14}$$

where $q = \frac{Q_L}{N}$ is the heat of sublimation per particle. Equation (3.8.14) yields the shape of the sublimation curve under the assumptions used. The vapor pressure of most solid materials is rather small, and in fact in most cases, no observable decrease with time in the amount of these substances due to evaporation is detected. Only a very few solid materials exhibit a readily observable sublimation and have as a result a noticeable vapor pressure, which increases with increasing temperature; among them are some solid perfume substances. Numerical values for the vapor pressure over ice and iodine are given in Tables I.8 and I.9.
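That (3.8.14) indeed solves (3.8.13) can be confirmed by integrating the differential equation numerically. The sketch below is an illustration under assumptions: the value $q/k = 6000$ K, the reference point (250 K, pressure set to 1 in arbitrary units) and the function names are not taken from the text. It uses a classical fourth-order Runge-Kutta step and compares the result with the closed form written relative to a reference point, $P(T) = P(T_{ref}) \exp[-(q/k)(1/T - 1/T_{ref})]$.

```python
import math

def vapor_pressure(t, t_ref, p_ref, q_over_k):
    """Closed form (3.8.14), normalized so that P(t_ref) = p_ref."""
    return p_ref * math.exp(-q_over_k * (1.0 / t - 1.0 / t_ref))

def integrate_sublimation_curve(t_ref, p_ref, t_end, q_over_k, steps=1000):
    """Integrate dP/dT = (q/k) P / T^2, Eq. (3.8.13), with classical RK4."""
    h = (t_end - t_ref) / steps
    rhs = lambda t, p: q_over_k * p / t**2
    t, p = t_ref, p_ref
    for _ in range(steps):
        k1 = rhs(t, p)
        k2 = rhs(t + 0.5 * h, p + 0.5 * h * k1)
        k3 = rhs(t + 0.5 * h, p + 0.5 * h * k2)
        k4 = rhs(t + h, p + h * k3)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return p

Q_OVER_K = 6000.0            # kelvin; illustrative value only
T_REF, P_REF, T_END = 250.0, 1.0, 270.0
p_numeric = integrate_sublimation_curve(T_REF, P_REF, T_END, Q_OVER_K)
p_exact = vapor_pressure(T_END, T_REF, P_REF, Q_OVER_K)
print(f"numerical : {p_numeric:.6f}")
print(f"closed form: {p_exact:.6f}")
```

The steep growth of the vapor pressure over a 20 K interval illustrates why sublimation becomes noticeable only for materials and temperatures where $q/kT$ is not too large.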


At temperatures well below 0°C and in dry air, one can observe a gradual disappearance of snow, which is converted directly into water vapor by sublimation. The reverse phenomenon is the direct formation of frost from water vapor in the air, or the condensation of snow crystals in the cool upper layers of the atmosphere. If iodine crystals are introduced into an evacuated glass vessel and a spot on the glass wall is cooled, then solid iodine condenses from the iodine vapor which forms in the vessel. Iodine crystals which are left standing in the open air, naphthalene crystals (“moth balls”), and certain mercury salts, including “sublimate” (HgCl2), among others, gradually vanish due to sublimation.

3.8.3 The Convexity of the Free Energy and the Concavity of the Free Enthalpy (Gibbs’ Free Energy)

We now return again to the gas-liquid transition, in order to discuss some additional aspects of evaporation and the curvature of the thermodynamic potentials. The coexistence region and the coexistence curve are clearly visible in the $T$-$V$ diagram. Instead, one often uses a $P$-$V$ diagram. From the projection of the three-dimensional $P$-$V$-$T$ diagram, we can see the shape drawn in Fig. 3.32. From the shape of the isotherms in the $P$-$V$ diagram, the free energy can be determined analytically and graphically. Owing to $\left(\frac{\partial F}{\partial V}\right)_T = -P$, it follows for the free energy that

Fig. 3.32. The isotherms PT (V ) and the free energy as a function of the volume during evaporation; the thin line is the coexistence curve


Fig. 3.33. The determination of the free enthalpy from the free energy by construction

$$F(T, V) - F(T, V_0) = -\int_{V_0}^{V} dV'\, P_T(V') . \tag{3.8.15}$$

One immediately sees that the isotherms in Fig. 3.32 lead qualitatively to the volume dependence of the free energy which is drawn below. The free energy is convex (curved upwards). The fundamental cause of this is the fact that the compressibility is positive:

$$\frac{\partial^2 F}{\partial V^2} = -\frac{\partial P}{\partial V} \propto \frac{1}{\kappa_T} > 0 , \quad \text{while} \quad \left(\frac{\partial^2 F}{\partial T^2}\right)_V = -\left(\frac{\partial S}{\partial T}\right)_V \propto -C_V < 0 . \tag{3.8.16}$$

These inequalities are based upon the stability relations proved previously, (3.3.5, 3.3.6), and (3.6.48a,b). The free enthalpy or Gibbs’ free energy $G(T, P) = F + PV$ can be constructed from $F(T, V)$. Due to $P = -\left(\frac{\partial F}{\partial V}\right)_T$, $G(T, P)$ is obtained from $F(T, V)$ by constructing a tangent to $F(T, V)$ with the slope $-P$ (see Fig. 3.33). The intersection of this tangent with the ordinate has the coordinates

$$F(T, V) - V \left(\frac{\partial F}{\partial V}\right)_T = F + VP = G(T, P) . \tag{3.8.17}$$

The result of this construction is drawn in Fig. 3.34. The derivatives of the free enthalpy

$$\left(\frac{\partial G}{\partial P}\right)_T = V \quad \text{and} \quad \left(\frac{\partial G}{\partial T}\right)_P = -S$$

yield the volume and the entropy. They are discontinuous at a phase transition, which results in a kink in the curves. Here, $P_0(T)$ is the evaporation


Fig. 3.34. The free enthalpy (Gibbs’ free energy) as a function of (a) the pressure and (b) the temperature.

pressure at the temperature T , and T0 (P ) is the evaporation temperature at the pressure P . From this construction, one can also see that the free enthalpy is concave (Fig. 3.34). The curvatures are negative because κT > 0 and CP > 0. The signs of the slopes result from V > 0 and S > 0. It is also readily seen from the ﬁgures that the entropy increases as a result of a transition to a higher-temperature phase, and the volume decreases as a result of a transition to a higher-pressure phase. These consequences of the stability conditions hold quite generally. In the diagrams (3.34a,b), the terms gas and liquid phases could be replaced by low-pressure and high-pressure or high-temperature and low-temperature phases. On melting, the latent heat must be added to the system, on freezing (solidifying), it must be removed. When heat is put into or taken out of a system at constant pressure, it is employed to convert the solid phase to the liquid or vice versa. In the coexistence region, the temperature remains constant during these processes. This is the reason why in late Autumn and early Spring the temperature near the Earth remains close to zero degrees Celsius, the freezing point of water.
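The tangent construction (3.8.17) can be tried out on a simple isotherm. The sketch below assumes the ideal-gas isotherm $P_T(V) = NkT/V$ (not the isotherm with a coexistence region shown in Fig. 3.32), so that (3.8.15) gives $F(T, V) - F(T, V_0) = -NkT \log(V/V_0)$; the units $NkT = 1$, $V_0 = 1$ and the function names are choices made for the illustration. The slope of $F$ is estimated by a central difference, and the intercept of the tangent is compared with the closed form $G = NkT\,[1 - \log(V/V_0)]$.

```python
import math

NKT = 1.0  # N k T, used as the energy unit
V0 = 1.0   # reference volume at which F is set to zero

def free_energy(v):
    """F(T,V) - F(T,V0) from (3.8.15) for the isotherm P_T(V) = NkT / V."""
    return -NKT * math.log(v / V0)

def pressure(v, h=1e-6):
    """P = -(dF/dV)_T, estimated by a central difference."""
    return -(free_energy(v + h) - free_energy(v - h)) / (2.0 * h)

def gibbs_from_tangent(v):
    """Intercept of the tangent to F at volume v, Eq. (3.8.17).

    Returns the pair (P, G) with G = F - V dF/dV = F + P V.
    """
    p = pressure(v)
    return p, free_energy(v) + p * v

# Volumes chosen so that the pressures are 0.5, 1.25 and 2.0
points = [gibbs_from_tangent(v) for v in (2.0, 0.8, 0.5)]
(p_low, g_low), (p_mid, g_mid), (p_high, g_high) = points
for p, g in points:
    print(f"P = {p:.4f}  G = {g:.4f}")
```

Since 1.25 is the midpoint of the two outer pressures, comparing the middle value of $G$ with the average of the outer two gives a direct check of the concavity of $G$ in $P$.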

3.8.4 The Triple Point

At the triple point (Figs. 3.26 and 3.35), the solid, liquid and gas phases coexist in equilibrium. The condition for equilibrium of the gaseous, liquid and solid phases, or more generally for three phases 1, 2 and 3, is:

$$\mu_1(T, P) = \mu_2(T, P) = \mu_3(T, P) , \tag{3.8.18}$$

and it determines the triple point pressure and the triple point temperature Pt , Tt . In the P -T diagram, the triple point is in fact a single point. In the T -V diagram it is represented by the horizontal line drawn in Fig. 3.35b. Along this line, the three phases are in equilibrium. If the phase diagram is


Fig. 3.35. The triple point (a) in a $P$-$T$ diagram, where the phases are denoted by 1, 2, 3 and the coexistence regions are marked as 3-2 etc., denoting the coexistence of phase 3 and phase 2 on the two branches of the coexistence curve; (b) in a $T$-$v$ diagram; and (c) in a $v$-$s$ diagram

represented in terms of two extensive variables, e.g. $V$ and $S$ as in Fig. 3.35c, then the triple point becomes a triangular area, as is visible in the figure. At each point on this triangle, the states of the three phases 1, 2, and 3 corresponding to the vertices of the triangle coexist with one another. We now want to describe this more precisely. Let $s_1$, $s_2$ and $s_3$ be the entropies per particle in the phases 1, 2 and 3 just at the triple point,

$$s_i = -\left(\frac{\partial\mu_i}{\partial T}\right)_P\bigg|_{T_t,P_t} ,$$

and correspondingly, let $v_1$, $v_2$, $v_3$ be the specific volumes,

$$v_i = \left(\frac{\partial\mu_i}{\partial P}\right)_T\bigg|_{T_t,P_t} .$$

The points $(s_i, v_i)$ are shown in the $s$-$v$ diagram as points 1, 2, 3. Clearly, every pair of phases can coexist with each other; the lines connecting the points 1 and 2 etc. yield the triangle with vertices 1, 2, and 3. The coexistence curves of two phases, e.g. 1 and 2, are found in the $s$-$v$ diagram from

$$s_i(T) = -\left(\frac{\partial\mu_i}{\partial T}\right)_P\bigg|_{P_0(T)} \quad\text{and}\quad v_i(T) = \left(\frac{\partial\mu_i}{\partial P}\right)_T\bigg|_{P_0(T)}$$

with $i = 1$ and 2, along with the associated phase-boundary curve $P = P_0(T)$. Here, the temperature is a parameter; points on the two branches of the coexistence curves with the same value of $T$ can coexist with each other. The diagram in Fig. 3.35c is only schematic. The (by no means parallel) lines within the two-phase coexistence areas show which of the pairs of single-component states can coexist with each other on the two branches of the coexistence line.

Now we turn to the interior of the triangular area in Fig. 3.35c. It is immediately clear that the three triple-point phases 1, 2, 3 can coexist with each other at the temperature $T_t$ and pressure $P_t$ in arbitrary quantities. This also means that a given amount of the substance can be distributed among these three phases in arbitrary fractions $c_1$, $c_2$, $c_3$ ($0 \le c_i \le 1$),

$$c_1 + c_2 + c_3 = 1 , \tag{3.8.19a}$$

and then will have the total specific entropy

$$c_1 s_1 + c_2 s_2 + c_3 s_3 = s \tag{3.8.19b}$$


and the total specific volume

$$c_1 v_1 + c_2 v_2 + c_3 v_3 = v . \tag{3.8.19c}$$

From (3.8.19a,b,c), it follows that $s$ and $v$ lie within the triangle in Fig. 3.35c. Conversely, every (heterogeneous) equilibrium state with the total specific entropy $s$ and specific volume $v$ can exist within the triangle, where $c_1$, $c_2$, $c_3$ follow from (3.8.19a–c). Eqns. (3.8.19a–c) can be interpreted by the following center-of-gravity rule: let a point $(s, v)$ within the triangle in the $v$-$s$ diagram (see Fig. 3.35c) be given. The fractions $c_1$, $c_2$, $c_3$ must be chosen in such a way that attributing masses $c_1$, $c_2$, $c_3$ to the vertices 1, 2, 3 of the triangle leads to a center of gravity at the position $(s, v)$. This can be immediately understood if one writes (3.8.19b,c) in the two-component form:

$$c_1 \begin{pmatrix} s_1 \\ v_1 \end{pmatrix} + c_2 \begin{pmatrix} s_2 \\ v_2 \end{pmatrix} + c_3 \begin{pmatrix} s_3 \\ v_3 \end{pmatrix} = \begin{pmatrix} s \\ v \end{pmatrix} . \tag{3.8.20}$$

Remarks:

(i) Apart from the center-of-gravity rule, the linear equations can be solved algebraically:

$$c_1 = \frac{\begin{vmatrix} 1 & 1 & 1 \\ s & s_2 & s_3 \\ v & v_2 & v_3 \end{vmatrix}}{\begin{vmatrix} 1 & 1 & 1 \\ s_1 & s_2 & s_3 \\ v_1 & v_2 & v_3 \end{vmatrix}} , \qquad c_2 = \frac{\begin{vmatrix} 1 & 1 & 1 \\ s_1 & s & s_3 \\ v_1 & v & v_3 \end{vmatrix}}{\begin{vmatrix} 1 & 1 & 1 \\ s_1 & s_2 & s_3 \\ v_1 & v_2 & v_3 \end{vmatrix}} , \qquad c_3 = \frac{\begin{vmatrix} 1 & 1 & 1 \\ s_1 & s_2 & s \\ v_1 & v_2 & v \end{vmatrix}}{\begin{vmatrix} 1 & 1 & 1 \\ s_1 & s_2 & s_3 \\ v_1 & v_2 & v_3 \end{vmatrix}} .$$

(ii) Making use of the triple point gives a precise standard for a temperature and a pressure, since the coexistence of the three phases can be verified without a doubt. From Fig. 3.35c, it can also be seen that the triple point is not a point as a function of the experimentally controllable parameters, but rather the whole area of the triangle. The parameters which can be directly varied from outside the system are not $P$ and $T$, but rather the volume $V$ and the entropy $S$, which can be varied by performing work on the system or by transferring heat to it. If heat is put into the system at the point marked by a cross (Fig. 3.35c), then in the example of water, some ice would melt, but the state would still remain within the triangle. This explains why the triple point is insensitive to changes within wide limits and is therefore very suitable as a temperature fixed point.

(iii) For water, $T_t = 273.16$ K and $P_t = 4.58$ Torr. As explained in Sect. 3.4, the absolute temperature scale is determined by the triple point of water. In order to reach the triple point, one simply needs to distill highly pure water


Fig. 3.36. A triple-point cell: ice, water, and water vapor are in equilibrium with each other. A freezing mixture in contact with the inner walls causes some water to freeze there. It is then replaced by the thermometer bulb, and a ﬁlm of liquid water forms on the inner wall

into a container and to seal it oﬀ after removing all the air. One then has water and water vapor in coexistence (coexistence region 1-2 in Fig. 3.35c). Removing heat by means of a freezing mixture brings the system into the triple-point range. As long as all three phases are present, the temperature equals Tt (see Fig. 3.36).
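The center-of-gravity rule (3.8.19a–c) can be sketched numerically. The following minimal Python example solves the three linear equations by the determinant formulas of remark (i); the function names and all numerical values are hypothetical illustrations in arbitrary units, not data for any real substance.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def phase_fractions(s, v, s_phases, v_phases):
    """Solve (3.8.19a-c) for (c1, c2, c3) by Cramer's rule."""
    s1, s2, s3 = s_phases
    v1, v2, v3 = v_phases
    denom = det3([[1, 1, 1], [s1, s2, s3], [v1, v2, v3]])
    if denom == 0:
        raise ValueError("the three phase points (s_i, v_i) are collinear")
    c1 = det3([[1, 1, 1], [s, s2, s3], [v, v2, v3]]) / denom
    c2 = det3([[1, 1, 1], [s1, s, s3], [v1, v, v3]]) / denom
    c3 = det3([[1, 1, 1], [s1, s2, s], [v1, v2, v]]) / denom
    return c1, c2, c3


def inside_triangle(fractions, tol=1e-12):
    """A point (s, v) lies in the triple-point triangle iff all 0 <= c_i <= 1."""
    return all(-tol <= c <= 1 + tol for c in fractions)


# Hypothetical vertices (s_i, v_i) of the triangle, arbitrary units.
S_PHASES = (1.0, 3.0, 9.0)
V_PHASES = (1.0, 1.2, 50.0)

# The point (s, v) = (3.2, 10.86) was constructed from the fractions
# (0.5, 0.3, 0.2), so the solver should recover them up to rounding.
fractions = phase_fractions(3.2, 10.86, S_PHASES, V_PHASES)
print(fractions, inside_triangle(fractions))
```

A point outside the triangle yields at least one fraction outside the interval [0, 1], which is how the sketch distinguishes three-phase coexistence from the neighboring two-phase and single-phase regions.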

3.9 Equilibrium in Multicomponent Systems

3.9.1 Generalization of the Thermodynamic Potentials

We consider a homogeneous mixture of $n$ materials, or as one says in this connection, components, whose particle numbers are $N_1, N_2, \ldots, N_n$. We first need to generalize the thermodynamic relations to this situation. To this end, we refer to Chap. 2. Now, the phase-space volume and similarly the entropy are functions of the energy, the volume, and all of the particle numbers:

$$S = S(E, V, N_1, \ldots, N_n) . \tag{3.9.1}$$

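All the thermodynamic relations can be generalized to this case by replacing $N$ and $\mu$ by $N_i$ and $\mu_i$ and summing over $i$. We define the chemical potential of the $i$th material by

$$\mu_i = -T\left(\frac{\partial S}{\partial N_i}\right)_{E,V,\{N_{k\neq i}\}} \tag{3.9.2a}$$

and, as before,

$$\frac{1}{T} = \left(\frac{\partial S}{\partial E}\right)_{V,\{N_k\}} \quad\text{and}\quad \frac{P}{T} = \left(\frac{\partial S}{\partial V}\right)_{E,\{N_k\}} . \tag{3.9.2b,c}$$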


Then for the differential of the entropy, we find

$$dS = \frac{1}{T}\,dE + \frac{P}{T}\,dV - \sum_{i=1}^{n}\frac{\mu_i}{T}\,dN_i , \tag{3.9.3}$$

and from it the First Law

$$dE = T\,dS - P\,dV + \sum_{i=1}^{n}\mu_i\,dN_i \tag{3.9.4}$$

for this mixture. The Gibbs–Duhem relation for homogeneous mixtures reads

$$E = TS - PV + \sum_{i=1}^{n}\mu_i N_i . \tag{3.9.5}$$

It is obtained analogously to Sect. 3.1.3, by differentiating

$$\alpha E = E(\alpha S, \alpha V, \alpha N_1, \ldots, \alpha N_n) \tag{3.9.6}$$

with respect to $\alpha$. From (3.9.4) and (3.9.5), we find the differential form of the Gibbs–Duhem relation for mixtures

$$-S\,dT + V\,dP - \sum_{i=1}^{n} N_i\,d\mu_i = 0 . \tag{3.9.7}$$

It can be seen from this relation that of the $n + 2$ variables $(T, P, \mu_1, \ldots, \mu_n)$, only $n + 1$ are independent. The free enthalpy (Gibbs’ free energy) is defined by

$$G = E - TS + PV . \tag{3.9.8}$$

From the First Law, (3.9.4), we obtain its differential form:

$$dG = -S\,dT + V\,dP + \sum_{i=1}^{n}\mu_i\,dN_i . \tag{3.9.9}$$

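From (3.9.9), we can read off

$$S = -\left(\frac{\partial G}{\partial T}\right)_{P,\{N_k\}} , \qquad V = \left(\frac{\partial G}{\partial P}\right)_{T,\{N_k\}} , \qquad \mu_i = \left(\frac{\partial G}{\partial N_i}\right)_{T,P,\{N_{k\neq i}\}} . \tag{3.9.10}$$

For homogeneous mixtures, using the Gibbs–Duhem relation (3.9.5) we find for the free enthalpy (3.9.8)

$$G = \sum_{i=1}^{n}\mu_i N_i . \tag{3.9.11}$$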


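Then we have

$$S = -\sum_{i=1}^{n}\left(\frac{\partial\mu_i}{\partial T}\right)_P N_i , \qquad V = \sum_{i=1}^{n}\left(\frac{\partial\mu_i}{\partial P}\right)_T N_i . \tag{3.9.12}$$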
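The step from (3.9.4) and (3.9.5) to (3.9.7) and (3.9.11) is short enough to write out. The following lines are a sketch that uses only quantities already defined in this section. Taking the total differential of (3.9.5) gives

$$dE = T\,dS + S\,dT - P\,dV - V\,dP + \sum_{i=1}^{n}\left(\mu_i\,dN_i + N_i\,d\mu_i\right) ,$$

and subtracting the First Law (3.9.4) leaves

$$0 = S\,dT - V\,dP + \sum_{i=1}^{n} N_i\,d\mu_i ,$$

which is (3.9.7) up to an overall sign. In the same way, inserting (3.9.5) into the definition (3.9.8) gives

$$G = \left(TS - PV + \sum_{i=1}^{n}\mu_i N_i\right) - TS + PV = \sum_{i=1}^{n}\mu_i N_i ,$$

i.e. (3.9.11).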

The chemical potentials are intensive quantities and therefore depend only on $T$, $P$ and the $n - 1$ concentrations $c_1 = \frac{N_1}{N}, \ldots, c_{n-1} = \frac{N_{n-1}}{N}$ ($N = \sum_{i=1}^{n} N_i$, $c_n = 1 - c_1 - \ldots - c_{n-1}$). The grand canonical potential is defined by

$$\Phi = E - TS - \sum_{i=1}^{n}\mu_i N_i . \tag{3.9.13}$$

For its differential, we find using the First Law (3.9.4)

$$d\Phi = -S\,dT - P\,dV - \sum_{i=1}^{n} N_i\,d\mu_i . \tag{3.9.14}$$

For homogeneous mixtures, we obtain using the Gibbs–Duhem relation (3.9.5)

$$\Phi = -PV . \tag{3.9.15}$$

The density matrix for mixtures depends on the total Hamiltonian and will be introduced in Chap. 5.

3.9.2 Gibbs’ Phase Rule and Phase Equilibrium

We consider $n$ chemically different materials (components), which can be in $r$ phases (Fig. 3.37) and between which no chemical reactions are assumed to take place. The following equilibrium conditions hold: Temperature $T$ and pressure $P$ must have uniform values in the whole system. Furthermore, for each component $i$, the chemical potential must be the same in each of the phases. These equilibrium conditions can be derived directly by considering the microcanonical ensemble, or also from the stationarity of the entropy.

Fig. 3.37. Equilibrium between 3 phases


(i) As a first possibility, let us consider a microcanonical ensemble consisting of $n$ chemical substances, and decompose it into $r$ parts. Calculating the probability of a particular distribution of the energy, the volume and the particle numbers over these parts, one obtains for the most probable distribution the equality of the temperature, pressure and the chemical potentials of each component.

(ii) As a second possibility for deriving the equilibrium conditions, one can start from the maximization of the entropy in equilibrium, (3.6.36b)

$$dS \ge \frac{1}{T}\left(dE + P\,dV - \sum_{i=1}^{n}\mu_i\,dN_i\right) , \tag{3.9.16}$$

and can then employ the resulting stationarity of the equilibrium state for fixed $E$, $V$, and $\{N_i\}$,

$$\delta S = 0 \tag{3.9.17}$$

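with respect to virtual variations. One can then proceed as in Sect. 3.6.5, decomposing a system into two parts 1 and 2, and varying not only the energy and the volume, but also the particle numbers [see (3.6.44)]:

$$\delta S = \left(\frac{\partial S_1}{\partial E_1} - \frac{\partial S_2}{\partial E_2}\right)\delta E_1 + \left(\frac{\partial S_1}{\partial V_1} - \frac{\partial S_2}{\partial V_2}\right)\delta V_1 + \sum_i\left(\frac{\partial S_1}{\partial N_{i,1}} - \frac{\partial S_2}{\partial N_{i,2}}\right)\delta N_{i,1} + \ldots . \tag{3.9.18}$$

Here, $N_{i,1}$ ($N_{i,2}$) is the particle number of component $i$ in the subsystem 1 (2). From the condition of vanishing variation, the equality of the temperatures and pressures follows:

$$T_1 = T_2 , \qquad P_1 = P_2 \tag{3.9.19}$$

and furthermore $\frac{\partial S_1}{\partial N_{i,1}} = \frac{\partial S_2}{\partial N_{i,2}}$, i.e. the equality of the chemical potentials

$$\mu_{i,1} = \mu_{i,2} \qquad\text{for } i = 1, \ldots, n . \tag{3.9.20}$$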

We have thus now derived the equilibrium conditions formulated at the beginning of this section, and we wish to apply them to $n$ chemical substances in $r$ phases (Fig. 3.37). In particular, we want to find out how many phases can coexist in equilibrium. Along with the equality of temperature and pressure in the whole system, from (3.9.20) the chemical potentials must also be equal,

$$\begin{aligned} \mu_1^{(1)} &= \ldots = \mu_1^{(r)} , \\ &\;\;\vdots \\ \mu_n^{(1)} &= \ldots = \mu_n^{(r)} . \end{aligned} \tag{3.9.21}$$


The upper index refers to the phases, and the lower one to the components. Equations (3.9.21) represent all together $n(r-1)$ conditions on the $2+(n-1)r$ variables $(T, P, c_1^{(1)}, \ldots, c_{n-1}^{(1)}, \ldots, c_1^{(r)}, \ldots, c_{n-1}^{(r)})$. The number of quantities which can be varied (i.e. the number of degrees of freedom) is therefore equal to $f = 2 + (n-1)r - n(r-1)$:

$$f = 2 + n - r . \tag{3.9.22}$$

This relation (3.9.22) is called Gibbs’ phase rule. In this derivation we have assumed that each substance is present in all $r$ phases. We can easily relax this assumption. If for example substance 1 is not present in phase 1, then the condition on $\mu_1^{(1)}$ does not apply. The particle number of component 1 then also no longer occurs as a variable in phase 1. One thus has one condition and one variable less than before, and Gibbs’ phase rule (3.9.22) still applies.¹²

¹² The number of degrees of freedom is a statement about the intensive variables; there are however also variations of the extensive variables. For example, at a triple point, $f = 0$, the entropy and the volume can vary within a triangle (Sect. 3.8.4).

Examples of Applications of Gibbs’ Phase Rule:

(i) For a single-component system, $n = 1$:
r = 1, f = 2: $T$, $P$ free
r = 2, f = 1: $P = P_0(T)$, phase-boundary curve
r = 3, f = 0: fixed point (triple point).

(ii) An example for a two-component system, $n = 2$, is a mixture of sal ammoniac and water, NH₄Cl+H₂O. The possible phases are: water vapor (it contains practically no NH₄Cl), the liquid mixture (solution), ice (containing some of the salt), the salt (containing some H₂O). Possible coexisting phases are:
• liquid phase: r = 1, f = 3 (variables $P$, $T$, $c$)
• liquid phase + water vapor: r = 2, f = 2, variables $P$, $T$; the concentration is a function of $P$ and $T$: $c = c(P, T)$.
• liquid phase + water vapor + one solid phase: r = 3, f = 1. Only one variable, e.g. the temperature, is freely variable.
• liquid phase + vapor + ice + salt: r = 4, f = 0. This is the eutectic point.

The phase diagram of the liquid and the solid phases is shown in Fig. 3.38. At the concentration 0, the melting point of pure ice can be seen, and at $c = 1$, that of the pure salt. Since the freezing point of a solution is lowered (see Chap. 5), we can understand the shape of the two branches of the freezing-point curve as a function of the concentration. The two branches meet at the eutectic point. In the region ice-liq., ice and liquid coexist along the horizontal lines, and in the region liq.-salt, liquid and salt. The concentration of NH₄Cl in the ice


is considerably lower than in the liquid mixture which is in equilibrium with it. The solid phases often contain only the pure components; then the left-hand and the right-hand limiting lines are identical with the two vertical lines at $c = 0$ and $c = 1$. At the eutectic point, the liquid mixture is in equilibrium with the ice and with the salt. If the concentration of a liquid is less than that corresponding to the eutectic point, then ice forms on cooling the system. In this process, the concentration in the liquid increases until finally the eutectic concentration is reached, at which the liquid is converted to ice and salt. The resulting mixture of salt and ice crystals is called the eutectic. At the eutectic concentration, the liquid has its lowest freezing point.
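Gibbs’ phase rule (3.9.22) and the examples of its application can be tabulated with a few lines of Python. This is only a sketch; the function names are hypothetical and the rule assumes, as in the text, that no chemical reactions take place.

```python
def degrees_of_freedom(n_components, n_phases):
    """Number of freely variable intensive quantities, f = 2 + n - r (3.9.22)."""
    if n_components < 1 or n_phases < 1:
        raise ValueError("need at least one component and one phase")
    f = 2 + n_components - n_phases
    if f < 0:
        raise ValueError("more than n + 2 phases cannot coexist in equilibrium")
    return f


def max_coexisting_phases(n_components):
    """Largest r compatible with f >= 0, i.e. r = n + 2."""
    return n_components + 2


# Single-component system: T and P free, phase-boundary curve, triple point.
for r in (1, 2, 3):
    print("n = 1, r =", r, "-> f =", degrees_of_freedom(1, r))

# Two-component system such as NH4Cl + H2O: up to four phases (eutectic point).
for r in (1, 2, 3, 4):
    print("n = 2, r =", r, "-> f =", degrees_of_freedom(2, r))
```

As footnote 12 points out, $f$ counts only intensive variables; extensive quantities such as the entropy and the volume can still vary at $f = 0$.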

Fig. 3.38. The phase diagram of a mixture of sal ammoniac (ammonium chloride) and water. In the horizontally shaded regions, ice and liquid, liquid and solid salt, and ﬁnally ice and solid salt coexist with each other.

The phase diagram in Fig. 3.38 for the liquid and solid phases and the corresponding interpretation using Gibbs’ phase rule can be applied to the following physical situations: (i) when the pressure is so low that also a gaseous phase (not shown) is present; (ii) without the gas phase at constant pressure $P$, in which case a degree of freedom is unavailable; or (iii) in the presence of air at the pressure $P$ and vapor dissolved in it with the partial pressure $cP$.¹³ The concentration of the vapor $c$ in the air enters the chemical potential as $\log cP$ (see Chap. 5). It adjusts itself in such a way that the chemical potential of the water vapor is equal to the chemical potential in the liquid mixture. It should be pointed out that owing to the term $\log c$, the chemical potential of the vapor dissolved in the air is lower than that of the pure vapor. While at atmospheric pressure, boiling begins only at 100 °C, and then the whole liquid phase is converted to vapor, here, even at very low temperatures a sufficient amount enters the vapor phase to permit the $\log c$ term to bring about the equalization of the chemical potentials.

¹³ Gibbs’ phase rule is clearly still obeyed: compared to (ii), there is one component (air) more and also one more phase (air-vapor mixture) present.

The action of freezing mixtures becomes clear from the phase diagram 3.38. For example, if NaCl and ice at a temperature of 0 °C are brought together, then they are not in equilibrium. Some of the ice will melt, and the salt will dissolve in the resulting liquid water. Its concentration is, to be sure, much too high to be in equilibrium with the ice, so that more ice melts. In the melting process,


heat is taken up, the entropy increases, and thus the temperature is lowered. This process continues until the temperature of the eutectic point has been reached. Then the ice, the hydrated salt NaCl·2H₂O, and liquid with the eutectic concentration are in equilibrium with each other. For NaCl and H₂O, the eutectic temperature is −21 °C. The resulting mixture is termed a freezing mixture. It can be used to hold the temperature constant at −21 °C. Uptake of heat does not lead to an increase of the temperature of the freezing mixture, but rather to continued melting of the ice and dissolution of NaCl at a constant temperature.

Eutectic mixtures always occur when there is a miscibility gap between the two solid phases and the free energy of the liquid mixture is lower than that of the two solid phases (see problem 3.28). The melting point of the eutectic mixture is then considerably lower than the melting points of the two solid phases (see Table I.10).

3.9.3 Chemical Reactions, Thermodynamic Equilibrium and the Law of Mass Action

In this section we consider systems with several components, in which the particle numbers can change as a result of chemical reactions. We first determine the general condition for chemical equilibrium and then investigate mixtures of ideal gases.

3.9.3.1 The Condition for Chemical Equilibrium

Reaction equations, such as for example

$$2\mathrm{H}_2 + \mathrm{O}_2 \rightleftharpoons 2\mathrm{H}_2\mathrm{O} , \tag{3.9.23}$$

can in general be written in the form

$$\sum_{j=1}^{n}\nu_j A_j = 0 , \tag{3.9.24}$$

where the $A_j$ are the chemical symbols and the stoichiometric coefficients $\nu_j$ are (small) integers, which indicate the participation of the components in the reaction. We will adopt the convention that components on the left-hand side of the reaction equation have positive $\nu_j$ and those on the right-hand side negative $\nu_j$. The reaction equation (3.9.24) contains neither any information about the concentrations at which the $A_j$ are present in thermodynamic and chemical equilibrium at a given temperature and pressure, nor about the direction in which the reaction will proceed. The change in the Gibbs free energy (≡ free enthalpy) with particle number at fixed temperature $T$ and fixed pressure $P$ for single-phase systems is¹⁴

$$dG = \sum_{j=1}^{n}\mu_j\,dN_j . \tag{3.9.25}$$

¹⁴ Chemical reactions in systems consisting of several phases are treated in M.W. Zemansky and R.H. Dittman, Heat and Thermodynamics, McGraw-Hill, Auckland, Sixth Edition, 1987.

In equilibrium, the $N_j$ must be determined in such a way that $G$ remains stationary,

$$\sum_{j=1}^{n}\mu_j\,dN_j = 0 . \tag{3.9.26}$$

If an amount $dM$ participates in the reaction, then $dN_j = \nu_j\,dM$. The condition of stationarity then requires

$$\sum_{j=1}^{n}\mu_j\nu_j = 0 . \tag{3.9.27}$$

For every chemical reaction that is possible in the system, a relation of this type holds. It suffices for a fundamental understanding to determine the chemical equilibrium for a single reaction. The chemical potentials $\mu_j(T, P)$ depend not only on the pressure and the temperature, but also on the relative particle numbers (concentrations). The latter adjust themselves in such a way in chemical equilibrium that (3.9.27) is fulfilled. In the case that substances which can react chemically are in thermal equilibrium, but not in chemical equilibrium, then from the change in Gibbs’ free energy,

$$\delta G = \sum_j \mu_j(T, P)\,\nu_j\,\delta M \tag{3.9.25'}$$

we can determine the direction which the reaction will take. Since $G$ is a minimum at equilibrium, we must have $\delta G \le 0$; cf. Eq. (3.6.38b). The chemical composition is shifted towards the direction of smaller free enthalpy or lower chemical potentials.

Remarks:

(i) The condition for chemical equilibrium (3.9.27) can be interpreted to mean that the chemical potential of a compound is equal to the sum of the chemical potentials of its constituents.

(ii) The equilibrium condition (3.9.27) for the reaction (3.9.24) holds also when the system consists of several phases which are in contact with each other and between which the reactants can pass. This is shown by the equality of the chemical potential of each component in all of the phases which are in equilibrium with each other.

(iii) Eq. (3.9.27) can also be used to determine the equilibrium distribution of elementary particles which are transformed into one another by reactions.


For example, the distribution of electrons and positrons which are subject to pair annihilation, $e^- + e^+ \rightleftharpoons \gamma$, can be found (see problem 3.31). These applications of statistical mechanics are important in cosmology, in the description of the early stages of the Universe, and for the equilibria of elementary-particle reactions in stars.

3.9.3.2 Mixtures of Ideal Gases

To continue the evaluation of the equilibrium condition (3.9.27), we require information about the chemical potentials. In the following, we consider reactions in (classical) ideal gases. In Sect. 5.2, we show that the chemical potential of particles of type $j$ in a mixture of ideal molecular gases can be written in the form

$$\mu_j = f_j(T) + kT\log c_j P , \tag{3.9.28a}$$

where $c_j = \frac{N_j}{N}$ holds and $N$ is the total number of particles. The function $f_j(T)$ depends solely on temperature and contains the microscopic parameters of the gas of type $j$. From (3.9.27) and (3.9.28a), it follows that

$$\prod_j e^{\nu_j\left[f_j(T)/kT + \log(c_j P)\right]} = 1 . \tag{3.9.29}$$

According to Sect. 5.2, Eq. (5.2.4′) is valid:

$$f_j(T) = \varepsilon^0_{\mathrm{el},j} - c_{P,j}\,T\log kT - kT\zeta_j . \tag{3.9.28b}$$

Inserting (3.9.28b) into (3.9.29) yields the product of the powers of the concentrations:

$$\prod_j c_j^{\nu_j} = K(T, P) \equiv e^{\sum_j \nu_j\left(\zeta_j - \frac{\varepsilon^0_{\mathrm{el},j}}{kT}\right)}\,(kT)^{\sum_j c_{P,j}\nu_j/k}\,P^{-\sum_j \nu_j} ; \tag{3.9.30}$$

where $\varepsilon^0_{\mathrm{el},j}$ is the electronic energy, $c_{P,j}$ the specific heat of component $j$ at constant pressure, and $\zeta_j$ is the chemical constant

$$\zeta_j = \log\frac{2\,m_j^{3/2}}{k\,\Theta_{r,j}\,(2\pi\hbar^2)^{3/2}} . \tag{3.9.31}$$

Here, we have assumed that $\Theta_r \ll T \ll \Theta_v$, with $\Theta_r$ and $\Theta_v$ the characteristic temperatures for the rotational and vibrational degrees of freedom, Eqs. (5.1.11) and (5.1.17). Equation (3.9.30) is the law of mass action for the concentrations. The function $K(T, P)$ is also termed the mass action constant. The statement that $\prod_j c_j^{\nu_j}$ is a function of only $T$ and $P$ holds generally for ideal mixtures,

$$\mu_j(T, P, \{c_i\}) = \mu_j(T, P, c_j = 1, c_i = 0\ (i \neq j)) + kT\log c_j .$$


If, instead of the concentrations, we introduce the partial pressures (see remark (i) at the end of this section)

$$P_j = c_j P , \tag{3.9.32}$$

then we obtain

$$\prod_j P_j^{\nu_j} = K_P(T) \equiv e^{\sum_j \nu_j\left(\zeta_j - \frac{\varepsilon^0_{\mathrm{el},j}}{kT}\right)}\,(kT)^{\sum_j c_{P,j}\nu_j/k} , \tag{3.9.30'}$$

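the law of mass action of Guldberg and Waage¹⁵ for the partial pressures, with $K_P(T)$ independent of $P$. We now find e.g. for the hydrogen-oxygen reaction of Eq. (3.9.23)

$$2\mathrm{H}_2 + \mathrm{O}_2 - 2\mathrm{H}_2\mathrm{O} = 0 , \quad\text{with}\quad \nu_{\mathrm{H}_2} = 2 , \quad \nu_{\mathrm{O}_2} = 1 , \quad \nu_{\mathrm{H}_2\mathrm{O}} = -2 , \tag{3.9.33}$$

the relation

$$K(T, P) = \frac{[\mathrm{H}_2]^2[\mathrm{O}_2]}{[\mathrm{H}_2\mathrm{O}]^2} = \text{const.}\; e^{-q/kT}\,T^{\sum_j c_{P,j}\nu_j/k}\,P^{-1} . \tag{3.9.34}$$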

Here, the concentrations $c_j = [A_j]$ are represented by the corresponding chemical symbols in square brackets, and we have used

$$q = 2\varepsilon^0_{\mathrm{H}_2} + \varepsilon^0_{\mathrm{O}_2} - 2\varepsilon^0_{\mathrm{H}_2\mathrm{O}} > 0 ,$$

the heat of reaction at absolute zero, which is positive for the oxidation of hydrogen. The degree of dissociation $\alpha$ is defined in terms of the concentrations:

$$[\mathrm{H}_2\mathrm{O}] = 1 - \alpha , \qquad [\mathrm{O}_2] = \frac{\alpha}{2} , \qquad [\mathrm{H}_2] = \alpha .$$

It then follows from (3.9.34) that

$$\frac{\alpha^3}{2(1-\alpha)^2} \sim e^{-q/kT}\,T^{\sum_j c_{P,j}\nu_j/k}\,P^{-1} , \tag{3.9.35}$$

from which we can calculate $\alpha$; $\alpha$ decreases exponentially with falling temperature.

¹⁵ The law of mass action was stated by Guldberg and Waage in 1867 on the basis of statistical considerations of reaction probabilities, and was later proved thermodynamically for ideal gases by Gibbs, who made it more specific through the calculation of $K(T, P)$.
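Equation (3.9.35) determines $\alpha$ only implicitly. A minimal sketch of how it could be solved numerically is given below; it uses bisection, which works because the left-hand side $\alpha^3/[2(1-\alpha)^2]$ increases monotonically on $0 < \alpha < 1$. The function names are hypothetical, and the right-hand side of (3.9.35), including its unspecified constant, is simply passed in as a number `k_value`.

```python
def mass_action_lhs(alpha):
    """Left-hand side of (3.9.35): alpha^3 / (2 (1 - alpha)^2)."""
    return alpha ** 3 / (2.0 * (1.0 - alpha) ** 2)


def degree_of_dissociation(k_value, tol=1e-14, max_iter=200):
    """Solve mass_action_lhs(alpha) = k_value for 0 < alpha < 1 by bisection."""
    if k_value <= 0:
        raise ValueError("the mass action constant must be positive")
    lo, hi = 0.0, 1.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        # The left-hand side diverges at alpha = 1, so treat mid >= 1 as "too large".
        if mid >= 1.0 or mass_action_lhs(mid) > k_value:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Smaller K (e.g. lower temperature) gives a smaller degree of dissociation.
for k_value in (1e-9, 1e-6, 1e-3):
    print(k_value, degree_of_dissociation(k_value))
```

For $\alpha \ll 1$ the equation reduces to $\alpha \approx (2K)^{1/3}$, so the exponential factor $e^{-q/kT}$ in $K$ carries over to $\alpha$ as $e^{-q/3kT}$, in line with the statement that $\alpha$ decreases exponentially with falling temperature.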


The law of mass action makes important statements about the conditions under which the desired reactions can take place with optimum yields. It may be necessary to employ a catalyst in order to shorten the reaction time; however, what the equilibrium distribution of the reacting components will be is determined simply by the reaction equation and the chemical potentials of the constituents (components), and in the case of ideal gases by Eq. (3.9.30). The law of mass action has many applications in chemistry and technology. As just one example, we consider here the pressure dependence of the reaction equilibrium. From (3.9.30), it follows that the pressure derivative of $K(T, P)$ is given by

$$\frac{1}{K}\frac{\partial K}{\partial P} = \frac{\partial\log K}{\partial P} = -\frac{1}{P}\sum_i \nu_i , \tag{3.9.36a}$$

where $\nu = \sum_i \nu_i$ is the so-called molar excess. From the equation of state of mixtures of ideal gases (Eq. (5.2.3)), $PV = kT\sum_i N_i$, we obtain for the changes $\Delta V$ and $\Delta N_i$ which accompany a reaction at constant $T$ and $P$:

$$P\,\Delta V = kT\sum_i \Delta N_i . \tag{3.9.37a}$$

Let the number of individual reactions be $\Delta N$, i.e. $\Delta N_i = \nu_i\,\Delta N$; then it follows from (3.9.37a) that

$$-\frac{1}{P}\sum_i \nu_i = -\frac{\Delta V}{kT\,\Delta N} . \tag{3.9.37b}$$

Taking $\Delta N = L$ (the Loschmidt/Avogadro number), $\nu_i$ moles of each component will react, and it follows from (3.9.36a) and (3.9.37b) with the gas constant $R$ that

$$\frac{1}{K}\frac{\partial K}{\partial P} = -\frac{\Delta V}{RT} . \tag{3.9.36b}$$

Furthermore, $\Delta V = \sum_i \nu_i V_{\mathrm{mol}}$ is the volume change in the course of the reaction proceeding from right to left (for a reaction which is represented in the form (3.9.23)). (The value of the molar volume $V_{\mathrm{mol}}$ is the same for every ideal gas.) According to Eq. (3.9.36b) in connection with (3.9.30), a larger value of $K$ leads to an increase in the concentrations $c_j$ with positive $\nu_j$, i.e. of those substances which are on the left-hand side of the reaction equation. Therefore, from (3.9.36b), a pressure increase leads to a shift of the equilibrium towards the side of the reaction equation corresponding to the smaller volume. When $\Delta V = 0$, the position of the equilibrium depends only upon the temperature, e.g. in the hydrogen chloride reaction $\mathrm{H}_2 + \mathrm{Cl}_2 \rightleftharpoons 2\mathrm{HCl}$.


In a similar manner, one finds for the temperature dependence of $K(T, P)$ the result

$$\frac{\partial\log K}{\partial T} = \frac{\sum_i \nu_i h_i}{RT^2} = \frac{\Delta h}{RT^2} . \tag{3.9.38}$$

Here, $h_i$ is the molar enthalpy of the substance $i$ and $\Delta h$ is the change of the overall molar enthalpy when the reaction runs its course one time from right to left in the reaction equation, cf. problem 3.26. An interesting and technically important application is Haber’s synthesis of ammonia from nitrogen and hydrogen gas: the chemical reaction

$$\mathrm{N}_2 + 3\mathrm{H}_2 \rightleftharpoons 2\mathrm{NH}_3 \tag{3.9.39}$$

is characterized by $1\mathrm{N}_2 + 3\mathrm{H}_2 - 2\mathrm{NH}_3 \rightleftharpoons 0$ ($\nu = \sum_i \nu_i = 2$):

$$\frac{c_{\mathrm{N}_2}\,c_{\mathrm{H}_2}^3}{c_{\mathrm{NH}_3}^2} = K(T, P) = K_P(T)\,P^{-2} . \tag{3.9.40}$$

To obtain a high yield of NH₃, the pressure must be made as high as possible. Sommerfeld:¹⁶ “The extraordinary success with which this synthesis is now carried out in industry is due to the complete understanding of the conditions for thermodynamic equilibrium (Haber), to the mastery of the engineering problems connected with high pressure (Bosch), and, finally, to the successful selection of catalyzers which promote high reaction rates (Mittasch).”

Remarks:

(i) The partial pressures introduced in Eq. (3.9.32), $P_j = c_j P$, with $c_j = N_j/N$, in accord with the equation of state of a mixture of ideal gases, Eq. (5.2.3), obey the equations

$$V P_j = N_j kT \quad\text{and}\quad P = \sum_i P_i . \tag{3.9.41}$$

(This fact is known as Dalton’s Law: the non-interacting gases in the mixture produce partial pressures corresponding to their particle numbers, as if they each occupy the entire available volume.)

(ii) Frequently, the law of mass action is expressed in terms of the particle densities $\rho_i = N_i/V$:

$$\prod_i \rho_i^{\nu_i} = K_\rho(T) \equiv (kT)^{-\sum_i \nu_i}\,K_P(T) . \tag{3.9.30''}$$

¹⁶ A. Sommerfeld, Thermodynamics and Statistical Mechanics: Lectures on Theoretical Physics, Vol. V (Academic Press, New York, 1956), p. 86


(iii) Now we turn to the direction which a reaction will take. If a mixture is initially present with arbitrary densities, the direction in which the reaction will proceed can be read off the law of mass action. Let $\nu_1, \nu_2, \ldots, \nu_s$ be positive and $\nu_{s+1}, \nu_{s+2}, \ldots, \nu_n$ negative, so that the reaction equation (3.9.24) takes on the form

$$\sum_{i=1}^{s}\nu_i A_i \rightleftharpoons \sum_{i=s+1}^{n}|\nu_i| A_i . \tag{3.9.24'}$$

Assume that the product of the particle densities obeys the inequality

$$\prod_i \rho_i^{\nu_i} \equiv \frac{\prod_{i=1}^{s}\rho_i^{\nu_i}}{\prod_{i=s+1}^{n}\rho_i^{|\nu_i|}} < K_\rho(T) , \tag{3.9.42}$$

i.e. the system is not in chemical equilibrium. If the chemical reaction proceeds from right to left, the densities on the left will increase, and the fraction in the inequality will become larger. Therefore, in the case (3.9.42), the reaction will proceed from right to left. If, in contrast, the inequality was initially reversed, with a > sign, then the reaction would proceed from left to right.

(iv) All chemical reactions exhibit a heat of reaction, i.e. they are accompanied either by heat release (exothermic reactions) or by taking up of heat (endothermic reactions). We recall that for isobaric processes, $\Delta Q = \Delta H$, and the heat of reaction is equal to the change in the enthalpy; see the comment following Eq. (3.1.12). The temperature dependence of the reaction equilibrium follows from Eq. (3.9.38). A temperature increase at constant pressure shifts the equilibrium towards the side of the reaction equation where the enthalpy is higher; or, expressed differently, it leads to a reaction in the direction in which heat is taken up. As a rule, the electronic contribution $O(\mathrm{eV})$ dominates. Thus, at low temperatures, the enthalpy-rich side is practically not present.

∗3.9.4 Vapor-pressure Increase by Other Gases and by Surface Tension

3.9.4.1 The Evaporation of Water in Air

As discussed in detail in Sect. 3.8.1, a single-component system can evaporate only along its vapor-pressure curve $P_0(T)$, or, stated differently, only along the vapor-pressure curve are the gaseous and the liquid phases in equilibrium. If an additional gas is present, this means that there is one more degree of freedom in Gibbs’ phase rule, so that a liquid can coexist with its vapor even outside of $P_0(T)$. Here, we wish to investigate evaporation in the presence of additional gases and in particular that of water under an air atmosphere. To this end


we assume that the other gas is dissolved in the liquid phase to only a negligible extent. If the chemical potential of the liquid were independent of the pressure, then the other gas would have no influence at all on the chemical potential of the liquid; the partial pressure of the vapor would then have to be identical with the vapor pressure of the pure substance, a statement which is frequently made. In fact, the total pressure acts on the liquid, which changes its chemical potential. The resulting increase of the vapor pressure will be calculated here. To begin, we note that

$$\left(\frac{\partial\mu_L}{\partial P}\right)_T = \frac{V}{N} \tag{3.9.43}$$

is small, owing to the small specific volume $v_L = \frac{V}{N}$ of the liquid. When the pressure is changed by $\Delta P$, the chemical potential of the liquid changes according to

$$\mu_L(T, P + \Delta P) = \mu_L(T, P) + v_L\,\Delta P + O(\Delta P^2) . \tag{3.9.44}$$

From the Gibbs–Duhem relation, the chemical potential of the liquid is

$$\mu_L = e_L - T s_L + P v_L . \tag{3.9.45}$$

Here, $e_L$ and $s_L$ refer to the internal energy and the entropy per particle. When we can neglect the temperature and pressure dependence of $e_L$, $s_L$, and $v_L$, then (3.9.44) is valid with no further corrections. The chemical potential of the vapor, assuming an ideal mixture¹⁷, is

$$\mu_{\mathrm{vapor}}(T, P) = \mu_0(T) + kT\log cP , \tag{3.9.46}$$

where $c$ is the concentration of the vapor in the gas phase, $c = \frac{N_{\mathrm{vapor}}}{N_{\mathrm{other}} + N_{\mathrm{vapor}}}$. The vapor-pressure curve $P_0(T)$ without additional gases follows from

$$\mu_L(T, P_0) = \mu_0(T) + kT\log P_0 . \tag{3.9.47}$$

¹⁷ See Sect. 5.2

With an additional gas, the pressure is composed of the pressure of the other gas $P_{\mathrm{other}}$ and the partial pressure of the vapor, $P_{\mathrm{vapor}} = cP$; all together, $P = P_{\mathrm{other}} + P_{\mathrm{vapor}}$. Then the equality of the chemical potentials in the liquid and the gaseous phases is expressed by

$$\mu_L(T, P_{\mathrm{other}} + P_{\mathrm{vapor}}) = \mu_0(T) + kT\log P_{\mathrm{vapor}} .$$

Subtracting (3.9.47) from this, we find


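$$\begin{aligned} \mu_L(T, P_{\mathrm{other}} + P_{\mathrm{vapor}}) - \mu_L(T, P_0) &= kT\log\left(\frac{P_{\mathrm{vapor}} - P_0}{P_0} + 1\right) \\ v_L\,(P_{\mathrm{other}} + P_{\mathrm{vapor}} - P_0) &\approx kT\,\frac{P_{\mathrm{vapor}} - P_0}{P_0} \\ v_L P_{\mathrm{other}} &= \left(\frac{kT}{P_0} - v_L\right)(P_{\mathrm{vapor}} - P_0) \\ P_{\mathrm{vapor}} - P_0 &= \frac{v_L P_{\mathrm{other}}}{v_G - v_L} = \frac{v_L}{v_G - v_L}\,(P - P_{\mathrm{vapor}}) . \end{aligned} \tag{3.9.48}$$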

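Here $v_G = kT/P_0$ is the specific volume of the vapor, treated as an ideal gas at the pressure $P_0$. From the second term in the last line of Eq. (3.9.48), it follows that the increase in vapor pressure is given approximately by $P_{\mathrm{vapor}} - P_0(T) \approx \frac{v_L}{v_G} P_{\mathrm{other}}$, and the exact expression is found to be

$$P_{\mathrm{vapor}} = P_0(T) + \frac{v_L}{v_G}\,(P - P_0(T)) . \tag{3.9.49}$$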

The partial pressure of the vapor is increased relative to the vapor-pressure curve by $\frac{v_L}{v_G} \times (P - P_0(T))$. Due to the smallness of the factor $\frac{v_L}{v_G}$, the partial pressure is still to a good approximation the same as the vapor pressure at the temperature $T$. The most important result of these considerations is the following: a liquid under the pressure $P$ at the temperature $T$ is in equilibrium with its pure vapor phase only for $P = P_0(T)$, and for $P > P_0(T)$ (or at temperatures below its boiling point) it exists only in liquid form; nevertheless, in this region of $(P, T)$ it is also in equilibrium with its vapor when the vapor is dissolved in another gas. We now discuss the evaporation of water or the sublimation of ice under an atmosphere of air, see Fig. 3.39. The atmosphere predetermines a particular pressure $P$. At each temperature $T$ below the evaporation temperature determined by this pressure ($P > P_0(T)$), just enough water evaporates to make its partial pressure equal that given by (3.9.49) (recall $P_{\mathrm{vapor}} = cP$). The concentration of the water vapor is $c = \left(P_0(T) + \frac{v_L}{v_G}(P - P_0(T))\right)/P$. In a free air atmosphere, the water vapor is transported away by diffusion or by convection (wind), and more and more water must evaporate (vaporize).¹⁸ On

Fig. 3.39. The vapor pressure Pvapor lies above the vapor-pressure curve P0 (T ) (dot-dashed curve) 18

As already mentioned, the above considerations are also applicable to sublimation. When one cools water at 1 atm below 0◦ C, it freezes to ice. This ice at

3.9 Equilibrium in Multicomponent Systems

159

increasing the temperature, the partial pressure of the water increases, until ﬁnally it is equal to P . The vaporization which then results is called boiling. For P = P0 (T ), the liquid is in equilibrium with its pure vapor. Evaporation then occurs not only at the liquid surface, but also within the liquid, in particular at the walls of its container. There, bubbles of vapor are formed, which then rise to the surface. Within these vapor bubbles, the vapor pressure is P0 (T ), corresponding to the temperature T . Since the vapor bubbles within the liquid are also subject to the hydrostatic pressure of the liquid, their temperature must in fact be somewhat higher than the boiling point under atmospheric pressure. If the liquid contains nucleation centers (such as the fat globules in milk), at which vapor bubbles can form more readily than in the pure liquid, then it will “boil over”. The increase in the vapor pressure by increased external pressure, or as one might say, by ‘pressing on it’, may seem surprising. The additional pressure causes an increase in the release of molecules from the liquid, i.e. an increase in the partial pressure.

3.9.4.2 Vapor-Pressure Increase by Surface Tension of Droplets A further additional pressure is due to the surface tension and plays a role in the evaporation of liquid droplets. We consider a liquid droplet of radius r. When the radius is increased isothermally by an amount dr, the surface area increases by 8πr dr, which leads to an energy increase of σ8πr dr, where σ is the surface tension. Owing to the pressure diﬀerence p between the pressure within the droplet and the pressure of the surrounding atmosphere, there is a force p 4πr2 which acts outwardly on the surface. The total change of the free energy is therefore dF = δA = σ8πr dr − p 4πr2 dr .

(3.9.50)

In equilibrium, the free energy of the droplet must be stationary, so that for the pressure diﬀerence we ﬁnd the following dependence on the radius: p=

2σ . r

(3.9.51)

Thus, small droplets have a higher vapor pressure than larger one. The vaporpressure increase due to the surface tension is from Eq. (3.9.48) now seen to be Pvapor − P0 (T ) =

2σ vL r vG − vL

(3.9.52)

inversely proportional to the radius of the droplet. In a mixture of small and large droplets, the smaller ones are therefore consumed by the larger ones. e.g. −10◦ C is to be sure as a single-component system not in equilibrium with the gas phase, but rather with the water vapor in the atmosphere at a partial pressure of about P0 (−10◦ C), where P0 (T ) represents the sublimation curve. For this reason, frozen laundry dries, because ice sublimes in the atmosphere.


Remarks:

(i) Small droplets evaporate more readily than liquids with a flat surface, and conversely condensation occurs less easily on small droplets. This is the reason why extended solid cooled surfaces promote the condensation of water vapor more readily than small droplets do. The temperature at which the condensation of water from the atmosphere onto extended surfaces (dew formation) takes place is called the dew point. It depends on the partial pressure of water vapor in the air, i.e. its degree of saturation, and can be used to determine the amount of moisture in the air.

(ii) We consider the homogeneous condensation of a gas in free space without surfaces. The temperature of the gas is taken to be $T$ and the vapor pressure at this temperature to be $P_0(T)$. We assume that the pressure $P$ of the gas is greater than the vapor pressure; it is then referred to as supersaturated vapor. For each degree of supersaturation, then, a critical radius can be defined from (3.9.52):

$$r_{\rm cr} = \frac{2\sigma\,v_L}{v_G\,(P - P_0(T))} .$$

For droplets whose radius is smaller than $r_{\rm cr}$ the vapor is not supersaturated. Condensation can therefore not take place through the formation of very small droplets, since their vapor pressures would be higher than $P$. Some critical droplets must be formed through fluctuations in order that condensation can be initiated. Condensation is favored by additional attractive forces; for example, in the air, there are always electrically-charged dust particles and other impurities present, which as a result of their electrical forces promote condensation, i.e. they act as nucleation centers for condensation.
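The orders of magnitude involved in Eqs. (3.9.51), (3.9.52) and in the critical radius can be illustrated with a short sketch. The material constants below (surface tension, molar volume and saturation pressure of water at 20 °C) are assumed illustrative values, not data from the text, and the vapor is treated as an ideal gas.

```python
# Illustrative numbers for Eqs. (3.9.51), (3.9.52) and the critical radius.
# Assumed inputs (not from the text): water at 20 °C.
R = 8.314462618    # gas constant, J/(mol K)
T = 293.15         # temperature, K
SIGMA = 0.0728     # assumed surface tension of water, N/m
V_L = 18.0e-6      # assumed molar volume of liquid water, m^3/mol
P0 = 2339.0        # assumed saturation vapor pressure P0(T), Pa
V_G = R * T / P0   # molar volume of the vapor (ideal gas), m^3/mol


def laplace_pressure(radius):
    """Pressure difference p = 2 sigma / r across the droplet surface, Eq. (3.9.51)."""
    return 2.0 * SIGMA / radius


def vapor_pressure_increase(radius):
    """P_vapor - P0 for a droplet of the given radius, Eq. (3.9.52)."""
    return laplace_pressure(radius) * V_L / (V_G - V_L)


def critical_radius(pressure):
    """Critical droplet radius for a supersaturated vapor at pressure P > P0."""
    return 2.0 * SIGMA * V_L / (V_G * (pressure - P0))


for r in (1e-6, 1e-7, 1e-8):
    print(f"r = {r:.0e} m: p = {laplace_pressure(r):.3e} Pa, "
          f"P_vapor - P0 = {vapor_pressure_increase(r):.3e} Pa")

r_cr = critical_radius(1.01 * P0)  # 1 % supersaturation
print(f"critical radius at 1 % supersaturation: {r_cr:.3e} m")
```

Under these assumptions a supersaturation of one percent corresponds to a critical radius of roughly a tenth of a micrometer, which indicates why fluctuations alone rarely produce critical droplets at weak supersaturation.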

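Problems for Chapter 3

3.1 Read off the partial derivatives of the internal energy $E$ with respect to its natural variables from Eq. (3.1.3).

3.2 Show that

$$\delta g = \alpha\,dx + \beta\,\frac{x}{y}\,dy$$

is not an exact differential: a) using the integrability conditions and b) by integration from $P_1$ to $P_2$ along the paths $C_1$ and $C_2$. Show that $1/x$ is an integrating factor, $df = \delta g/x$.

3.3 Prove the chain rule (3.2.13) for Jacobians.

3.4 Derive the following relations:

$$\frac{C_P}{C_V} = \frac{\kappa_T}{\kappa_S} , \qquad \left(\frac{\partial T}{\partial V}\right)_S = -\frac{T}{C_V}\left(\frac{\partial P}{\partial T}\right)_V \qquad \text{and} \qquad \left(\frac{\partial T}{\partial P}\right)_S = \frac{T}{C_P}\left(\frac{\partial V}{\partial T}\right)_P .$$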


Fig. 3.40. Paths in the x-y diagram

3.5 Determine the work performed by an ideal gas, $W(V) = \int_{V_1}^{V} dV\,P$, during a reversible adiabatic expansion. From $\delta Q = 0$, it follows that $dE = -P\,dV$, and from this the adiabatic equations for an ideal gas can be obtained: $T = T_1\,(V_1/V)^{2/3}$ and $P = NkT_1\,V_1^{2/3}/V^{5/3}$. They can be used to determine the work performed.

3.6 Show that the stability conditions (3.6.48a,b) follow from the maximization of the entropy.

3.7 One liter of an ideal gas expands reversibly and isothermally at 20 °C from an initial pressure of 20 atm to 1 atm. How large is the work performed in joules? What quantity of heat $Q$ in calories must be transferred to the gas?

3.8 Show that the ratio of the entropy increase on heating of an ideal gas from T1 to T2 at constant pressure to that at constant volume is given by the ratio of the speciﬁc heats.

3.9 A thermally insulated system is supposed to consist of 2 subsystems $(T_A, V_A, P)$ and $(T_B, V_B, P)$, which are separated by a movable, diathermal piston (Fig. 3.41(a)). The gases are ideal. (a) Calculate the entropy change accompanying equalization of the temperatures (irreversible process). (b) Calculate the work performed in a quasistatic temperature equalization; cf. Fig. 3.41(b).

(a)

(b)

Fig. 3.41. For problem 3.9

3.10 Calculate the work obtained, $W = \oint P\,dV$, in a Carnot cycle using an ideal gas, by evaluating the ring integral.

3.11 Compare the cooling eﬃciency of a Carnot cycle between the temperatures T1 and T2 with that of two Carnot cycles operating between T1 and T3 and between T3 and T2 (T1 < T3 < T2 ). Show that it is more favorable to decompose a cooling process into several smaller steps.


3.12 Discuss a Carnot cycle in which the working 'substance' is thermal radiation. For this case, the following relations hold: $E = \sigma V T^4$, $pV = \frac{1}{3}E$, $\sigma > 0$. (a) Derive the adiabatic equation. (b) Compute $C_V$ and $C_P$.

3.13 Calculate the efficiency of the Joule cycle (see Fig. 3.42). Result: $\eta = 1 - (P_2/P_1)^{(\kappa-1)/\kappa}$. Compare this efficiency with that of the Carnot cycle (drawn in dashed lines), using an ideal gas as working substance.

Fig. 3.42. The Joule cycle

3.14 Calculate the efficiency of the Diesel cycle (Fig. 3.43). Result:

$$\eta = 1 - \frac{1}{\kappa}\,\frac{(V_2/V_1)^{\kappa} - (V_3/V_1)^{\kappa}}{(V_2/V_1) - (V_3/V_1)} .$$

Fig. 3.43. The Diesel cycle

3.15 Calculate for an ideal gas the change in the internal energy, the work performed, and the quantity of heat transferred for the quasistatic processes along the following paths from 1 to 2 (see Fig. 3.44) (a) 1-A-2 (b) 1-B-2 (c) 1-C-2. What is the shape of the E(P, V ) surface?

Fig. 3.44. For problem 3.15


3.16 Consider the so-called Stirling cycle, where a heat engine (with an ideal gas as working substance) performs work according to the following quasistatic cycle: (a) isothermal expansion at the temperature $T_1$ from a volume $V_1$ to a volume $V_2$; (b) cooling at constant volume $V_2$ from $T_1$ to $T_2$; (c) isothermal compression at the temperature $T_2$ from $V_2$ to $V_1$; (d) heating at constant volume from $T_2$ to $T_1$. Determine the thermal efficiency $\eta$ of this process!

3.17 The ratio of the specific volume of water to that of ice is 1.000:1.091 at 0 °C and 1 atm. The heat of melting is 80 cal/g. Calculate the slope of the melting curve.

3.18 Integrate the Clausius–Clapeyron differential equation for the liquid-gas transition, making the simplifying assumptions that the heat of transition is constant, that $V_{\rm liquid}$ can be neglected in comparison to $V_{\rm gas}$, and that the equation of state for ideal gases is applicable to the gas phase.

3.19 Consider the neighborhood of the triple point in a region where the limiting curves can be approximated as straight lines. Show that $\alpha < \pi$ holds (see Fig. 3.45). Hint: Use $dP/dT = \Delta S/\Delta V$, and the fact that the slope of line 2 is greater than that of line 3.

Fig. 3.45. The vicinity of a triple point

3.20 The latent heat of ice per unit mass is $Q_L$. A container holds a mixture of water and ice at the freezing point (absolute temperature $T_0$). An additional amount of the water in the container (of mass $m$) is to be frozen using a cooling apparatus. The heat output from the cooling apparatus is used to heat a body of heat capacity $C$ and initial temperature $T_0$. What is the minimum quantity of heat energy transferred from the apparatus to the body? (Assume $C$ to be temperature independent.)

3.21 (a) Discuss the pressure dependence of the reaction ${\rm N_2 + 3H_2 \rightleftharpoons 2NH_3}$ (ammonia synthesis). At what pressure is the yield of ammonia greatest? (b) Discuss the thermal dissociation ${\rm 2H_2O \rightleftharpoons 2H_2 + O_2}$. Show that an increase in pressure works against the dissociation.

3.22 Give the details of the derivation of Eqs. (3.9.36a) and (3.9.36b).

3.23 Discuss the pressure and temperature dependence of the reaction ${\rm CO + 3H_2 \rightleftharpoons CH_4 + H_2O}$.

3.24 Apply the law of mass action to the reaction ${\rm H_2 + Cl_2 \rightleftharpoons 2HCl}$.


3.25 Derive the law of mass action for the particle densities $\rho_j = N_j/V$ (Eq. (3.9.30′)).

3.26 Prove Eq. (3.9.38) for the temperature dependence of the mass-action constant. Hint: Show that

$$H = G - T\,\frac{\partial G}{\partial T} = -T^2\,\frac{\partial}{\partial T}\left(\frac{G}{T}\right)$$

and express the change in the free enthalpy

$$\Delta G = \sum_i \mu_i \nu_i$$

using Eq. (3.9.28), then insert the law of mass action (3.9.30) or (3.9.30′).

3.27 The Pomeranchuk eﬀect. The entropy diagram for solid and liquid He3 has the shape shown below 3 K. Note that the speciﬁc volumes of both phases do not change within this temperature range. Draw P (T ) for the coexistence curves of the phases.

Fig. 3.46. The Pomeranchuk eﬀect

3.28 The (speciﬁc) free energies fα and fβ of two solid phases α and β with a miscibility gap and the (speciﬁc) free energy fL of the liquid mixture are shown as functions of the concentration c in Fig. 3.47. Discuss the meaning of the dashed and solid double tangents. On lowering the temperature, the free energy of the liquid phase is increased, i.e. fL is shifted upwards relative to the two ﬁxed branches of the free energy. Derive from this the shape of the eutectic phase diagram.

Fig. 3.47. Liquid mixture


3.29 A typical shape for the phase diagram of liquid and gaseous mixtures is shown in Fig. 3.48. The components A and B are completely miscible in both the gas phase and the liquid phase. B has a higher boiling point than A. At a temperature in the interval TA < T < TB , the gas phase is therefore richer in A than the liquid phase. Discuss the boiling process for the initial concentration c0 (a) in the case that the liquid remains in contact with the gas phase: show that vaporization takes place in the temperature interval T0 to Te . (b) in the case that the vapor is pumped oﬀ: show that the vaporization takes place in the interval T0 to TB .

Fig. 3.48. Bubble point and dew point lines

Remark: The boiling curve (evaporation limit) and the condensation curve together form the bubble point and dew point lines, a lens-shaped closed curve. Its shape is of decisive importance for the efficiency of distillation processes. This 'boiling lens' can also take on much more complex shapes than in Fig. 3.48, such as that shown in Fig. 3.49. A mixture with the concentration $c_a$ is called azeotropic. For this concentration, the evaporation of the mixture occurs exactly at the temperature $T_a$ and not in a temperature interval. The eutectic concentration is also special in this sense. Such a point occurs in an alcohol-water mixture at 96%, which limits the distillation of alcohol.¹⁹

Fig. 3.49. Bubble point and dew point lines

¹⁹ Detailed information about phase diagrams of mixtures can be found in M. Hansen, Constitution of Binary Alloys, McGraw Hill, 1983 and its supplements. Further detailed discussions of the shape of phase diagrams are to be found in L. D. Landau and E. M. Lifshitz, Course of Theoretical Physics, Vol. V, Statistical Physics, Pergamon Press 1980.


3.30 The free energy of the liquid phase, fL , is drawn in Fig. (3.50) as a function of the concentration, as well as that of the gas phase, fG . It is assumed that fL is temperature independent and fG shifts upwards with decreasing temperature (Fig. 3.50). Explain the occurrence of the ‘boiling lens’ in problem 3.29.

Fig. 3.50. Free energy

3.31 Consider the production of electron-positron pairs, $e^+ + e^- \rightleftharpoons \gamma$. Assume for simplicity that the chemical potential of the electrons and positrons is given in the nonrelativistic limit, taking the rest energy into account, by $\mu = mc^2 + kT \log\frac{\lambda^3 N}{V}$. Show that for the particle number densities $n_\pm$ of $e^\pm$ the relation

$$n_+ n_- = \lambda^{-6}\, e^{-2mc^2/kT}$$

holds and discuss the consequences.

3.32 Consider the boiling and condensation curves of a two-component liquid mixture. Take the concentrations in the gaseous and liquid phases to be $c_G$ and $c_L$. Show that at the points where $c_G = c_L$ (the azeotropic mixture), i.e. where the boiling and condensation curves come together, for a fixed pressure $P$ the following relation holds:

$$\frac{dT}{dc} = 0 ,$$

and for fixed $T$

$$\frac{dP}{dc} = 0 ,$$

thus the slopes are horizontal. Method: Start from the differential Gibbs–Duhem relations for the gas and the liquid phases along the limiting curves.


3.33 Determine the temperature of the atmosphere as a function of altitude. How much does the temperature decrease per km of altitude? Compare your result for the pressure $P(z)$ with the barometric formula (see problem 2.15). Method: Start with the force balance on a small volume of air. That gives

$$\frac{dP(z)}{dz} = -mg\,\frac{P(z)}{k\,T(z)} .$$

Assume that the temperature changes depend on the pressure changes of the air (ideal gas) adiabatically, $\frac{dT(z)}{T(z)} = \frac{\gamma - 1}{\gamma}\,\frac{dP(z)}{P(z)}$. From this, one finds $\frac{dT(z)}{dz}$. Numerical values: $m$ = 29 g/mole, $\gamma$ = 1.41.

3.34 In meteorology, the concept of a “homogeneous atmosphere” is used, where ρ is taken to be constant. Determine the pressure and the temperature in such an atmosphere as functions of the altitude. Calculate the entropy of the homogeneous atmosphere and compare it with that of an isothermal atmosphere with the same energy content. Could such a homogeneous atmosphere be stable?

4. Ideal Quantum Gases

In this chapter, we want to derive the thermodynamic properties of ideal quantum gases, i.e. non-interacting particles, on the basis of quantum statistics. This includes nonrelativistic fermions and bosons whose interactions may be neglected, quasiparticles in condensed matter, and relativistic quanta, in particular photons.

4.1 The Grand Potential

The calculation of the grand potential is found to be the most expedient way to proceed. In order to have a concrete system in mind, we start from the Hamiltonian for $N$ non-interacting, nonrelativistic particles,

$$H = \sum_{i=1}^{N} \frac{1}{2m}\,\mathbf{p}_i^2 . \qquad (4.1.1)$$

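We assume the particles to be enclosed in a cube of edge length $L$ and volume $V = L^3$, and apply periodic boundary conditions. The single-particle eigenfunctions of the Hamiltonian are then the momentum eigenstates $|\mathbf{p}\rangle$ and are given in real space by

$$\varphi_{\mathbf{p}}(\mathbf{x}) = \langle \mathbf{x}|\mathbf{p}\rangle = \frac{1}{\sqrt{V}}\,e^{i\mathbf{p}\cdot\mathbf{x}/\hbar} , \qquad (4.1.2a)$$

where the momentum quantum numbers can take on the values

$$\mathbf{p} = \frac{2\pi\hbar}{L}\,(\nu_1, \nu_2, \nu_3) , \qquad \nu_\alpha = 0, \pm 1, \ldots , \qquad (4.1.2b)$$

and the single-particle kinetic energy is given by

$$\varepsilon_{\mathbf{p}} = \frac{\mathbf{p}^2}{2m} . \qquad (4.1.2c)$$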

For the complete characterization of the single-particle states, we must still take the spin $s$ into account. It is integral for bosons and half-integral for fermions. The quantum number $m_s$ for the z-component of the spin has $2s+1$ possible values. We combine the two quantum numbers into one symbol, $p \equiv (\mathbf{p}, m_s)$, and find for the complete energy eigenstates

$$|p\rangle \equiv |\mathbf{p}\rangle\,|m_s\rangle . \qquad (4.1.2d)$$

In the treatment which follows, we could start from arbitrary noninteracting Hamiltonians, which can also contain a potential and can depend on the spin, as is the case for electrons in a magnetic field. We then still denote the single-particle quantum numbers by $p$ and the eigenvalue belonging to the energy eigenstate $|p\rangle$ by $\varepsilon_p$, but it need no longer be the same as (4.1.2c). These states form the basis of the $N$-particle states for bosons and fermions:

$$|p_1, p_2, \ldots, p_N\rangle = \mathcal{N} \sum_P (\pm 1)^P\, P\, |p_1\rangle \ldots |p_N\rangle . \qquad (4.1.3)$$

Here, the sum runs over all the permutations $P$ of the numbers 1 to $N$. The upper sign holds for bosons, $(+1)^P = 1$, the lower sign for fermions. $(-1)^P$ is equal to 1 for even permutations and −1 for odd permutations. The bosonic states are completely symmetric, the fermionic states are completely antisymmetric. As a result of the symmetrization operation, the state (4.1.3) is completely characterized by its occupation numbers $n_p$, which indicate how many of the $N$ particles are in the state $|p\rangle$. For bosons, $n_p = 0, 1, 2, \ldots$ can assume all integer values from 0 to ∞. These particles are said to obey Bose–Einstein statistics. For fermions, each single-particle state can be occupied at most once, $n_p = 0, 1$ (identical quantum numbers would yield zero due to the antisymmetrization on the right-hand side of (4.1.3)). Such particles are said to obey Fermi–Dirac statistics. The normalization factor in (4.1.3) is $\mathcal{N} = 1/\sqrt{N!}$ for fermions and $\mathcal{N} = (N!\, n_{p_1}!\, n_{p_2}! \ldots)^{-1/2}$ for bosons.¹ For an $N$-particle state, the sum of all the $n_p$ obeys

$$N = \sum_p n_p , \qquad (4.1.4)$$

and the energy eigenvalue of this $N$-particle state is

$$E(\{n_p\}) = \sum_p n_p\,\varepsilon_p . \qquad (4.1.5)$$

¹ Note: for bosons, the state (4.1.3) can also be written in the form $(N!/n_{p_1}!\, n_{p_2}! \ldots)^{-1/2} \sum_{P'} P'\, |p_1\rangle \ldots |p_N\rangle$, where the sum includes only those permutations $P'$ which lead to different terms.

We can now readily calculate the grand partition function (Sect. 2.7.2):

$$\begin{aligned}
Z_G &\equiv \sum_{N=0}^{\infty}\ \sum_{\substack{\{n_p\} \\ \sum_p n_p = N}} e^{-\beta(E(\{n_p\}) - \mu N)} = \sum_{\{n_p\}} e^{-\beta \sum_p (\varepsilon_p - \mu) n_p} \\
&= \prod_p \sum_{n_p} e^{-\beta(\varepsilon_p - \mu) n_p} =
\begin{cases}
\displaystyle\prod_p \frac{1}{1 - e^{-\beta(\varepsilon_p - \mu)}} & \text{for bosons} \\[2ex]
\displaystyle\prod_p \left(1 + e^{-\beta(\varepsilon_p - \mu)}\right) & \text{for fermions.}
\end{cases}
\end{aligned} \qquad (4.1.6)$$

We give here some explanations relevant to (4.1.6). Here, $\sum_{\{n_p\}} \ldots \equiv \prod_p \sum_{n_p} \ldots$ refers to the multiple sum over all occupation numbers, whereby each occupation number $n_p$ takes on the allowed values (0, 1 for fermions and 0, 1, 2, ... for bosons). In this expression, $p \equiv (\mathbf{p}, m_s)$ runs over all values of $\mathbf{p}$ and $m_s$. The calculation of the grand partition function requires that one first sum over all the states allowed by a particular value of the particle number $N$, and then over all particle numbers, $N = 0, 1, 2, \ldots$. In the definition of $Z_G$, $\sum_{\{n_p\}}$ therefore enters with the constraint $\sum_p n_p = N$. Since however in the end we must sum over all $N$, the expression after the second equals sign is obtained; in it, the sum runs over all $n_p$ independently of one another. Here, we see that it is most straightforward to calculate the grand partition function as compared to the other ensembles. For bosons, a product of geometric series is obtained in (4.1.6); the condition for their convergence requires that $\mu < \varepsilon_p$ for all $p$. The grand potential follows from (4.1.6):

$$\Phi = -\beta^{-1} \log Z_G = \pm\beta^{-1} \sum_p \log\left(1 \mp e^{-\beta(\varepsilon_p - \mu)}\right) , \qquad (4.1.7)$$

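from which we can derive all the thermodynamic quantities of interest. Here, and in what follows, the upper (lower) signs refer to bosons (fermions). For the average particle number, we therefore find

$$N \equiv -\left(\frac{\partial \Phi}{\partial \mu}\right)_{\beta} = \sum_p n(\varepsilon_p) , \qquad (4.1.8)$$

where we have introduced

$$n(\varepsilon_p) \equiv \frac{1}{e^{\beta(\varepsilon_p - \mu)} \mp 1} ; \qquad (4.1.9)$$

these are also referred to as the Bose or the Fermi distribution functions. We now wish to show that $n(\varepsilon_q)$ is the average occupation number of the state $|q\rangle$. To this end, we calculate the average value of $n_q$:

$$\begin{aligned}
\langle n_q \rangle = {\rm Tr}(\rho_G\, n_q)
&= \frac{\sum_{\{n_p\}} e^{-\beta \sum_p n_p (\varepsilon_p - \mu)}\, n_q}{\sum_{\{n_p\}} e^{-\beta \sum_p n_p (\varepsilon_p - \mu)}}
= \frac{\sum_{n_q} n_q\, e^{-\beta n_q (\varepsilon_q - \mu)}}{\sum_{n_q} e^{-\beta n_q (\varepsilon_q - \mu)}} \\
&= -\frac{\partial}{\partial x} \log \sum_n e^{-xn}\,\Big|_{x = \beta(\varepsilon_q - \mu)} = n(\varepsilon_q) ,
\end{aligned}$$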


which demonstrates the correctness of our assertion. We now return to the calculation of the thermodynamic quantities. For the internal energy, we ﬁnd from (4.1.7)

$$E = \left(\frac{\partial(\Phi\beta)}{\partial\beta}\right)_{\beta\mu} = \sum_p \varepsilon_p\, n(\varepsilon_p) , \qquad (4.1.10)$$

where in taking the derivative, the product $\beta\mu$ is held constant.

Remarks:

(i) In order to ensure that $n(\varepsilon_p) \ge 0$ for every value of $p$, for bosons we require that $\mu < 0$, and for an arbitrary energy spectrum, that $\mu < \min(\varepsilon_p)$.

(ii) For $e^{-\beta(\varepsilon_p - \mu)} \ll 1$ and $s = 0$, we obtain from (4.1.7)

$$\Phi = -\beta^{-1} \sum_p e^{-\beta(\varepsilon_p - \mu)} = -\frac{z}{\beta}\,\frac{V}{(2\pi\hbar)^3} \int d^3p\; e^{-\beta p^2/2m} = -\frac{zV}{\beta\lambda^3} , \qquad (4.1.11)$$

which is identical to the grand potential of a classical ideal gas, Eq. (2.7.23). Here, the dispersion relation $\varepsilon_{\mathbf{p}} = \mathbf{p}^2/2m$ from Eq. (4.1.2c) was used for the right-hand side of (4.1.11). In

$$z = e^{\beta\mu} , \qquad (4.1.12)$$

we have introduced the fugacity, and $\lambda = h/\sqrt{2\pi m kT}$ (Eq. (2.7.20)) denotes the thermal wavelength. For $s \neq 0$, an additional factor of $(2s + 1)$ would occur after the second and third equals signs in Eq. (4.1.11).

(iii) The calculation of the grand partition function becomes even simpler if we make use of the second-quantization formalism,

$$Z_G = {\rm Tr}\, \exp\left(-\beta(H - \mu\hat{N})\right) , \qquad (4.1.13a)$$

where the Hamiltonian and the particle number operator in second quantization² have the form

$$H = \sum_p \varepsilon_p\, a_p^\dagger a_p \qquad (4.1.13b)$$

and

$$\hat{N} = \sum_p a_p^\dagger a_p . \qquad (4.1.13c)$$

It then follows that

$$Z_G = {\rm Tr} \prod_p e^{-\beta(\varepsilon_p - \mu)\, a_p^\dagger a_p} = \prod_p \sum_{n_p} e^{-\beta(\varepsilon_p - \mu) n_p} \qquad (4.1.13d)$$

and thus we once again obtain (4.1.6).

² See e.g. F. Schwabl, Advanced Quantum Mechanics, 3rd ed. (QM II), Springer, 2005, Chapter 1.
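The product form (4.1.6) and the distribution function (4.1.9) can be checked by brute force for a small system. The sketch below sums directly over all sets of occupation numbers for a hypothetical three-level spectrum; the level energies, $\beta$ and $\mu$ are arbitrary test values, and for bosons the occupation numbers are truncated at a finite maximum, which is an approximation.

```python
import itertools
import math

# Hypothetical toy spectrum (arbitrary units); not taken from the text.
LEVELS = [0.0, 0.5, 1.3]
BETA = 1.2
MU = -1.0          # mu < min(LEVELS), as required for bosons


def brute_force(levels, beta, mu, max_occupation):
    """Sum over all occupation-number sets {n_p}, as in Eq. (4.1.6).

    Returns Z_G and the list of average occupation numbers <n_q>.
    """
    z_g = 0.0
    weighted = [0.0] * len(levels)
    for occ in itertools.product(range(max_occupation + 1), repeat=len(levels)):
        weight = math.exp(-beta * sum(n * (e - mu) for n, e in zip(occ, levels)))
        z_g += weight
        for q, n in enumerate(occ):
            weighted[q] += n * weight
    return z_g, [w / z_g for w in weighted]


def closed_form(levels, beta, mu, fermions):
    """Product form of Eq. (4.1.6) and the distribution function, Eq. (4.1.9)."""
    sign = 1.0 if fermions else -1.0
    z_g = 1.0
    for e in levels:
        x = math.exp(-beta * (e - mu))
        z_g *= (1.0 + x) if fermions else 1.0 / (1.0 - x)
    return z_g, [1.0 / (math.exp(beta * (e - mu)) + sign) for e in levels]


# Fermions: n_p = 0, 1 (exact). Bosons: n_p = 0..30 (truncated geometric series).
for name, fermions, n_max in (("fermions", True, 1), ("bosons", False, 30)):
    z_bf, n_bf = brute_force(LEVELS, BETA, MU, n_max)
    z_cf, n_cf = closed_form(LEVELS, BETA, MU, fermions)
    print(name, z_bf, z_cf)
    print("  <n_q> direct sum :", n_bf)
    print("  <n_q> Eq. (4.1.9):", n_cf)
```

The direct sum grows exponentially with the number of levels, whereas the factorized form costs one term per level; this is the practical content of the remark that the grand partition function is the most straightforward one to calculate.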


According to Eq. (4.1.2b) we may associate with each of the discrete $\mathbf{p}$ values a volume element of size $\Delta = (2\pi\hbar/L)^3$. Hence, sums over $\mathbf{p}$ may be replaced by integrals in the limit of large $V$. For the Hamiltonian of free particles (4.1.1), this implies in (4.1.7) and (4.1.8)

$$\sum_p \ldots = g \sum_{\mathbf{p}} \ldots = g\,\frac{1}{\Delta} \sum_{\mathbf{p}} \Delta \ldots = g\,\frac{V}{(2\pi\hbar)^3} \int d^3p \ldots \qquad (4.1.14a)$$

with the degeneracy factor

$$g = 2s + 1 , \qquad (4.1.14b)$$

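as a result of the spin-independence of the single-particle energy $\varepsilon_p$. For the average particle number, we then find from (4.1.8)³

$$N = \frac{gV}{(2\pi\hbar)^3} \int d^3p\; n(\varepsilon_{\mathbf{p}}) = \frac{gV}{2\pi^2\hbar^3} \int_0^\infty dp\; p^2\, n(\varepsilon_{\mathbf{p}}) = \frac{gV m^{3/2}}{2^{1/2}\pi^2\hbar^3} \int_0^\infty \frac{d\varepsilon\,\sqrt{\varepsilon}}{e^{\beta(\varepsilon - \mu)} \mp 1} , \qquad (4.1.15)$$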

where we have introduced $\varepsilon = p^2/2m$ as integration variable. We also define the specific volume

$$v = V/N \qquad (4.1.16)$$

and substitute $x = \beta\varepsilon$, finally obtaining from (4.1.15)

$$\frac{1}{v} = \frac{1}{\lambda^3}\,\frac{2g}{\sqrt{\pi}} \int_0^\infty dx\; \frac{x^{1/2}}{e^{x} z^{-1} \mp 1} = \frac{g}{\lambda^3}
\begin{cases}
g_{3/2}(z) & \text{for bosons} \\
f_{3/2}(z) & \text{for fermions.}
\end{cases} \qquad (4.1.17)$$

In this expression, we have introduced the generalized ζ-functions, which are defined by⁴

$$\left.\begin{matrix} g_\nu(z) \\ f_\nu(z) \end{matrix}\right\} \equiv \frac{1}{\Gamma(\nu)} \int_0^\infty dx\; \frac{x^{\nu-1}}{e^{x} z^{-1} \mp 1} . \qquad (4.1.18)$$

³ For bosons, we shall see in Sect. 4.4 that in a temperature range where $\mu \to 0$, the term with $\mathbf{p} = 0$ must be treated separately in making the transition from the sum over momenta to the integral.

⁴ The gamma function is defined as $\Gamma(\nu) = \int_0^\infty dt\; e^{-t}\, t^{\nu-1}$ [Re $\nu > 0$]. It obeys the relation $\Gamma(\nu + 1) = \nu\,\Gamma(\nu)$.

Similarly, from (4.1.7), we find


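$$\begin{aligned}
\Phi &= \pm\frac{gV}{(2\pi\hbar)^3 \beta} \int d^3p\; \log\left(1 \mp e^{-\beta(\varepsilon_{\mathbf{p}} - \mu)}\right) \\
&= \pm\frac{gV m^{3/2}}{2^{1/2}\pi^2\hbar^3 \beta} \int_0^\infty d\varepsilon\; \sqrt{\varepsilon}\, \log\left(1 \mp e^{-\beta(\varepsilon - \mu)}\right) ,
\end{aligned} \qquad (4.1.19)$$

which, after integration by parts, leads to

$$\Phi = -PV = -\frac{2}{3}\,\frac{gV m^{3/2}}{2^{1/2}\pi^2\hbar^3} \int_0^\infty \frac{d\varepsilon\; \varepsilon^{3/2}}{e^{\beta(\varepsilon - \mu)} \mp 1} = -\frac{gVkT}{\lambda^3}
\begin{cases}
g_{5/2}(z) \\
f_{5/2}(z) ,
\end{cases} \qquad (4.1.19')$$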

where the upper line holds for bosons and the lower line for fermions. The expression (3.1.26), $\Phi = -PV$, which is valid for homogeneous systems, was also used here. From (4.1.10) we obtain for the internal energy

$$E = \frac{gV}{(2\pi\hbar)^3} \int d^3p\; \varepsilon_{\mathbf{p}}\, n(\varepsilon_{\mathbf{p}}) = \frac{gV m^{3/2}}{2^{1/2}\pi^2\hbar^3} \int_0^\infty \frac{d\varepsilon\; \varepsilon^{3/2}}{e^{\beta(\varepsilon - \mu)} \mp 1} . \qquad (4.1.20)$$

Comparison with (4.1.19′) yields, remarkably, the same relation

$$PV = \frac{2}{3}\,E \qquad (4.1.21)$$

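as for the classical ideal gas. Additional general relations follow from the homogeneity of $\Phi$ in $T$ and $\mu$. From (4.1.19′), (4.1.15), and (3.1.18), we obtain

$$P = -\frac{\Phi}{V} = -T^{5/2}\,\varphi\!\left(\frac{\mu}{T}\right) , \qquad N = V T^{3/2}\, n\!\left(\frac{\mu}{T}\right) , \qquad (4.1.22a,b)$$

$$S = -\left(\frac{\partial \Phi}{\partial T}\right)_{V,\mu} = V T^{3/2}\, s\!\left(\frac{\mu}{T}\right) , \qquad \text{and} \qquad \frac{S}{N} = \frac{s(\mu/T)}{n(\mu/T)} . \qquad (4.1.22c,d)$$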

Using these results, we can readily derive the adiabatic equation. The conditions $S = {\rm const.}$ and $N = {\rm const.}$, together with (4.1.22d), (4.1.22b) and (4.1.22a), yield $\mu/T = {\rm const.}$, $V T^{3/2} = {\rm const.}$, $P T^{-5/2} = {\rm const.}$, and finally

$$P V^{5/3} = {\rm const} . \qquad (4.1.23)$$

The adiabatic equation has the same form as that for the classical ideal gas, although most of the other thermodynamic quantities show different behavior, such as for example $c_P/c_V \neq 5/3$. Following these preliminary general considerations, we wish to derive the equation of state from (4.1.22a). To this end, we need to eliminate $\mu/T$ from (4.1.22a) and replace it by the density $N/V$ using (4.1.22b). The explicit computation is carried out in Sect. 4.2 for the classical limit, and in Sects. 4.3 and 4.4 for low temperatures where quantum effects predominate.
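For reference, the integration by parts that leads from (4.1.19) to (4.1.19′), and with it to (4.1.21), can be written out explicitly. The following is a sketch of that intermediate step using only the definitions above; the boundary term vanishes at $\varepsilon = 0$ because of the factor $\varepsilon^{3/2}$ (for bosons with $\mu < 0$) and at $\varepsilon \to \infty$ because the logarithm decays exponentially.

```latex
\begin{aligned}
\int_0^\infty d\varepsilon\, \sqrt{\varepsilon}\,
   \log\!\left(1 \mp e^{-\beta(\varepsilon-\mu)}\right)
 &= \left[\tfrac{2}{3}\,\varepsilon^{3/2}
      \log\!\left(1 \mp e^{-\beta(\varepsilon-\mu)}\right)\right]_0^\infty
    - \tfrac{2}{3} \int_0^\infty d\varepsilon\, \varepsilon^{3/2}\,
      \frac{\pm\beta\, e^{-\beta(\varepsilon-\mu)}}{1 \mp e^{-\beta(\varepsilon-\mu)}} \\
 &= \mp\,\tfrac{2}{3}\,\beta \int_0^\infty
      \frac{d\varepsilon\, \varepsilon^{3/2}}{e^{\beta(\varepsilon-\mu)} \mp 1} .
\end{aligned}
```

Multiplying by the prefactor $\pm\, gV m^{3/2}/(2^{1/2}\pi^2\hbar^3\beta)$ of (4.1.19) gives $\Phi = -\frac{2}{3}E$ with $E$ from (4.1.20), for both signs, which is the content of (4.1.21).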


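4.2 The Classical Limit $z = e^{\mu/kT} \ll 1$

We first formulate the equation of state in the nearly-classical limit. To do this, we expand the generalized ζ-functions $g$ and $f$ defined in (4.1.18) as power series in $z$:

$$\left.\begin{matrix} g_\nu(z) \\ f_\nu(z) \end{matrix}\right\} = \frac{1}{\Gamma(\nu)} \int_0^\infty dx\; x^{\nu-1}\, e^{-x} z \sum_{k'=0}^{\infty} (\pm 1)^{k'} e^{-xk'} z^{k'} = \sum_{k=1}^{\infty} \frac{(\pm 1)^{k+1} z^k}{k^\nu} , \qquad (4.2.1)$$

where the upper lines (signs) hold for bosons and the lower for fermions. Then Eq. (4.1.17) takes on the form

$$\frac{\lambda^3}{v} = g \sum_{k=1}^{\infty} \frac{(\pm 1)^{k+1} z^k}{k^{3/2}} = g \left(z \pm \frac{z^2}{2^{3/2}} + O\!\left(z^3\right)\right) . \qquad (4.2.2)$$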

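This equation can be solved iteratively for $z$:

$$z = \frac{\lambda^3}{vg} \mp \frac{1}{2^{3/2}} \left(\frac{\lambda^3}{vg}\right)^2 + O\!\left(\left(\frac{\lambda^3}{v}\right)^3\right) . \qquad (4.2.3)$$

Inserting this in the series for $\Phi$ which follows from (4.1.19′) and (4.2.1),

$$\Phi = -\frac{gVkT}{\lambda^3} \left(z \pm \frac{z^2}{2^{5/2}} + O\!\left(z^3\right)\right) , \qquad (4.2.4)$$

we can eliminate $\mu$ in favor of $N$ and obtain the equation of state

$$PV = -\Phi = NkT \left(1 \mp \frac{\lambda^3}{2^{5/2}\, g v} + O\!\left(\left(\frac{\lambda^3}{v}\right)^2\right)\right) . \qquad (4.2.5)$$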
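The iterative solution can be checked numerically. The sketch below solves (4.2.2) for $z$ by bisection, using the series (4.2.1) truncated after a finite number of terms, and compares the result with (4.2.3) and (4.2.5). The value chosen for $\lambda^3/(gv)$ is an arbitrary small test value, and the helper names are hypothetical.

```python
# Numerical check of the fugacity expansion (4.2.3) and the equation of state (4.2.5).
# The value of a = lambda^3 / (g v) below is an arbitrary small test value.


def series(z, nu, sign, terms=200):
    """g_nu(z) for sign=+1 (bosons) or f_nu(z) for sign=-1 (fermions), Eq. (4.2.1)."""
    return sum(sign ** (k + 1) * z ** k / k ** nu for k in range(1, terms + 1))


def solve_fugacity(a, sign):
    """Solve a = series(z, 3/2, sign) for z by bisection on 0 < z < 1."""
    lo, hi = 0.0, 0.999
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if series(mid, 1.5, sign) < a:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


a = 0.01
for name, sign in (("bosons", +1), ("fermions", -1)):
    z_exact = solve_fugacity(a, sign)
    z_approx = a - sign * a ** 2 / 2 ** 1.5          # Eq. (4.2.3): upper sign -> minus
    # P V / (N k T) = g_{5/2}(z) / g_{3/2}(z), from (4.1.19') divided by (4.1.17)
    pv_exact = series(z_exact, 2.5, sign) / series(z_exact, 1.5, sign)
    pv_approx = 1.0 - sign * a / 2 ** 2.5            # Eq. (4.2.5)
    print(f"{name:8s} z = {z_exact:.8f} (series: {z_approx:.8f}), "
          f"PV/NkT = {pv_exact:.8f} (series: {pv_approx:.8f})")
```

For small $\lambda^3/(gv)$ the two results agree up to the neglected higher-order terms, with the pressure lying below the classical value for bosons and above it for fermions.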

The symmetrization (antisymmetrization) of the wavefunctions causes a reduction (increase) in the pressure in comparison to the classical ideal gas. This acts like an attraction (repulsion) between the particles, which in fact are non-interacting (formation of clusters in the case of bosons, exclusion principle for fermions). For the chemical potential, we find from (4.1.12) and (4.2.3), and making use of $\lambda^3/(vg) \ll 1$, the following expansion:

$$\mu = kT \log z = kT \left[\log\frac{\lambda^3}{gv} \mp \frac{1}{2^{3/2}}\,\frac{\lambda^3}{gv} + \ldots\right] , \qquad (4.2.6)$$

i.e. $\mu < 0$. Furthermore, for the free energy $F = \Phi + \mu N$, we find from (4.2.5) and (4.2.6)

$$F = F_{\rm class} \mp kT\,\frac{N\lambda^3}{2^{5/2}\, g v} , \qquad (4.2.7a)$$

where

$$F_{\rm class} = NkT \left(-1 + \log\frac{\lambda^3}{gv}\right) \qquad (4.2.7b)$$

is the free energy of the classical ideal gas.

Remarks:

(i) The quantum corrections are proportional to $\hbar^3$, since $\lambda$ is proportional to $\hbar$. These corrections are also called exchange corrections, as they depend only on the symmetry behavior of the wavefunctions (see also Appendix B).

(ii) The exchange corrections to the classical results at finite temperatures are of the order of $\lambda^3/v$. The classical equation of state holds for $z \ll 1$ or $\lambda \ll v^{1/3}$, i.e. in the extremely dilute limit. This limit is the more readily reached, the higher the temperature and the lower the density. The occupation number in the classical limit is given by (cf. Fig. 4.1)

$$n(\varepsilon_p) \approx e^{-\beta\varepsilon_p}\, e^{\beta\mu} = e^{-\beta\varepsilon_p}\,\frac{\lambda^3}{gv} \ll 1 . \qquad (4.2.8)$$

This classical limit (4.2.8) is equally valid for bosons and fermions. For comparison, the Fermi distribution at $T = 0$ is also shown in Fig. 4.1. Its significance, as well as that of $\varepsilon_F$, will be discussed in Sect. 4.3.

(iii) Corresponding to the symmetry-dependent pressure change in (4.2.5), the exchange effects lead to a modification of the free energy (4.2.7a).
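To illustrate how small the exchange correction in (4.2.5) typically is, the following sketch evaluates $\lambda$ and $\lambda^3/(2^{5/2} g v)$ for a gas of helium-4 atoms at atmospheric pressure. The choice of helium-4, the two temperatures and the use of the classical value $v = kT/P$ for the specific volume are illustrative assumptions, not taken from the text.

```python
import math

H = 6.62607015e-34      # Planck constant, J s
K_B = 1.380649e-23      # Boltzmann constant, J/K
U = 1.66053906660e-27   # atomic mass unit, kg


def thermal_wavelength(mass, temperature):
    """lambda = h / sqrt(2 pi m k T), Eq. (2.7.20)."""
    return H / math.sqrt(2.0 * math.pi * mass * K_B * temperature)


def exchange_correction(mass, temperature, pressure, g=1):
    """Relative correction lambda^3 / (2^(5/2) g v) in Eq. (4.2.5).

    The specific volume is approximated by its classical value v = k T / P.
    """
    v = K_B * temperature / pressure
    return thermal_wavelength(mass, temperature) ** 3 / (2 ** 2.5 * g * v)


M_HE4 = 4.002602 * U    # assumed example: helium-4 atoms (bosons, s = 0, g = 1)
for T in (300.0, 5.0):
    lam = thermal_wavelength(M_HE4, T)
    corr = exchange_correction(M_HE4, T, 101325.0)
    print(f"T = {T:5.1f} K: lambda = {lam:.3e} m, correction = {corr:.2e}")
```

At room temperature the correction is far below any practical measurement accuracy under these assumptions, while at a few kelvin it reaches the percent level, in line with the statement that the classical limit is reached more readily at high temperature and low density.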

Fig. 4.1. The occupation number n(ε) in the classical limit (shaded). For comparison, the occupation of a degenerate Fermi gas is also indicated

4.3 The Nearly-degenerate Ideal Fermi Gas

In this and the following section, we consider the opposite limit, in which quantum effects are predominant. Here, fermions and bosons must be treated separately (for bosons, see Sect. 4.4). We first recall the properties of the ground state of fermions, independently of their statistical mechanics.

4.3.1 Ground State, T = 0 (Degeneracy)

We first deal with the ground state of a system of $N$ fermions. It is obtained at a temperature of zero Kelvin. In the ground state, the $N$ lowest single-particle states $|p\rangle$ are each singly occupied. If the energy depends only on the momentum $\mathbf{p}$, every value of $\mathbf{p}$ occurs $g$-fold. For the dispersion relation (4.1.2c), all the momenta within a sphere (the Fermi sphere), whose radius is called the Fermi momentum $p_F$ (Fig. 4.2), are thus occupied. The particle number is related to $p_F$ as follows:

$$N = g \sum_{p \le p_F} 1 = g\,\frac{V}{(2\pi\hbar)^3} \int d^3p\; \Theta(p_F - p) = \frac{gV p_F^3}{6\pi^2\hbar^3} . \qquad (4.3.1)$$

Fig. 4.2. The occupation of the momentum states within the Fermi sphere

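From (4.3.1), we find the following relation between the particle density $n = N/V$ and the Fermi momentum:

$$p_F = \hbar \left(\frac{6\pi^2}{g}\right)^{1/3} n^{1/3} . \qquad (4.3.2)$$

The single-particle energy corresponding to the Fermi momentum is called the Fermi energy:

$$\varepsilon_F = \frac{p_F^2}{2m} = \left(\frac{6\pi^2}{g}\right)^{2/3} \frac{\hbar^2}{2m}\, n^{2/3} . \qquad (4.3.3)$$

For the ground-state energy, we find

$$E = \frac{gV}{(2\pi\hbar)^3} \int d^3p\; \frac{p^2}{2m}\,\Theta(p_F - p) = \frac{gV p_F^5}{20\pi^2\hbar^3 m} = \frac{3}{5}\,\varepsilon_F N . \qquad (4.3.4)$$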


From (4.1.21) and (4.3.4), the pressure of fermions at $T = 0$ is found to be

$$P = \frac{2}{5}\,\varepsilon_F\, n = \frac{1}{5} \left(\frac{6\pi^2}{g}\right)^{2/3} \frac{\hbar^2}{m}\, n^{5/3} . \qquad (4.3.5)$$

The degeneracy of the ground state is sufficiently small that the entropy and the product $TS$ vanish at $T = 0$ (see also (4.3.19)). From this, and using (4.3.4) and (4.3.5), we obtain for the chemical potential using the Gibbs–Duhem relation $\mu = \frac{1}{N}(E + PV - TS)$:

$$\mu = \varepsilon_F . \qquad (4.3.6)$$
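As a numerical illustration of Eqs. (4.3.2) to (4.3.5), the sketch below evaluates the Fermi energy and the ground-state pressure for an electron gas with $g = 2$. The electron density used is an assumed, commonly quoted value for the conduction electrons of copper and does not come from the text.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # joules per electron volt


def fermi_momentum(n, g=2):
    """p_F from Eq. (4.3.2)."""
    return HBAR * (6.0 * math.pi ** 2 * n / g) ** (1.0 / 3.0)


def fermi_energy(n, mass, g=2):
    """epsilon_F from Eq. (4.3.3)."""
    return fermi_momentum(n, g) ** 2 / (2.0 * mass)


def ground_state_pressure(n, mass, g=2):
    """P = (2/5) epsilon_F n, Eq. (4.3.5)."""
    return 0.4 * fermi_energy(n, mass, g) * n


N_CU = 8.47e28  # assumed conduction-electron density of copper, m^-3
eps_f = fermi_energy(N_CU, M_E)
print(f"epsilon_F = {eps_f / EV:.2f} eV")
print(f"E/N       = {0.6 * eps_f / EV:.2f} eV")          # Eq. (4.3.4)
print(f"P         = {ground_state_pressure(N_CU, M_E):.2e} Pa")
```

With this assumed density the Fermi energy comes out at several electron volts, far above $kT$ at room temperature, so that such an electron gas is nearly degenerate in the sense of the following section.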

This result is also evident from the form of the ground state, which implies the occupation of all the levels up to the Fermi energy, from which it follows that the Fermi distribution of a system of $N$ fermions at $T = 0$ becomes $n(\varepsilon) = \Theta(\varepsilon_F - \varepsilon)$. Clearly, one requires precisely the energy $\varepsilon_F$ in order to put one additional fermion into the system. The existence of the Fermi energy is a result of the Pauli principle and is thus a quantum effect.

4.3.2 The Limit of Complete Degeneracy

We now calculate the thermodynamic properties in the limit of large $\mu/kT$. In Fig. 4.3, the Fermi distribution function

$$n(\varepsilon) = \frac{1}{e^{(\varepsilon - \mu)/kT} + 1} \qquad (4.3.7)$$

is shown for low temperatures. In comparison to a step function at the position $\mu$, it is broadened within a region $kT$. We shall see below that $\mu$ is equal to $\varepsilon_F$ only at $T = 0$. For $T = 0$, the Fermi distribution function degenerates into a step function, so that one then speaks of a degenerate Fermi gas; at low $T$ one refers to a nearly-degenerate Fermi gas. It is expedient to replace the prefactors in (4.1.19′) and (4.1.15) with the Fermi energy (4.3.3)⁵; for the grand potential, one then obtains

$$\Phi = -N \varepsilon_F^{-3/2} \int_0^\infty d\varepsilon\; \varepsilon^{3/2}\, n(\varepsilon) , \qquad (4.3.8)$$

and the formula for $N$ becomes

$$1 = \frac{3}{2}\,\varepsilon_F^{-3/2} \int_0^\infty d\varepsilon\; \varepsilon^{1/2}\, n(\varepsilon) . \qquad (4.3.9)$$

⁵ In (4.3.8) and (4.3.14), $\Phi$ is expressed as usual in terms of its natural variables $T$, $V$ and $\mu$, since $N \varepsilon_F^{-3/2} \propto V$. In (4.3.14′), the dependence on $\mu$ has been substituted by $T$ and $N/V$, using (4.3.13).


Fig. 4.3. The Fermi distribution function n(ε) for low temperatures, compared with the step function Θ(µ − ε).

Fig. 4.4. The Fermi distribution function n(ε), and n(ε) − Θ(µ − ε).

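There thus still remain integrals of the type

$$I = \int_0^\infty d\varepsilon\, f(\varepsilon)\, n(\varepsilon) \tag{4.3.10}$$

to be computed. The method of evaluation at low temperatures was given by Sommerfeld; $I$ can be decomposed in the following manner:

$$I = \int_0^\mu d\varepsilon\, f(\varepsilon) + \int_0^\infty d\varepsilon\, f(\varepsilon) \big[ n(\varepsilon) - \Theta(\mu - \varepsilon) \big] \approx \int_0^\mu d\varepsilon\, f(\varepsilon) + \int_{-\infty}^\infty d\varepsilon\, f(\varepsilon) \big[ n(\varepsilon) - \Theta(\mu - \varepsilon) \big] , \tag{4.3.11}$$

and for $T \to 0$, the limit of integration in the second term can be extended to $-\infty$ to a good approximation, since for negative $\varepsilon$, $n(\varepsilon) = 1 + O(e^{-(\mu-\varepsilon)/kT})$.⁶ One can see immediately from Fig. 4.4 that $n(\varepsilon) - \Theta(\mu - \varepsilon)$ differs from zero only in the neighborhood of $\varepsilon = \mu$ and is antisymmetric around $\mu$.⁷

⁶ If $f(\varepsilon)$ is in principle defined only for positive $\varepsilon$, one can e.g. define $f(-\varepsilon) = f(\varepsilon)$; the result depends on $f(\varepsilon)$ only for positive $\varepsilon$.

⁷ $\frac{1}{e^x+1} - \Theta(-x) = 1 - \frac{1}{e^{-x}+1} - \Theta(-x) = -\left[ \frac{1}{e^{-x}+1} - \Theta(x) \right]$.

Therefore, we expand $f(\varepsilon)$ around the value $\mu$ in a Taylor series and introduce a new integration variable, $x = (\varepsilon - \mu)/kT$:

$$I = \int_0^\mu d\varepsilon\, f(\varepsilon) + \int_{-\infty}^\infty dx \left[ \frac{1}{e^x+1} - \Theta(-x) \right] \left[ f'(\mu)\,(kT)^2 x + \frac{f'''(\mu)}{3!}\,(kT)^4 x^3 + \dots \right]$$

$$\quad = \int_0^\mu d\varepsilon\, f(\varepsilon) + 2 (kT)^2 f'(\mu) \int_0^\infty dx\, \frac{x}{e^x+1} + \frac{2 (kT)^4}{3!}\, f'''(\mu) \int_0^\infty dx\, \frac{x^3}{e^x+1} + \dots$$

(since $\frac{1}{e^x+1} - \Theta(-x)$ is antisymmetric and equal to $\frac{1}{e^x+1}$ for $x > 0$). From this, the general expansion in terms of the temperature follows, making use of the integrals computed in Appendix D, Eq. (D.7):⁸

$$I = \int_0^\mu d\varepsilon\, f(\varepsilon) + \frac{\pi^2}{6} (kT)^2 f'(\mu) + \frac{7\pi^4}{360} (kT)^4 f'''(\mu) + \dots . \tag{4.3.12}$$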
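The two numerical coefficients in (4.3.12) can be checked directly. The following is a minimal sketch, assuming a plain Simpson rule with a finite cutoff `x_max` (the function name `fermi_moment` and the cutoff are illustrative choices, not part of the text); it evaluates $2\int_0^\infty dx\, x/(e^x+1)$ and $\frac{2}{3!}\int_0^\infty dx\, x^3/(e^x+1)$ and compares them with $\pi^2/6$ and $7\pi^4/360$.

```python
import math

def fermi_moment(n, x_max=60.0, steps=60000):
    """Simpson-rule estimate of int_0^x_max x**n / (exp(x) + 1) dx.

    The cutoff x_max replaces the upper limit infinity; the integrand is
    of order exp(-x_max) there, so the truncation error is negligible.
    steps must be even.
    """
    h = x_max / steps

    def g(x):
        return x**n / (math.exp(x) + 1.0)

    total = g(0.0) + g(x_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3.0

# Coefficient of (kT)^2 f'(mu) and of (kT)^4 f'''(mu) in (4.3.12)
c2 = 2.0 * fermi_moment(1)
c4 = 2.0 * fermi_moment(3) / math.factorial(3)

print(c2, math.pi**2 / 6)
print(c4, 7 * math.pi**4 / 360)
```

The agreement confirms that only the odd derivatives of $f$ at $\mu$ enter, each multiplied by a pure number.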

Applying this expansion to Eq. (4.3.9), we find

$$1 = \left( \frac{\mu}{\varepsilon_F} \right)^{3/2} \left[ 1 + \frac{\pi^2}{8} \left( \frac{kT}{\mu} \right)^2 + O(T^4) \right] .$$

This equation can be solved iteratively for $\mu$, yielding the chemical potential as a function of $T$ and $N/V$:

$$\mu = \varepsilon_F \left[ 1 - \frac{\pi^2}{12} \left( \frac{kT}{\varepsilon_F} \right)^2 + O(T^4) \right] , \tag{4.3.13}$$

where $\varepsilon_F$ is given by (4.3.3). The chemical potential decreases with increasing temperature, since then no longer all the states within the Fermi sphere are occupied. In a similar way, we find for (4.3.8)

$$\Phi = -N \varepsilon_F^{-3/2} \left[ \frac{2}{5}\, \mu^{5/2} + \frac{\pi^2}{6} (kT)^2\, \frac{3}{2}\, \mu^{1/2} + \dots \right] , \tag{4.3.14}$$

from which, inserting (4.3.13),⁹

$$\Phi = -\frac{2}{5} N \varepsilon_F \left[ 1 + \frac{5\pi^2}{12} \left( \frac{kT}{\varepsilon_F} \right)^2 + O(T^4) \right] \tag{4.3.14'}$$

or, using $P = -\Phi/V$, we obtain the equation of state. From (4.1.21), we find immediately the internal energy

$$E = \frac{3}{2} PV = \frac{3}{5} N \varepsilon_F \left[ 1 + \frac{5\pi^2}{12} \left( \frac{kT}{\varepsilon_F} \right)^2 + O(T^4) \right] . \tag{4.3.15}$$

⁸ This series is an asymptotic expansion in $T$. An asymptotic series for a function $I(\lambda)$, $I(\lambda) = \sum_{k=0}^m a_k \lambda^k + R_m(\lambda)$, is characterized by the following behavior of the remainder: $\lim_{\lambda\to 0} R_m(\lambda)/\lambda^m = 0$, $\lim_{m\to\infty} R_m(\lambda) = \infty$. For small values of $\lambda$, the function can be represented very accurately by a finite number of terms in the series. The fact that the integral in (4.3.10) for functions $f(\varepsilon) \sim \varepsilon^{1/2}$ etc. cannot be expanded in a Taylor series can be immediately recognized, since $I$ diverges for $T < 0$.

From this, we calculate the heat capacity at constant $V$ and $N$:

$$C_V = N k\, \frac{\pi^2}{2}\, \frac{T}{T_F} , \tag{4.3.16}$$

where we have introduced the Fermi temperature

$$T_F = \varepsilon_F / k . \tag{4.3.17}$$

At low temperatures ($T \ll T_F$), the heat capacity is a linear function of the temperature (Fig. 4.5). This behavior can be qualitatively understood in a simple way: if one increases the temperature from zero to $T$, the energy of a portion of the particles increases by $kT$. The number of particles which are excited in this manner is limited to a shell of thickness $kT$ around the Fermi sphere, i.e. it is given by $N kT/\varepsilon_F$. Altogether, the energy increase is

$$\delta E \sim kT\, N\, \frac{kT}{\varepsilon_F} , \tag{4.3.16'}$$

from which, as in (4.3.16), we obtain $C_V \sim k N\, T/T_F$. According to (4.3.14′), the pressure is given by

$$P = \frac{2}{5} \left( \frac{6\pi^2}{g} \right)^{2/3} \frac{\hbar^2}{2m} \left( \frac{N}{V} \right)^{5/3} \left[ 1 + \frac{5\pi^2}{12} \left( \frac{kT}{\varepsilon_F} \right)^2 + \dots \right] . \tag{4.3.14''}$$

Due to the Pauli exclusion principle, there is a pressure increase at $T = 0$ relative to a classical ideal gas, as can be seen in Fig. 4.6. The isothermal compressibility is then

$$\kappa_T = -\frac{1}{V} \left( \frac{\partial V}{\partial P} \right)_T = \frac{3 (V/N)}{2 \varepsilon_F} \left[ 1 - \frac{\pi^2}{12} \left( \frac{kT}{\varepsilon_F} \right)^2 + \dots \right] . \tag{4.3.18}$$

⁹ If one requires the grand potential as a function of its natural variables, it is necessary to substitute $N \varepsilon_F^{-3/2} = V g (2m)^{3/2} / 6\pi^2\hbar^3$ in (4.3.14). For the calculation of $C_V$ and the equation of state, it is however expedient to employ $T$, $V$, and $N$ as variables.


Fig. 4.5. The speciﬁc heat (heat capacity) of the ideal Fermi gas

Fig. 4.6. The pressure as a function of the temperature for the ideal Fermi gas (solid curve) and the ideal classical gas (dashed)

For the entropy, we find for $T \ll T_F$

$$S = k N\, \frac{\pi^2}{2}\, \frac{T}{T_F} \tag{4.3.19}$$

with $TS = E + PV - \mu N$ from (4.3.15), (4.3.14′) and (4.3.13) (cf. Appendix A.1, ‘Third Law’). The chemical potential of an ideal Fermi gas with a fixed density can be found from Eq. (4.3.9) and is shown in Fig. 4.7 as a function of the temperature.

Fig. 4.7. The chemical potential of the ideal Fermi gas at fixed density as a function of the temperature (axes: $\mu/\varepsilon_F$ versus $kT/\varepsilon_F$).
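The temperature dependence of $\mu$ at fixed density can be reproduced numerically from (4.3.9). The sketch below is an illustration under stated assumptions, not the method used for the figure: energies are measured in units of $\varepsilon_F$, the integral is evaluated with Simpson's rule after the substitution $\varepsilon = u^2$ (which removes the square-root endpoint behavior), and $\mu$ is found by bisection in the bracket $[-10, 2]$, which is adequate only for $kT/\varepsilon_F \lesssim 2$. All function names are hypothetical.

```python
import math

def occupation(e, mu, t):
    """Fermi function (4.3.7) with energies in units of eps_F, t = kT/eps_F."""
    a = (e - mu) / t
    if a > 700.0:          # avoid overflow in exp; n is zero to double precision
        return 0.0
    return 1.0 / (math.exp(a) + 1.0)

def particle_integral(mu, t, steps=2000):
    """Right-hand side of (4.3.9): (3/2) * int_0^inf de e^(1/2) n(e).

    Uses e = u^2, de = 2u du, and cuts the integral off where n(e) ~ e^-40.
    """
    u_max = math.sqrt(max(mu, 0.0) + 40.0 * t)
    h = u_max / steps

    def g(u):
        return 2.0 * u * u * occupation(u * u, mu, t)

    total = g(0.0) + g(u_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return 1.5 * total * h / 3.0

def chemical_potential(t, lo=-10.0, hi=2.0):
    """Solve particle_integral(mu, t) = 1 for mu/eps_F by bisection."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if particle_integral(mid, t) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

t = 0.1
mu_numeric = chemical_potential(t)
mu_sommerfeld = 1.0 - math.pi**2 * t**2 / 12.0   # Eq. (4.3.13)
print(mu_numeric, mu_sommerfeld)
```

At $kT/\varepsilon_F = 0.1$ the numerical root and the low-temperature formula (4.3.13) differ only by terms of order $T^4$; at $kT/\varepsilon_F = 1.5$ the same routine returns a negative $\mu$, as in the figure.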

Addenda:

(i) The Fermi temperature, also known as the degeneracy temperature,

$$T_F\,[\mathrm{K}] = \frac{\varepsilon_F}{k} = 3.85 \times 10^{-38}\, \frac{1}{m\,[\mathrm{g}]} \left( \frac{N}{V\,[\mathrm{cm}^3]} \right)^{2/3} \tag{4.3.20}$$

characterizes the thermodynamic behavior of fermions (see Table 4.1). For $T \ll T_F$, the system is nearly degenerate, while for $T \gg T_F$, the classical limit applies. Fermi energies are usually quoted in electron volts (eV). Conversion to kelvins is accomplished using 1 eV ≙ 11605 K.
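As a quick illustration of (4.3.20), the following sketch evaluates the Fermi temperature for two of the systems discussed in this section. The function name is an illustrative choice; the numerical prefactor $3.85 \times 10^{-38}$ corresponds to spin-1/2 particles ($g = 2$), and the input masses and densities are the values quoted in the text and tables for sodium conduction electrons and for liquid $^3$He at $P = 0$.

```python
KELVIN_PER_EV = 11605.0   # conversion quoted in the text: 1 eV corresponds to 11605 K

def fermi_temperature(m_g, n_cm3):
    """Fermi temperature in kelvin from Eq. (4.3.20).

    m_g   : particle mass in grams
    n_cm3 : particle density N/V in cm^-3
    The prefactor 3.85e-38 assumes the degeneracy factor g = 2.
    """
    return 3.85e-38 / m_g * n_cm3 ** (2.0 / 3.0)

# Conduction electrons in sodium (mass and density as in Tables 4.1 and 4.2)
t_f_na = fermi_temperature(0.91e-27, 2.5e22)
# Non-interacting 3He at P = 0, using the bare mass
t_f_he3 = fermi_temperature(5.01e-24, 1.6e22)

print(t_f_na, t_f_na / KELVIN_PER_EV)   # T_F in K and eps_F in eV
print(t_f_he3)
```

The sodium result is a few times $10^4$ K (about 3 eV), and the bare-mass $^3$He value is close to 5 K, consistent with the numbers quoted for these systems.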

(ii) The density of states is defined as

$$\nu(\varepsilon) = V g \int \frac{d^3p}{(2\pi\hbar)^3}\, \delta(\varepsilon - \varepsilon_p) . \tag{4.3.21}$$

We note that $\nu(\varepsilon)$ is determined merely by the dispersion relation and not by statistics. The thermodynamic quantities do not depend on the details of the momentum dependence of the energy levels, but only on their distribution, i.e. on the density of states. Integrals over momentum space, whose integrands depend only on $\varepsilon_p$, can be rearranged as follows:

$$\int d^3p\, f(\varepsilon_p) = \int d\varepsilon \int d^3p\, f(\varepsilon)\, \delta(\varepsilon - \varepsilon_p) = \frac{(2\pi\hbar)^3}{V g} \int d\varepsilon\, \nu(\varepsilon) f(\varepsilon) .$$

For example, the particle number can be expressed in terms of the density of states in the form

$$N = \int_{-\infty}^{\infty} d\varepsilon\, \nu(\varepsilon)\, n(\varepsilon) . \tag{4.3.22}$$

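For free electrons, we find from (4.3.21)

$$\nu(\varepsilon) = \frac{g V}{4\pi^2} \left( \frac{2m}{\hbar^2} \right)^{3/2} \varepsilon^{1/2} = \frac{3}{2}\, N\, \frac{\varepsilon^{1/2}}{\varepsilon_F^{3/2}} . \tag{4.3.23}$$

The dependence on $\varepsilon^{1/2}$ shown in Fig. 4.8 is characteristic of nonrelativistic, noninteracting material particles.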

Fig. 4.8. The density of states for free electrons in three dimensions

The derivations of the specific heat and the compressibility given above can be generalized to the case of arbitrary densities of states $\nu(\varepsilon)$ by evaluating (4.3.9) and (4.3.8) in terms of a general $\nu(\varepsilon)$. The results are

$$C_V = \frac{1}{3} \pi^2\, \nu(\varepsilon_F)\, k^2 T + O\big( (T/T_F)^3 \big) \tag{4.3.24a}$$

and

$$\kappa_T = \frac{V}{N^2}\, \nu(\varepsilon_F) + O\big( (T/T_F)^2 \big) . \tag{4.3.24b}$$


The fact that only the value of the density of states at the Fermi energy is of importance for the low-temperature behavior of the system was to be expected after the discussion following equation (4.3.17). For (4.3.23), we find from (4.3.24a,b) once again the results (4.3.16) and (4.3.18).

(iii) Degenerate Fermi liquids: physical examples of degenerate Fermi liquids are listed in Table 4.1.

Table 4.1. Degenerate Fermi liquids: mass, density, Fermi temperature, Fermi energy

| Particles | $m$ [g] | $N/V$ [cm$^{-3}$] | $\varepsilon_F$ [eV] | $T_F$ [K] |
|---|---|---|---|---|
| Metal electrons | $0.91 \times 10^{-27}$ | $10^{24}$ | $< 10$ | $< 10^5$ |
| $^3$He, $P$ = 0–30 bar | $5.01 \times 10^{-24}$, $m^*/m$ = 2.8–5.5 | $(1.6$–$2.3) \times 10^{22}$ | $(1.5$–$0.9) \times 10^{-4}$ | 1.7–1.1 |
| Neutrons in the nucleus | $1.67 \times 10^{-24}$ | $0.11 \times 10^{39} \times \frac{A-Z}{A}$ | $46 \left( \frac{A-Z}{A} \right)^{2/3} \times 10^6$ | $5.3 \times 10^{11} \times \left( \frac{A-Z}{A} \right)^{2/3}$ |
| Protons in the nucleus | $1.67 \times 10^{-24}$ | $0.11 \times 10^{39}\, \frac{Z}{A}$ | $46 \left( \frac{Z}{A} \right)^{2/3} \times 10^6$ | $5.3 \times 10^{11} \left( \frac{Z}{A} \right)^{2/3}$ |
| Electrons in white dwarf stars | $0.91 \times 10^{-27}$ | $10^{30}$ | $3 \times 10^5$ | $3 \times 10^9$ |

(iv) Coulomb interaction: electrons in metals are not free, but rather they repel each other as a result of their Coulomb interactions

$$H = \sum_i \frac{\mathbf{p}_i^2}{2m} + \frac{1}{2} \sum_{i \neq j} \frac{e^2}{r_{ij}} . \tag{4.3.25}$$

The following scaling of the Hamiltonian shows that the approximation of free electrons is particularly reasonable for high densities. To see this, we carry out the canonical transformation $\mathbf{r}' = \mathbf{r}/r_0$, $\mathbf{p}' = \mathbf{p}\, r_0$. The characteristic length $r_0$ is defined by $\frac{4\pi}{3} r_0^3 N = V$, i.e. $r_0 = \left( \frac{3V}{4\pi N} \right)^{1/3}$. In terms of these new variables, the Hamiltonian is

$$H = \frac{1}{r_0^2} \left[ \sum_i \frac{\mathbf{p}_i'^2}{2m} + r_0\, \frac{1}{2} \sum_{i \neq j} \frac{e^2}{r_{ij}'} \right] . \tag{4.3.25'}$$

The Coulomb interaction becomes less and less important relative to the kinetic energy the smaller $r_0$, i.e. the more dense the gas becomes.
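The scaling argument can be made quantitative with a short estimate. In (4.3.25′) the rescaled momenta are of order $\hbar$, so the ratio of the Coulomb term to the kinetic term is of order $r_0\, m e^2/\hbar^2 = r_0/a_B$, where $a_B$ is the Bohr radius. The sketch below is only an order-of-magnitude illustration: the Bohr radius value and the function names are additions not taken from the text, and the densities are those quoted for sodium and for white dwarfs.

```python
import math

BOHR_RADIUS_CM = 0.529177e-8   # Bohr radius hbar^2 / (m e^2) in cm (standard value)

def r0_cm(n_cm3):
    """Characteristic length r0 defined by (4 pi / 3) r0^3 N = V, in cm."""
    return (3.0 / (4.0 * math.pi * n_cm3)) ** (1.0 / 3.0)

def coulomb_to_kinetic_scale(n_cm3):
    """Dimensionless ratio r0 / a_B, the relative weight of the Coulomb term."""
    return r0_cm(n_cm3) / BOHR_RADIUS_CM

ratio_na = coulomb_to_kinetic_scale(2.5e22)    # conduction electrons in Na
ratio_wd = coulomb_to_kinetic_scale(1.0e30)    # electrons in a white dwarf

print(ratio_na, ratio_wd)
```

For sodium the ratio is of order a few, so the interaction is not small in metals, whereas at white-dwarf densities it is of order $10^{-2}$, which is why the free-electron picture works so well there.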

∗4.3.3 Real Fermions

In this section, we will consider real fermionic many-body systems: the conduction electrons in metals, liquid $^3$He, protons and neutrons in atomic nuclei, electrons in white dwarf stars, and neutrons in neutron stars. All of these fermions interact; however, one can understand many of their properties without taking their interactions into account. In the following, we will deal with the parameters mass, Fermi energy, and temperature and discuss the modifications which must be made as a result of the interactions (see also Table 4.1).

a) The Electron Gas in Solids

The alkali metals Li, Na, K, Rb, and Cs are monovalent (with a body-centered cubic crystal structure); e.g. Na has a single $3s^1$ electron (Table 4.2). The noble metals (face-centered cubic crystal structure) are

Copper Cu $4s^1 3d^{10}$
Silver Ag $5s^1 4d^{10}$
Gold Au $6s^1 5d^{10}$.

All of these elements have one valence electron per atom, which becomes a conduction electron in the metal. The number of these quasi-free electrons is equal to the number of atoms. The energy-momentum relation is to a good approximation parabolic, $\varepsilon_p = \frac{p^2}{2m}$.¹⁰

Table 4.2. Electrons in Metals; Element, Density, Fermi Energy, Fermi Temperature, $\gamma/\gamma_{\rm theor.}$, Effective Mass

| Element | $N/V$ [$10^{22}$ cm$^{-3}$] | $\varepsilon_F$ [eV] | $T_F$ [$10^4$ K] | $\gamma/\gamma_{\rm theor.}$ | $m^*/m$ |
|---|---|---|---|---|---|
| Li | 4.6 | 4.7 | 5.5 | 2.17 | 2.3 |
| Na | 2.5 | 3.1 | 3.7 | 1.21 | 1.3 |
| K | 1.34 | 2.1 | 2.4 | 1.23 | 1.2 |
| Rb | 1.08 | 1.8 | 2.1 | 1.22 | 1.3 |
| Cs | 0.86 | 1.5 | 1.8 | 1.35 | 1.5 |
| Cu | 8.5 | 7 | 8.2 | 1.39 | 1.3 |
| Ag | 5.76 | 5.5 | 6.4 | 1.00 | 1.1 |
| Au | 5.9 | 5.5 | 6.4 | 1.13 | 1.1 |

¹⁰ Remark concerning solid-state physics applications: for Na, we have $\frac{4\pi}{3} (p_F/\hbar)^3 = 4\pi^3 \frac{N}{V} = \frac{1}{2} V_{\rm Brill.}$, where $V_{\rm Brill.}$ is the volume of the first Brillouin zone. The Fermi sphere always lies within the Brillouin zone and thus never crosses the zone boundary, where there are energy gaps and deformations of the Fermi surface. The Fermi surface is therefore in practice spherical, $\Delta p_F/p_F \approx 10^{-3}$. Even in copper, where the 4s Fermi surface intersects the Brillouin zone of the fcc lattice, the Fermi surface remains in most regions spherical to a good approximation.


Fig. 4.9. The experimental determination of γ from the speciﬁc heat of gold (D. L. Martin, Phys. Rev. 141, 576 (1966); ibid. 170, 650 (1968))

Taking account of the electron-electron interactions requires many-body methods, which are not at our disposal here. The interaction of two electrons is weakened by screening from the other electrons; in this sense, it is understandable that the interactions can be neglected to a first approximation in treating many phenomena (e.g. Pauli paramagnetism; but not ferromagnetism). The total specific heat of a metal is composed of a contribution from the electrons (Fig. 4.9) and from the phonons (lattice vibrations, see Sect. 4.6):

$$\frac{C_V}{N} = \gamma T + D T^3 .$$

Plotting $\frac{C_V}{N T} = \gamma + D T^2$ vs. $T^2$, we can read $\gamma$ off the ordinate. From (4.3.16), the theoretical value of $\gamma$ is $\gamma_{\rm theor} = \frac{\pi^2 k^2}{2 \varepsilon_F}$. The deviations between theory and experiment can be attributed to the fact that the electrons move in the potential of the ions in the crystal and are subject to the influence of the electron-electron interaction. The potential and the electron-electron interaction lead among other things to an effective mass $m^*$ for the electrons, i.e. the dispersion relation is approximately given by $\varepsilon_p = \frac{p^2}{2m^*}$. This effective mass can be larger or smaller than the mass of free electrons.
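The graphical determination of $\gamma$ amounts to a straight-line fit of $C_V/(NT)$ against $T^2$. The sketch below illustrates this with an ordinary least-squares fit written out by hand. The data points are synthetic, generated from hypothetical values of $\gamma$ and $D$ purely to show that the intercept recovers $\gamma$; they are not measurements. The helper `gamma_theor_molar` converts $\gamma_{\rm theor} = \pi^2 k^2 / 2\varepsilon_F$ to a molar value, $\pi^2 R k / 2\varepsilon_F$, using standard values of the gas constant and of $k$ in eV/K.

```python
import math

R_GAS = 8.314462618          # J / (mol K)
K_B_EV = 8.617333262e-5      # eV / K

def gamma_theor_molar(eps_f_ev):
    """Free-electron gamma per mole, pi^2 R k / (2 eps_F), in J / (mol K^2)."""
    return math.pi**2 * R_GAS * K_B_EV / (2.0 * eps_f_ev)

def fit_gamma_and_d(temps, heat_caps):
    """Least-squares line through (T^2, C/T); returns (intercept gamma, slope D)."""
    xs = [t * t for t in temps]
    ys = [c / t for c, t in zip(heat_caps, temps)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Synthetic, noise-free data from hypothetical coefficients (not measured values)
gamma_true, d_true = 7.0e-4, 4.5e-4
temps = [1.0, 2.0, 3.0, 4.0, 5.0]
heat_caps = [gamma_true * t + d_true * t**3 for t in temps]

gamma_fit, d_fit = fit_gamma_and_d(temps, heat_caps)
gamma_au = gamma_theor_molar(5.5)    # eps_F of gold from Table 4.2
print(gamma_fit, d_fit, gamma_au)
```

With real data the ratio of the fitted intercept to `gamma_theor_molar` gives the entry $\gamma/\gamma_{\rm theor.}$ of Table 4.2.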

b) The Fermi Liquid $^3$He

$^3$He has a nuclear spin of $I = \frac{1}{2}$, a mass $m = 5.01 \times 10^{-24}$ g, a particle density of $n = 1.6 \times 10^{22}$ cm$^{-3}$ at $P = 0$, and a mass density of 0.081 g cm$^{-3}$. It follows that $\varepsilon_F = 4.2 \times 10^{-4}$ eV and $T_F = 4.9$ K. The interactions of the $^3$He atoms lead to an effective mass which at the pressures $P = 0$ and $P = 30$ bar is given by $m^* = 2.8\, m$ and $m^* = 5.5\, m$. Hence the Fermi temperature for $P = 30$ bar, $T_F \approx 1$ K, is reduced relative to a fictitious non-interacting $^3$He gas. The particle densities at these pressures are $n = 1.6 \times 10^{22}$ cm$^{-3}$ and $n = 2.3 \times 10^{22}$ cm$^{-3}$. The interaction between the helium atoms is short-ranged, in contrast to the electron-electron interaction. The small mass of the helium atoms leads to large zero-point oscillations; for this reason, $^3$He, like $^4$He, remains a liquid at pressures below ∼ 30 bar, even at $T \to 0$. $^3$He and $^4$He are termed quantum liquids. At $10^{-3}$ K, a phase transition into the superfluid state takes place ($l = 1$, $s = 1$) with formation of BCS pairs.¹¹ In the superconductivity of metals, the Cooper pairs formed by the electrons have $l = 0$ and $s = 0$. The relatively complex phase diagram of $^3$He is shown in Fig. 4.10.¹¹

Fig. 4.10. The phase diagram of $^3$He

c) Nuclear Matter

A further example of many-body systems containing fermions are the neutrons and protons in the nucleus, which both have masses of about $m = 1.67 \times 10^{-24}$ g. The nuclear radius depends on the nucleon number $A$ via $R = 1.3 \times 10^{-13} A^{1/3}$ cm. The nuclear volume is $V = \frac{4\pi}{3} R^3 = \frac{4\pi}{3} (1.3)^3 \times 10^{-39} A$ cm$^3$ $= 9.2 \times 10^{-39} A$ cm$^3$. $A$ is the overall number of nucleons and $Z$ the number of protons in the nucleus. Nuclear matter¹² occurs not only within large atomic nuclei, but also in neutron stars, where however the gravitational interactions must also be taken into account.

¹¹ D. Vollhardt and P. Wölfle, The Superfluid Phases of Helium 3, Taylor & Francis, London, 1990

¹² A. L. Fetter and J. D. Walecka, Quantum Theory of Many-Particle Systems, McGraw-Hill, New York 1971


d) White Dwarfs

The properties of the (nearly) free electron gas are indeed of fundamental importance for the stability of the white dwarfs which can occur at the final stages of stellar evolution.¹³ The first such white dwarf to be identified, Sirius B, was predicted by Bessel as a companion of Sirius.

Mass: $\approx M_\odot = 1.99 \times 10^{33}$ g
Radius: $0.01\, R_\odot$, $R_\odot = 7 \times 10^{10}$ cm
Density: $\approx 10^7 \rho_\odot = 10^7$ g/cm$^3$, $\rho_\odot = 1$ g/cm$^3$; $\rho_{\rm Sirius\ B} \approx 0.69 \times 10^5$ g/cm$^3$
Central temperature: $\approx 10^7$ K $\approx T_\odot$

White dwarfs consist of ionized nuclei and free electrons. Helium can still be burned in white dwarfs. The Fermi temperature is $T_F \approx 3 \cdot 10^9$ K, so that the electron gas is highly degenerate. The high zero-point pressure of the electron gas opposes the gravitational attraction of the nuclei which compresses the star. The electrons can in fact be regarded as free; their Coulomb repulsion is negligible at these high pressures.

∗e) The Landau Theory of Fermi Liquids

The characteristic temperature dependences found for ideal Fermi gases at low temperatures remain in effect in the presence of interactions. This is the result of Landau’s Fermi liquid theory, which is based on physical arguments that can also be justified in terms of microscopic quantum-mechanical many-body theory. We give only a sketch of this theory, including its essential results, and refer the reader to more detailed literature¹⁴. One first considers the ground state of the ideal Fermi gas, and the ground state with an additional particle (of momentum $\mathbf{p}$); then the interaction is ‘switched on’. The ideal ground state becomes a modified ground state and the state with the additional particle becomes the modified ground state plus an excited quantum (a quasiparticle of momentum $\mathbf{p}$). The energy of the quantum, $\varepsilon(\mathbf{p})$, is shifted relative to $\varepsilon_0(\mathbf{p}) \equiv p^2/2m$. Since every non-interacting single-particle state is only singly occupied, there are also no multiply-occupied quasiparticle states; i.e. the quasiparticles also obey Fermi–Dirac statistics. When several quasiparticles are excited, their energy also depends upon the number $\delta n(\mathbf{p}')$ of the other excitations:

$$\varepsilon(\mathbf{p}) = \varepsilon_0(\mathbf{p}) + \sum_{\mathbf{p}'} F(\mathbf{p}, \mathbf{p}')\, \delta n(\mathbf{p}') . \tag{4.3.26}$$

¹³ An often-used classification of the stars in astronomy is based on their positions in the Hertzsprung–Russell diagram, in which their magnitudes are plotted against their colors (equivalent to their surface temperatures). Most stars lie on the so-called main sequence. These stars have masses ranging from about one tenth of the Sun’s mass up to sixty solar masses and are in the evolutionary stages in which hydrogen is converted to helium by nuclear fusion (‘burning’). During about 90% of their evolution, the stars stay on the main sequence, as long as nuclear fusion and gravitational attraction are in balance. When the fusion processes come to an end as their ‘fuel’ is exhausted, gravitational forces become predominant. In their further evolution, the stars become red giants and finally contract to one of the following end stages: in stars with less than 1.4 solar masses, the compression process is brought to a halt by the increase of the Fermi energy of the electrons, and a white dwarf is formed, consisting mainly of helium and electrons. Stars with two or three solar masses end their contraction, after passing through intermediate phases, as neutron stars. Above three or four solar masses, the Fermi energy of the neutrons is no longer able to stop the compression process, and a black hole results.

¹⁴ A detailed description of Landau’s Fermi liquid theory can be found in D. Pines and P. Nozières, The Theory of Quantum Liquids, W. A. Benjamin, New York 1966, as well as in J. Wilks, The Properties of Liquid and Solid Helium, Clarendon Press, Oxford, 1967. See also J. Wilks and D. S. Betts, An Introduction to Liquid Helium, Oxford University Press, 2nd ed., Oxford, (1987).

The average occupation number takes a similar form to that of ideal fermions, owing to the fermionic character of the quasiparticles:

$$n_{\mathbf{p}} = \frac{1}{e^{(\varepsilon(\mathbf{p}) - \mu)/kT} + 1} , \tag{4.3.27}$$

where, according to (4.3.26), $\varepsilon(\mathbf{p})$ itself depends on the occupation number. This relation is usually derived in the present context by maximizing the entropy expression found in problem 4.2, which can be obtained from purely combinatorial considerations. At low temperatures, the quasiparticles are excited only near the Fermi energy, and due to the occupied states and energy conservation, the phase space for scattering processes is severely limited. Although the interactions are by no means necessarily weak, the scattering rate vanishes with temperature as $\frac{1}{\tau} \sim T^2$, i.e. the quasiparticles are practically stable particles. The interaction between the quasiparticles can be written in the form

$$F(\mathbf{p}, \boldsymbol{\sigma}; \mathbf{p}', \boldsymbol{\sigma}') = f^s(\mathbf{p}, \mathbf{p}') + \boldsymbol{\sigma} \cdot \boldsymbol{\sigma}'\, f^a(\mathbf{p}, \mathbf{p}') \tag{4.3.28a}$$

with the Pauli spin matrices $\boldsymbol{\sigma}$. Since only momenta in the neighborhood of the Fermi momentum contribute, we introduce

$$f^{s,a}(\mathbf{p}, \mathbf{p}') = f^{s,a}(\chi) \tag{4.3.28b}$$

and

$$F^{s,a}(\chi) = \nu(\varepsilon_F)\, f^{s,a}(\chi) = \frac{V m^* p_F}{\pi^2 \hbar^3}\, f^{s,a}(\chi) , \tag{4.3.28c}$$

where $\chi$ is the angle between $\mathbf{p}$ and $\mathbf{p}'$ and $\nu(\varepsilon_F)$ is the density of states. A series expansion in terms of Legendre polynomials leads to

$$F^{s,a}(\chi) = \sum_l F_l^{s,a} P_l(\cos\chi) = F_0^{s,a} + F_1^{s,a} \cos\chi + \dots . \tag{4.3.28d}$$

The $F_l^s$ and $F_l^a$ are the spin-symmetric and spin-antisymmetric Landau parameters; the $F_l^a$ result from the exchange interaction.


Due to the Fermi character of the quasiparticles, which at low temperatures can be excited only near the Fermi energy, it is clear from the qualitative estimate (4.3.16′) that the specific heat of the Fermi liquid will also have a linear temperature dependence. In detail, one obtains for the specific heat, the compressibility, and the magnetic susceptibility:

$$C_V = \frac{1}{3} \pi^2\, \nu(\varepsilon_F)\, k^2 T , \tag{4.3.29a}$$

$$\kappa_T = \frac{V}{N^2}\, \frac{\nu(\varepsilon_F)}{1 + F_0^s} , \tag{4.3.29b}$$

$$\chi = \mu_B^2\, \frac{\nu(\varepsilon_F)\, N}{1 + F_0^a} , \tag{4.3.29c}$$

with the density of states $\nu(\varepsilon_F) = \frac{V m^* p_F}{\pi^2 \hbar^3}$ and the effective mass ratio

$$\frac{m^*}{m} = 1 + \frac{1}{3} F_1^s . \tag{4.3.29d}$$

The structure of the results is the same as for ideal fermions.

4.4 The Bose–Einstein Condensation

In this section, we investigate the low-temperature behavior of a nonrelativistic ideal Bose gas of spin $s = 0$, i.e. $g = 1$ and

$$\varepsilon_p = \frac{p^2}{2m} . \tag{4.4.1}$$

In their ground state, non-interacting bosons all occupy the energetically lowest single-particle state; their low-temperature behavior is therefore quite different from that of fermions. Between the high-temperature phase, where the bosons are distributed over the whole spectrum of momentum values, corresponding to the Bose distribution function, and the phase in which the ($\mathbf{p} = 0$) state is macroscopically occupied (at $T = 0$, all the particles are in this state), a phase transition takes place. This so-called Bose–Einstein condensation of an ideal Bose gas was predicted by Einstein¹⁵ on the basis of the statistical considerations of Bose, nearly seventy years before it was observed experimentally.

We first refer to the results of Sect. 4.1, where we found for the particle density, i.e. for the reciprocal of the specific volume, in Eq. (4.1.17):

$$\frac{\lambda^3}{v} = g_{3/2}(z) \tag{4.4.2a}$$

¹⁵ A. Einstein, Sitzber. Kgl. Preuss. Akad. Wiss. 1924, 261, (1924), ibid. 1925, 3 (1925); S. Bose, Z. Phys. 26, 178 (1924)


with $\lambda = \hbar \sqrt{2\pi / mkT}$ and, using (4.2.1),

$$g_{3/2}(z) = \frac{2}{\sqrt{\pi}} \int_0^\infty dx\, \frac{x^{1/2}}{e^x z^{-1} - 1} = \sum_{k=1}^\infty \frac{z^k}{k^{3/2}} . \tag{4.4.2b}$$

According to Remark (i) in Sect. 4.1, the fugacity of bosons $z = e^{\mu/kT}$ is limited to $z \le 1$. The maximum value of the function $g_{3/2}(z)$, which is shown in Fig. 4.11, is then given by $g_{3/2}(1) = \zeta(3/2) = 2.612$.

Fig. 4.11. The function g3/2 (z).

Fig. 4.12. The fugacity z as a function of v/λ3

In the following, we take the particle number and the volume, and thus the specific volume $v$, to be fixed at given values. Then from Eq. (4.4.2a), we can calculate $z$ as a function of $T$, or, more expediently, of $v\lambda^{-3}$. On lowering the temperature, $\frac{v}{\lambda^3}$ decreases and $z$ therefore increases, until finally at $\frac{v}{\lambda^3} = \frac{1}{2.612}$ it reaches its maximum value $z = 1$ (Fig. 4.12). This defines a characteristic temperature

$$k T_c(v) = \frac{2\pi\hbar^2 / m}{(2.612\, v)^{2/3}} . \tag{4.4.3}$$
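To get a feeling for the magnitude of (4.4.3), the sketch below evaluates $T_c(v)$ in CGS units. The function name and the physical constants are standard values supplied here for illustration. The example input is liquid $^4$He, for which an atomic mass of $6.65 \times 10^{-24}$ g and a mass density of about 0.145 g/cm$^3$ are assumed; these two numbers are not given in this section and should be treated as approximate.

```python
import math

HBAR = 1.054572e-27    # erg s
K_B = 1.380649e-16     # erg / K
ZETA_3_2 = 2.612       # g_{3/2}(1), value quoted in the text

def bec_temperature(m_g, n_cm3):
    """Characteristic temperature T_c(v) of Eq. (4.4.3) in kelvin.

    m_g   : particle mass in grams
    n_cm3 : particle density 1/v in cm^-3
    """
    v = 1.0 / n_cm3
    return (2.0 * math.pi * HBAR**2 / m_g) / (ZETA_3_2 * v) ** (2.0 / 3.0) / K_B

# Assumed input for liquid 4He: mass 6.65e-24 g, mass density 0.145 g/cm^3
m_he4 = 6.65e-24
n_he4 = 0.145 / m_he4
t_c_he4 = bec_temperature(m_he4, n_he4)
print(t_c_he4)
```

With these inputs the result is a little above 3 K. Since $T_c \propto n^{2/3}/m$, doubling the density raises $T_c$ by a factor $2^{2/3} \approx 1.6$.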

When $z$ approaches 1, we must be more careful in taking the limit $\sum_{\mathbf{p}} \to \frac{V}{(2\pi\hbar)^3} \int d^3p$ used in (4.1.14a) and (4.1.15). This is also indicated by the fact that (4.4.2a) would imply for $z = 1$ that at temperatures below $T_c(v)$, the density $\frac{1}{v}$ must decrease with decreasing temperature. From (4.4.2a), there would appear to no longer be enough space for all the particles. Clearly, we have to treat the ($\mathbf{p} = 0$) term in the sum in (4.1.8), which diverges for $z \to 1$, separately:

$$N = \frac{1}{z^{-1} - 1} + \sum_{\mathbf{p} \neq 0} n(\varepsilon_p) = \frac{1}{z^{-1} - 1} + \frac{V}{(2\pi\hbar)^3} \int d^3p\; n(\varepsilon_p) .$$

The p = 0 state for fermions did not require any special treatment, since the average occupation numbers can have at most the value 1. Even for bosons, this modiﬁcation is important only at T < Tc (v) and leads at T = 0 to the complete occupation of the p = 0 state, in agreement with the ground state which we described above.


We thus obtain for bosons, instead of (4.4.2a):

$$N = \frac{1}{z^{-1} - 1} + N\, \frac{v}{\lambda^3}\, g_{3/2}(z) , \tag{4.4.4}$$

or, using Eq. (4.4.3),

$$N = \frac{1}{z^{-1} - 1} + N \left( \frac{T}{T_c(v)} \right)^{3/2} \frac{g_{3/2}(z)}{g_{3/2}(1)} . \tag{4.4.4'}$$

The overall particle number $N$ is thus the sum of the number of particles in the ground state

$$N_0 = \frac{1}{z^{-1} - 1} \tag{4.4.5a}$$

and the number in the excited states

$$N' = N \left( \frac{T}{T_c(v)} \right)^{3/2} \frac{g_{3/2}(z)}{g_{3/2}(1)} . \tag{4.4.5b}$$

For $T > T_c(v)$, Eq. (4.4.4′) yields a value for $z$ of $z < 1$. The first term on the right-hand side of (4.4.4′) is therefore finite and can be neglected relative to $N$. Our initial considerations thus hold here; in particular, $z$ follows from

$$g_{3/2}(z) = 2.612 \left( \frac{T_c(v)}{T} \right)^{3/2} \quad \text{for } T > T_c(v) . \tag{4.4.5c}$$

For $T < T_c(v)$, from Eq. (4.4.4′), $z = 1 - O(1/N)$, so that all of the particles which are no longer in excited states can find sufficient ‘space’ to enter the ground state. When $z$ is so close to 1, we can set $z = 1$ in the second term and obtain

$$N_0 = N \left[ 1 - \left( \frac{T}{T_c(v)} \right)^{3/2} \right] .$$

Defining the condensate fraction in the thermodynamic limit by

$$\nu_0 = \lim_{\substack{N \to \infty \\ v\ \text{fixed}}} \frac{N_0}{N} , \tag{4.4.6}$$

we find in summary

$$\nu_0 = \begin{cases} 0 & T > T_c(v) \\[4pt] 1 - \left( \dfrac{T}{T_c(v)} \right)^{3/2} & T < T_c(v) . \end{cases} \tag{4.4.7}$$
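Equations (4.4.5c) and (4.4.7) together fix the state of the gas at any temperature for given $v$. The following sketch, with hypothetical function names, evaluates the condensate fraction below $T_c$ and solves (4.4.5c) for the fugacity above $T_c$ by bisection, summing the series (4.4.2b) term by term. The series converges slowly as $z \to 1$, so this simple version is meant for temperatures clearly above $T_c$ (say $T \gtrsim 1.2\, T_c$).

```python
ZETA_3_2 = 2.612   # g_{3/2}(1), value quoted in the text

def g_series(s, z, tol=1e-14, max_terms=100000):
    """Sum of z^k / k^s for k = 1, 2, ... (Eq. (4.4.2b) for s = 3/2), 0 <= z < 1."""
    total = 0.0
    z_power = 1.0
    for k in range(1, max_terms + 1):
        z_power *= z
        term = z_power / k**s
        total += term
        if term < tol:
            break
    return total

def condensate_fraction(t):
    """nu_0 of Eq. (4.4.7) as a function of t = T / T_c(v)."""
    return 1.0 - t**1.5 if t < 1.0 else 0.0

def fugacity_above_tc(t):
    """Solve g_{3/2}(z) = 2.612 * t^(-3/2), Eq. (4.4.5c), for z by bisection (t > 1)."""
    target = ZETA_3_2 / t**1.5
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if g_series(1.5, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

z_at_2tc = fugacity_above_tc(2.0)
print(z_at_2tc, condensate_fraction(0.5))
```

Below $T_c$ one sets $z = 1$ and reads the occupation of the ground state from `condensate_fraction`; above $T_c$ the fugacity drops below 1 and the condensate fraction vanishes.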

This phenomenon is called the Bose–Einstein condensation. Below $T_c(v)$, the ground state $\mathbf{p} = 0$ is macroscopically occupied. The temperature dependence of $\nu_0$ and $\sqrt{\nu_0}$ is shown in Fig. 4.13. The quantities $\nu_0$ and $\sqrt{\nu_0}$ are characteristic of the condensation or the ordering of the system. For reasons which will become clear later, one refers to $\sqrt{\nu_0}$ as the order parameter. In the neighborhood of $T_c$, $\sqrt{\nu_0}$ goes to zero as

$$\sqrt{\nu_0} \propto \sqrt{T_c - T} . \tag{4.4.7'}$$

In Fig. 4.14, we show the transition temperature as a function of the specific volume. The higher the density (i.e. the smaller the specific volume), the higher the transition temperature $T_c(v)$ at which the Bose–Einstein condensation takes place.

Fig. 4.13. The relative number of particles in the condensate and its square root as functions of the temperature

Fig. 4.14. The transition temperature as a function of the specific volume

Remark: One might ask whether the next higher terms in the sum $\sum_{\mathbf{p}} n(\varepsilon_p)$ could not also be macroscopically occupied. The following estimate however shows that $n(\varepsilon_p) \ll n(0)$ for $\mathbf{p} \neq 0$. Consider e.g. the momentum $\mathbf{p} = \left( \frac{2\pi\hbar}{L}, 0, 0 \right)$, for which

$$\frac{1}{V}\, \frac{1}{e^{\beta p_1^2/2m} z^{-1} - 1} < \frac{1}{V}\, \frac{1}{e^{\beta p_1^2/2m} - 1} < \frac{1}{V}\, \frac{2m}{\beta p_1^2} \sim O(V^{-1/3})$$

holds, while

$$\frac{1}{V}\, \frac{1}{z^{-1} - 1} \sim O(1) .$$

There is no change in the grand potential compared to the integral representation (4.1.19′), since for the term with $\mathbf{p} = 0$ in the thermodynamic limit, it follows that

$$\lim_{V \to \infty} \frac{1}{V} \log(1 - z(V)) = \lim_{V \to \infty} \frac{1}{V} \log \frac{1}{V} = 0 .$$

Therefore, the pressure is given by (4.1.19′) as before, where $z$ for $T > T_c(v)$ follows from (4.4.5c), and for $T < T_c(v)$ it is given by $z = 1$. Thus finally the pressure of the ideal Bose gas is

$$P = \begin{cases} \dfrac{kT}{\lambda^3}\, g_{5/2}(z) & T > T_c \\[8pt] \dfrac{kT}{\lambda^3}\, 1.342 & T < T_c , \end{cases} \tag{4.4.8}$$


Fig. 4.15. The functions g3/2 (z) and g5/2 (z). In the limit z → 0, the functions become asymptotically identical, g3/2 (z) ≈ g5/2 (z) ≈ z.

Fig. 4.16. The equation of state of the ideal Bose gas. The isochores are shown for decreasing values of $v$. For $T < T_c(v)$, the pressure is $P = \frac{kT}{\lambda^3}\, 1.342$.

with $g_{5/2}(1) = \zeta\!\left( \frac{5}{2} \right) = 1.342$. If we insert $z$ from (4.4.4) here, we obtain the equation of state. For $T > T_c$, using (4.4.5c), we can write (4.4.8) in the form

$$P = \frac{kT}{v}\, \frac{g_{5/2}(z)}{g_{3/2}(z)} . \tag{4.4.9}$$
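The deviation from the classical equation of state is contained entirely in the ratio $g_{5/2}(z)/g_{3/2}(z) = Pv/kT$ of (4.4.9). The sketch below, with hypothetical function names, evaluates this ratio from the power series for $z < 1$ and uses the values 1.342 and 2.612 quoted in the text for its limit at the transition.

```python
def g_series(s, z, tol=1e-14, max_terms=100000):
    """Sum of z^k / k^s for k = 1, 2, ..., valid for 0 <= z < 1."""
    total = 0.0
    z_power = 1.0
    for k in range(1, max_terms + 1):
        z_power *= z
        term = z_power / k**s
        total += term
        if term < tol:
            break
    return total

def pv_over_kt(z):
    """P v / kT = g_{5/2}(z) / g_{3/2}(z) from Eq. (4.4.9), for 0 < z < 1."""
    return g_series(2.5, z) / g_series(1.5, z)

# Limit at T = T_c(v), using g_{5/2}(1) = 1.342 and g_{3/2}(1) = 2.612
PV_OVER_KT_AT_TC = 1.342 / 2.612

for z in (0.001, 0.1, 0.5, 0.9):
    print(z, pv_over_kt(z))
print(PV_OVER_KT_AT_TC)
```

For $z \to 0$ the ratio tends to 1 (classical ideal gas), and it decreases monotonically to roughly one half at the transition, so the Bose gas exerts less pressure than a classical gas at the same $T$ and $v$.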

The functions $g_{5/2}(z)$ and $g_{3/2}(z)$ are drawn in Fig. 4.15. The shape of the equation of state can be qualitatively seen from them. For small values of $z$, $g_{5/2}(z) \approx g_{3/2}(z)$, so that for large $v$ and high $T$, we obtain again from (4.4.9) the classical equation of state (see Fig. 4.16). On approaching $T_c(v)$, it becomes increasingly noticeable that $g_{5/2}(z) < g_{3/2}(z)$. At $T_c(v)$, the isochores converge into the curve $P = \frac{kT}{\lambda^3}\, 1.342$, which represents the pressure for $T < T_c(v)$. Altogether, this leads to the equation of state corresponding to the isochores in Fig. 4.16. For the entropy, we find¹⁶

$$S = \left( \frac{\partial (PV)}{\partial T} \right)_{V,\mu} = \begin{cases} N k \left[ \dfrac{5}{2}\, \dfrac{v}{\lambda^3}\, g_{5/2}(z) - \log z \right] & T > T_c \\[10pt] N k\, \dfrac{5}{2}\, \dfrac{g_{5/2}(1)}{g_{3/2}(1)} \left( \dfrac{T}{T_c} \right)^{3/2} & T < T_c , \end{cases} \tag{4.4.10}$$

¹⁶ Note that $\frac{d}{dz}\, g_\nu(z) = \frac{1}{z}\, g_{\nu-1}(z)$.


Fig. 4.17. The heat capacity = N × the speciﬁc heat of an ideal Bose gas

Fig. 4.18. The chemical potential of the ideal Bose gas at a fixed density as a function of the temperature (axes: $\mu/kT_c(v)$ versus $T/T_c$)

and, after some calculation, we obtain for the heat capacity at constant volume

$$C_V = T \left( \frac{\partial S}{\partial T} \right)_{N,V} = N k \begin{cases} \dfrac{15}{4}\, \dfrac{v}{\lambda^3}\, g_{5/2}(z) - \dfrac{9}{4}\, \dfrac{g_{3/2}(z)}{g_{1/2}(z)} & T > T_c \\[10pt] \dfrac{15}{4}\, \dfrac{g_{5/2}(1)}{g_{3/2}(1)} \left( \dfrac{T}{T_c} \right)^{3/2} & T < T_c . \end{cases} \tag{4.4.11}$$

The entropy and the specific heat vary as $T^{3/2}$ at low $T$. Only the excited states contribute to the entropy and the internal energy; the entropy of the condensate is zero. At $T = T_c$, the specific heat of the ideal Bose gas has a cusp (Fig. 4.17). From Eq. (4.4.4) or from Fig. 4.12, one can obtain the chemical potential, shown in Fig. 4.18 as a function of the temperature.

At $T_\lambda = 2.18$ K, the so-called lambda point, $^4$He exhibits a phase transition into the superfluid state (see Fig. 4.19). If we could neglect the interactions of the helium atoms, the temperature of a Bose–Einstein condensation would be $T_c(v) = 3.14$ K, using the specific volume of helium in (4.4.3). The interactions are however very important, and it would be incorrect to identify the phase transition into the superfluid state with the Bose–Einstein condensation treated above. The superfluid state in three-dimensional helium is indeed also created by a condensation (macroscopic occupation) of the $\mathbf{p} = 0$ state, but at $T = 0$, the fraction of condensate is only 8%. The specific heat (Fig. 4.20) exhibits a λ anomaly (which gives the transition its name), i.e. an approximately logarithmic singularity. The typical excitation spectrum and the hydrodynamic behavior as described by the two-fluid model are compatible only with an interacting Bose system (Sect. 4.7.1).

Fig. 4.19. The phase diagram of $^4$He (schematic). Below 2.18 K, a phase transition from the normal liquid He I phase into the superfluid He II phase takes place

Fig. 4.20. The experimental specific heat of $^4$He, showing the characteristic lambda anomaly

Another Bose gas, which is more ideal than helium and in which one can likewise expect a Bose–Einstein condensation (which has been extensively searched for experimentally), is atomic hydrogen in a strong magnetic field (the spin polarization of the hydrogen electrons prevents recombination to molecular H$_2$). Because of the difficulty of suppressing recombination of H to H$_2$, it however proved impossible over a period of many years to prepare atomic hydrogen at a sufficient density. The development of atom traps has recently permitted remarkable progress in this area.

The Bose–Einstein condensation was first observed, 70 years after its original prediction, in a gas consisting of around 2000 spin-polarized $^{87}$Rb atoms, which were enclosed in a quadrupole trap.¹⁷,¹⁸ The transition temperature is at $170 \times 10^{-9}$ K. One might at first raise the objection that at low temperatures the alkali atoms should form a solid; however, a metastable gaseous state can be maintained within the trap even at temperatures in the nanokelvin range. In the initial experiments, the condensed state could be kept for about ten seconds. Similar results were obtained with a gas consisting of $2 \times 10^5$

¹⁷ M. H. Anderson, J. R. Ensher, M. R. Matthews, C. E. Wieman, and E. A. Cornell, Science 269, 198 (1995)

¹⁸ See also G. P. Collins, Physics Today, August 1995, 17.


spin-polarized $^7$Li atoms.¹⁹ In this case, the condensation temperature is $T_c \approx 400 \times 10^{-9}$ K. In $^{87}$Rb, the s-wave scattering length is positive, while in $^7$Li, it is negative. However, even in $^7$Li, the gas phase does not collapse into a condensed phase, in any case not within the spatially inhomogeneous atom trap.¹⁹ Finally, it also proved possible to produce and maintain a condensate containing more than $10^8$ atoms of atomic hydrogen, with a transition temperature of about 50 µK, for up to 5 seconds.²⁰

4.5 The Photon Gas

4.5.1 Properties of Photons

We next want to determine the thermal properties of the radiation field. To start with, we list some of the characteristic properties of photons.

(i) Photons obey the dispersion relation $\varepsilon_p = c|\mathbf{p}| = \hbar c k$ and are bosons with a spin $s = 1$. Since they are completely relativistic particles ($m = 0$, $v = c$), their spins have only two possible orientations, i.e. parallel or antiparallel to $\mathbf{p}$, corresponding to right-hand or left-hand circularly polarized light (0 and π are the only angles which are Lorentz invariant). The degeneracy factor for photons is therefore $g = 2$.

(ii) The mutual interactions of photons are practically zero, as one can see from the following argument: to lowest order, the interaction consists of the scattering of two photons $\gamma_1$ and $\gamma_2$ into the final states $\gamma_3$ and $\gamma_4$; see Fig. 4.21a. In this process, for example, photon $\gamma_1$ decays into a virtual electron-positron pair, photon $\gamma_2$ is absorbed by the positron, the electron emits photon $\gamma_3$ and recombines with the positron to give photon $\gamma_4$. The scattering cross-section for this process is extremely small, of order $\sigma \approx 10^{-50}$ cm$^2$. The mean collision time can be calculated from the scattering cross-section as follows: in the time $\Delta t$, a photon traverses the distance $c\Delta t$. We thus consider the cylinder shown in Fig. 4.21b, whose basal area is equal to the scattering cross-section and whose length is the velocity of light × $\Delta t$. A photon interacts within the time $\Delta t$ with all other photons which are in the volume $c\, \sigma\, \Delta t$, roughly speaking. Let $N$ be the total number of photons within the volume $V$ (which depends on the temperature and which we still have to determine; see the end of Sect. 4.5.4). Then a photon interacts with $c\, \sigma\, N/V$ particles per unit time. Thus the mean collision time (time between two collisions on average) $\tau$ is determined by

$$\tau = \frac{(V/N)}{c\, \sigma} = 10^{40}\, \frac{\mathrm{sec}}{\mathrm{cm}^3}\, \frac{V}{N} .$$

¹⁹ C. C. Bradley, C. A. Sackett, J. J. Tollett, and R. G. Hulet, Phys. Rev. Lett. 75, 1687 (1995)

²⁰ D. Kleppner, Th. Greytak et al., Phys. Rev. Lett. 81, 3811 (1998)


4. Ideal Quantum Gases

Fig. 4.21. (a) Photon-photon scattering (dashed lines: photons; solid lines: electron and positron). (b) Scattering cross-section and mean collision time

The value of the mean collision time is approximately $\tau \approx 10^{31}$ sec at room temperature and $\tau \approx 10^{18}$ sec at the temperature of the Sun's interior ($10^{7}$ K). Even at the temperature in the center of the Sun, the interaction of the photons is negligible. In comparison, the age of the Universe is $\sim 10^{17}$ sec. Photons do indeed constitute an ideal quantum gas. The interaction with the surrounding matter is crucial in order to establish equilibrium within the radiation field. The establishment of equilibrium in the photon gas is brought about by absorption and emission of photons by matter. In the following, we will investigate the radiation field within a cavity of volume $V$ and temperature $T$, and without loss of generality of our considerations, we take the quantization volume to be cubical in shape (the shape is irrelevant for short wavelengths, and the long waves have a low statistical weight).

(iii) The number of photons is not conserved. Photons are emitted and absorbed by the material of the cavity walls. From the quantum-field description of photons it follows that each wavenumber and polarization direction corresponds to a harmonic oscillator. The Hamiltonian thus has the form

$$H = \sum_{\mathbf p,\lambda} \varepsilon_{\mathbf p}\,\hat n_{\mathbf p,\lambda} \equiv \sum_{\mathbf p,\lambda} \varepsilon_{\mathbf p}\, a^{\dagger}_{\mathbf p,\lambda} a_{\mathbf p,\lambda} , \qquad \mathbf p \neq 0 , \tag{4.5.1}$$

where $\hat n_{\mathbf p,\lambda} = a^{\dagger}_{\mathbf p,\lambda} a_{\mathbf p,\lambda}$ is the occupation number operator for the momentum $\mathbf p$ and the direction of polarization $\lambda$; also, $a^{\dagger}_{\mathbf p,\lambda}$, $a_{\mathbf p,\lambda}$ are the creation and annihilation operators for a photon in the state $\mathbf p, \lambda$. We note that in the Hamiltonian of the radiation field, there is no zero-point energy, which is automatically accomplished in quantum field theory by defining the Hamiltonian in terms of normal-ordered products.21

21 C. Itzykson, J.-B. Zuber, Quantum Field Theory, McGraw-Hill; see also QM II.
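The order-of-magnitude estimate of the mean collision time in point (ii) can be reproduced with a few lines of Python. This is only a sketch: the function name is hypothetical, the prefactor $10^{40}$ sec cm$^{-3}$ is the rounded value of $1/(c\sigma)$ used in the text (the unrounded value for $\sigma = 10^{-50}$ cm$^2$ is about $3\times10^{39}$ sec cm$^{-3}$), and the photon densities are the ones quoted in Sect. 4.5.4.

```python
# Rough estimate of the photon-photon mean collision time tau = (V/N) / (c * sigma).
# Assumption: the rounded prefactor 1/(c*sigma) = 1e40 sec/cm^3 from the text is used.

ROUNDED_PREFACTOR = 1.0e40  # sec / cm^3, rounded value of 1/(c*sigma) quoted in the text


def mean_collision_time(photon_density_cm3, prefactor=ROUNDED_PREFACTOR):
    """Return the mean collision time in seconds for a photon density N/V in cm^-3."""
    return prefactor / photon_density_cm3


# Photon densities N/V quoted in Sect. 4.5.4 (cm^-3).
DENSITY_ROOM_TEMPERATURE = 5.5e8   # T = 300 K
DENSITY_SOLAR_INTERIOR = 2.0e22    # T = 1e7 K

if __name__ == "__main__":
    for label, density in [("300 K", DENSITY_ROOM_TEMPERATURE),
                           ("1e7 K", DENSITY_SOLAR_INTERIOR)]:
        print(f"T = {label}: tau is of order {mean_collision_time(density):.0e} sec")
```

Both values come out far longer than any laboratory time scale, which is the sense in which the photon gas is ideal.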


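4.5.2 The Canonical Partition Function

The canonical partition function is given by ($n_{\mathbf p,\lambda} = 0, 1, 2, \ldots$):

$$Z = \mathrm{Tr}\, e^{-\beta H} = \sum_{\{n_{\mathbf p,\lambda}\}} e^{-\beta \sum_{\mathbf p,\lambda} \varepsilon_{\mathbf p} n_{\mathbf p,\lambda}} = \left[ \prod_{\mathbf p \neq 0} \frac{1}{1 - e^{-\beta \varepsilon_{\mathbf p}}} \right]^{2} . \tag{4.5.2}$$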

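Here, there is no condition on the number of photons, since it is not fixed. In (4.5.2), the power 2 enters due to the two possible polarizations $\lambda$. With this expression, we find for the free energy

$$F(T,V) = -kT \log Z = 2kT \sum_{\mathbf p \neq 0} \log\left(1 - e^{-\varepsilon_{\mathbf p}/kT}\right) = \frac{2V}{\beta} \int \frac{d^3p}{(2\pi\hbar)^3} \log\left(1 - e^{-\beta\varepsilon_{\mathbf p}}\right) = \frac{V (kT)^4}{\pi^2 (\hbar c)^3} \int_0^\infty dx\, x^2 \log\left(1 - e^{-x}\right) . \tag{4.5.3}$$

The sum has been converted to an integral according to (4.1.14a). For the integral in (4.5.3), we find after integration by parts

$$\int_0^\infty dx\, x^2 \log\left(1 - e^{-x}\right) = -\frac{1}{3} \int_0^\infty \frac{dx\, x^3}{e^x - 1} = -2 \sum_{n=1}^{\infty} \frac{1}{n^4} \equiv -2\zeta(4) = -\frac{\pi^4}{45} ,$$

where $\zeta(n)$ is Riemann's ζ-function (Eqs. (D.2) and (D.3)), so that for $F$, we have finally

$$F(T,V) = -\frac{V (kT)^4}{(\hbar c)^3} \frac{\pi^2}{45} = -\frac{4\sigma}{3c} V T^4 \tag{4.5.4}$$

with the Stefan–Boltzmann constant

$$\sigma \equiv \frac{\pi^2 k^4}{60 \hbar^3 c^2} = 5.67 \times 10^{-8}\ \mathrm{J\,sec^{-1}\,m^{-2}\,K^{-4}} . \tag{4.5.5}$$

From (4.5.4), we obtain the entropy:

$$S = -\left(\frac{\partial F}{\partial T}\right)_V = \frac{16\sigma}{3c} V T^3 , \tag{4.5.6a}$$

the internal energy (caloric equation of state)

$$E = F + TS = \frac{4\sigma}{c} V T^4 , \tag{4.5.6b}$$

and the pressure (thermal equation of state)

$$P = -\left(\frac{\partial F}{\partial V}\right)_T = \frac{4\sigma}{3c} T^4 , \tag{4.5.6c}$$

and finally the heat capacity

$$C_V = T \left(\frac{\partial S}{\partial T}\right)_V = \frac{16\sigma}{c} V T^3 . \tag{4.5.7}$$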
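As a numerical cross-check of Eqs. (4.5.5), (4.5.6b) and (4.5.6c), the following sketch evaluates the Stefan–Boltzmann constant from the fundamental constants and the radiation pressure at two temperatures. The function names are hypothetical, and the constants are the SI values of $k$, $\hbar$ and $c$ entered by hand.

```python
import math

# SI values of the fundamental constants (entered by hand for this sketch).
K_B = 1.380649e-23       # Boltzmann constant, J/K
HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 2.99792458e8         # speed of light, m/s


def stefan_boltzmann_constant():
    """sigma = pi^2 k^4 / (60 hbar^3 c^2), Eq. (4.5.5), in J s^-1 m^-2 K^-4."""
    return math.pi**2 * K_B**4 / (60.0 * HBAR**3 * C**2)


def radiation_pressure_pa(temperature_k):
    """P = 4 sigma T^4 / (3 c), Eq. (4.5.6c), in pascal."""
    return 4.0 * stefan_boltzmann_constant() * temperature_k**4 / (3.0 * C)


def energy_density(temperature_k):
    """E/V = 4 sigma T^4 / c, Eq. (4.5.6b), in J/m^3."""
    return 4.0 * stefan_boltzmann_constant() * temperature_k**4 / C


if __name__ == "__main__":
    print(f"sigma = {stefan_boltzmann_constant():.3e} J s^-1 m^-2 K^-4")
    for temperature in (1.0e5, 1.0e7):
        pressure_bar = radiation_pressure_pa(temperature) / 1.0e5  # 1 bar = 1e5 Pa
        print(f"T = {temperature:.0e} K: P = {pressure_bar:.3g} bar")
```

The ratio of `energy_density(T)` to `radiation_pressure_pa(T)` is 3 for every temperature, which is the relation $E = 3PV$ of the relativistic gas.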

Because of the relativistic dispersion, for photons $E = 3PV$ holds instead of $E = \frac{3}{2} PV$. Eq. (4.5.6b) is called the Stefan–Boltzmann law: the internal energy of the radiation field increases as the fourth power of the temperature. The radiation pressure (4.5.6c) is very low, except at extremely high temperatures. At $10^{5}$ K, the temperature produced by a nuclear explosion, it is $P = 0.25$ bar, and at $10^{7}$ K, the Sun's central temperature, it is $P = 25 \times 10^{6}$ bar.

4.5.3 Planck's Radiation Law

We now wish to discuss some of the characteristics of the radiation field. The average occupation number of the state $(\mathbf p, \lambda)$ is given by

$$\langle n_{\mathbf p,\lambda} \rangle = \frac{1}{e^{\varepsilon_{\mathbf p}/kT} - 1} \tag{4.5.8a}$$

with $\varepsilon_{\mathbf p} = \hbar\omega_{\mathbf p} = cp$, since

$$\langle n_{\mathbf p,\lambda} \rangle \equiv \frac{\mathrm{Tr}\, e^{-\beta H} \hat n_{\mathbf p,\lambda}}{\mathrm{Tr}\, e^{-\beta H}} = \frac{\sum_{n_{\mathbf p,\lambda}=0}^{\infty} n_{\mathbf p,\lambda}\, e^{-n_{\mathbf p,\lambda} \varepsilon_{\mathbf p}/kT}}{\sum_{n_{\mathbf p,\lambda}=0}^{\infty} e^{-n_{\mathbf p,\lambda} \varepsilon_{\mathbf p}/kT}}$$

can be evaluated analogously to Eq. (4.1.9). The average occupation number (4.5.8a) corresponds to that of atomic or molecular free bosons, Eq. (4.1.9), with $\mu = 0$. The number of occupied states in a differential element $d^3p$ within a fixed volume is therefore (see (4.1.14a)):

$$\langle n_{\mathbf p,\lambda} \rangle \frac{2V\, d^3p}{(2\pi\hbar)^3} , \tag{4.5.8b}$$

and in the interval $[p, p + dp]$, it is

$$\langle n_{\mathbf p,\lambda} \rangle \frac{V p^2\, dp}{\pi^2 \hbar^3} . \tag{4.5.8c}$$


It follows from this that the number of occupied states in the interval $[\omega, \omega + d\omega]$ is equal to

$$\frac{V}{\pi^2 c^3} \frac{\omega^2\, d\omega}{e^{\hbar\omega/kT} - 1} . \tag{4.5.8d}$$

The spectral energy density $u(\omega)$ is defined as the energy per unit volume and frequency, i.e. as the product of (4.5.8d) with $\hbar\omega/V$:

$$u(\omega) = \frac{\hbar}{\pi^2 c^3} \frac{\omega^3}{e^{\hbar\omega/kT} - 1} . \tag{4.5.9}$$

This is the famous Planck radiation law (1900), which initiated the development of quantum mechanics. We now want to discuss these results in detail. The occupation number (4.5.8a) for photons diverges for $p \to 0$ as $1/p$ (see Fig. 4.22), since the energy of the photons goes to zero when $p \to 0$. Because the density of states in three dimensions is proportional to $\omega^2$, this divergence is irrelevant to the energy content of the radiation field. The spectral energy density is shown in Fig. 4.22.

Fig. 4.22. The photon number as a function of $\hbar\omega/kT$ (dot-dashed curve). The spectral energy density as a function of $\hbar\omega/kT$ (solid curve).

As a function of $\omega$, it shows a maximum at

$$\hbar\omega_{\max} = 2.82\, kT , \tag{4.5.10}$$

i.e. around three times the thermal energy. The maximum shifts proportionally to the temperature. Equation (4.5.10), Wien's displacement law (1893), played an important role in the historical development of the theory of the radiation field, leading to the discovery of Planck's quantum of action. In Fig. 4.23, we show $u(\omega, T)$ for different temperatures $T$.
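The numerical factor 2.82 in (4.5.10) is the positive root of $x = 3(1 - e^{-x})$ with $x = \hbar\omega/kT$, which follows from setting the derivative of $x^3/(e^x - 1)$ to zero. The same condition with 3 replaced by 5 gives the maximum of the energy density per unit wavelength, since that quantity is proportional to $x^5/(e^x - 1)$. A minimal sketch using fixed-point iteration (the function name is hypothetical):

```python
import math


def wien_root(power, iterations=100):
    """Positive root of x = power * (1 - exp(-x)).

    The maximum of x**power / (exp(x) - 1) satisfies this condition.
    Fixed-point iteration converges here because the slope of the
    right-hand side, power * exp(-x), is well below 1 near the root.
    """
    x = float(power)  # starting guess close to the root
    for _ in range(iterations):
        x = power * (1.0 - math.exp(-x))
    return x


if __name__ == "__main__":
    print(f"frequency maximum:  hbar*omega_max / kT = {wien_root(3):.4f}")
    print(f"wavelength maximum: 2*pi*hbar*c / (kT*lambda_max) = {wien_root(5):.4f}")
```

The two maxima do not correspond to the same photon energy, because the densities per unit frequency and per unit wavelength differ by the Jacobian $|d\omega/d\lambda|$.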


Fig. 4.23. Planck’s law for three temperatures, T1 < T2 < T3

We now consider the limiting cases of Planck's radiation law:

(i) $\hbar\omega \ll kT$: for low frequencies, we find using (4.5.9) that

$$u(\omega) = \frac{kT\, \omega^2}{\pi^2 c^3} ; \tag{4.5.11}$$

the Rayleigh–Jeans radiation law. This is the classical low-energy limit. This result of classical physics represented one of the principal problems in the theory of the radiation field. Aside from the fact that it agreed with experiment only for very low frequencies, it was also fundamentally unacceptable: for according to (4.5.11), in the high-frequency limit $\omega \to \infty$, it leads to a divergence in $u(\omega)$, the so called ultraviolet catastrophe. This would in turn imply an infinite energy content of the cavity radiation, $\int_0^\infty d\omega\, u(\omega) = \infty$.

(ii) $\hbar\omega \gg kT$: In the high-frequency limit, we find from (4.5.9) that

$$u(\omega) = \frac{\hbar\omega^3}{\pi^2 c^3}\, e^{-\hbar\omega/kT} . \tag{4.5.12}$$

The energy density decreases exponentially with increasing frequency. This empirically derived relation is known as Wien's law. In his first derivation, Planck farsightedly obtained (4.5.9) by interpolating the corresponding entropies between equations (4.5.11) and (4.5.12).

Often, the energy density is expressed in terms of the wavelength $\lambda$: starting from $\omega = ck = \frac{2\pi c}{\lambda}$, we obtain $d\omega = -\frac{2\pi c}{\lambda^2}\, d\lambda$. Therefore, the energy per unit volume in the interval $[\lambda, \lambda + d\lambda]$ is given by

$$\frac{dE_\lambda}{V} = u\!\left(\omega = \frac{2\pi c}{\lambda}\right) \left|\frac{d\omega}{d\lambda}\right| d\lambda = \frac{16\pi^2 \hbar c\, d\lambda}{\lambda^5 \left(e^{\frac{2\pi\hbar c}{kT\lambda}} - 1\right)} , \tag{4.5.13}$$

where we have inserted (4.5.9). The energy density as a function of the wavelength, $\frac{dE_\lambda}{d\lambda}$, has its maximum at the value $\lambda_{\max}$, determined by

$$\frac{2\pi\hbar c}{kT\lambda_{\max}} = 4.965 . \tag{4.5.14}$$


We will now calculate the radiation which emerges from an opening in the cavity at the temperature $T$. To do this, we first note that the radiation within the cavity is completely isotropic. The emitted thermal radiation at a frequency $\omega$ into a solid angle $d\Omega$ is therefore $u(\omega) \frac{d\Omega}{4\pi}$. The radiation energy which emerges per unit time onto a unit surface is

$$I(\omega, T) = \frac{1}{4\pi} \int d\Omega\, c\, u(\omega) \cos\vartheta = \frac{1}{4\pi} \int_0^{2\pi} d\varphi \int_0^1 d\eta\, \eta\, c\, u(\omega) = \frac{c\, u(\omega)}{4} . \tag{4.5.15}$$

The integration over the solid angle $d\Omega$ extends over only one hemisphere (see Fig. 4.24). The total radiated power per unit surface (the energy flux) is then

$$I_E(T) = \int d\omega\, I(\omega, T) = \sigma T^4 , \tag{4.5.16}$$

where again the Stefan–Boltzmann constant $\sigma$ from Eq. (4.5.5) enters the expression.
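Equation (4.5.16) can be verified numerically by integrating $c\,u(\omega)/4$ with Planck's law (4.5.9) over all frequencies and comparing with $\sigma T^4$. The sketch below uses a simple midpoint rule; the function names, the cutoff at $\hbar\omega = 50\,kT$ and the number of steps are choices made for this illustration only.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 2.99792458e8         # speed of light, m/s


def planck_u(omega, temperature_k):
    """Spectral energy density u(omega), Eq. (4.5.9), in J s m^-3."""
    x = HBAR * omega / (K_B * temperature_k)
    return HBAR * omega**3 / (math.pi**2 * C**3 * math.expm1(x))


def radiated_power(temperature_k, steps=20000, x_max=50.0):
    """Midpoint-rule integral of c*u(omega)/4 over omega, in W/m^2.

    The integrand is cut off at hbar*omega = x_max * kT, beyond which
    it is exponentially small.
    """
    omega_max = x_max * K_B * temperature_k / HBAR
    d_omega = omega_max / steps
    total = 0.0
    for i in range(steps):
        omega = (i + 0.5) * d_omega
        total += 0.25 * C * planck_u(omega, temperature_k) * d_omega
    return total


def stefan_boltzmann_flux(temperature_k):
    """sigma * T^4 with sigma from Eq. (4.5.5), in W/m^2."""
    sigma = math.pi**2 * K_B**4 / (60.0 * HBAR**3 * C**2)
    return sigma * temperature_k**4


if __name__ == "__main__":
    for temperature in (300.0, 5800.0):
        print(temperature, radiated_power(temperature), stefan_boltzmann_flux(temperature))
```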

Fig. 4.24. The radiation emission per unit surface area from a cavity radiator (black body)

A body which completely absorbs all the radiation falling upon it is called a black body. A small opening in the wall of a cavity whose walls are good absorbers is the ideal realization of a black body. The emission from such an opening calculated above is thus the radiation emitted by a black body. As an approximation, Eqns. (4.5.15,16) are also used to describe the radiation from celestial bodies. Remark: The Universe is pervaded by the so called cosmic background radiation discovered by Penzias and Wilson, which corresponds according to Planck’s law to a temperature of 2.73 K. It is a remainder from the earliest times of the Universe, around 300,000 years after the Big Bang, when the temperature of the cosmos had


already cooled to about 3000 K. Previous to this time, the radiation was in thermal equilibrium with the matter. At temperatures of 3000 K and below, the electrons bind to atomic nuclei to form atoms, so that the cosmos became transparent to this radiation and it was practically decoupled from the matter in the Universe. The expansion of the Universe by a factor of about one thousand then led to a corresponding increase of all wavelengths due to the red shift, and thus to a Planck distribution at an effective temperature of 2.73 K.

∗4.5.4 Supplemental Remarks

Let us now interpret the properties of the photon gas in a physical sense and compare it with other gases. The mean photon number is given by

$$N = 2 \sum_{\mathbf p} \frac{1}{e^{cp/kT} - 1} = \frac{V}{\pi^2 c^3} \int_0^\infty \frac{d\omega\, \omega^2}{e^{\hbar\omega/kT} - 1} = \frac{V (kT)^3}{\pi^2 \hbar^3 c^3} \int_0^\infty \frac{dx\, x^2}{e^x - 1} = \frac{2\zeta(3)}{\pi^2}\, V \left(\frac{kT}{\hbar c}\right)^{3} ,$$

where the value $\mathbf p = 0$ is excluded in $\sum_{\mathbf p}$. Inserting $\zeta(3)$, we obtain

$$N = 0.244\, V \left(\frac{kT}{\hbar c}\right)^{3} . \tag{4.5.17}$$

Combining this with (4.5.6c) and (4.5.6a) and inserting approximate numerical values shows a formal similarity to the classical ideal gas:

$$PV = 0.9\, N kT \tag{4.5.18}$$

$$S = 3.6\, N k , \tag{4.5.19}$$

where $N$ is however always given by (4.5.17) and does not have a fixed value. The pressure per particle is of about the same order of magnitude as in the classical ideal gas. The thermal wavelength of the photon gas is found to be

$$\lambda_T = \frac{2\pi}{k_{\max}} = \frac{2\pi\hbar c}{2.82\, kT} = \frac{0.510}{T[\mathrm{K}]}\ [\mathrm{cm}] . \tag{4.5.20}$$

With the numerical factor 0.510, $\lambda_T$ is obtained in units of cm. Inserting into (4.5.17), we find

$$N = 0.244 \left(\frac{2\pi}{2.82}\right)^{3} \frac{V}{\lambda_T^3} = 2.70\, \frac{V}{\lambda_T^3} . \tag{4.5.21}$$

For the classical ideal gas, $\frac{V}{N\lambda_T^3} \gg 1$; in contrast, the average spacing of the photons $(V/N)^{1/3}$ is, from (4.5.21), of the order of magnitude of $\lambda_T$, and therefore, they must be treated quantum mechanically.


At room temperature, i.e. $T = 300$ K, $\lambda_T = 1.7 \times 10^{-3}$ cm and the density is $\frac{N}{V} = 5.5 \times 10^{8}$ cm$^{-3}$. At the temperature of the interior of the Sun, i.e. $T \approx 10^{7}$ K, $\lambda_T = 5.1 \times 10^{-8}$ cm and the density is $\frac{N}{V} = 2.0 \times 10^{22}$ cm$^{-3}$. In comparison, the wavelength of visible light is in the range $\lambda = 10^{-4}$ cm.

Note: If the photon had a finite rest mass $m$, then we would have $g = 3$. In that case, a factor of $\frac{3}{2}$ would enter the Stefan–Boltzmann law. The experimentally demonstrated validity of the Stefan–Boltzmann law implies that either $m = 0$, or that the longitudinal photons do not couple to matter.
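The numerical values quoted for the photon density and the thermal wavelength follow directly from (4.5.17) and (4.5.20). The sketch below evaluates both in CGS units; the function names are hypothetical and the constants are entered by hand.

```python
import math

# CGS values of the fundamental constants (entered by hand for this sketch).
K_B = 1.380649e-16       # Boltzmann constant, erg/K
HBAR = 1.054571817e-27   # reduced Planck constant, erg s
C = 2.99792458e10        # speed of light, cm/s
ZETA_3 = 1.2020569031595943  # Riemann zeta(3)


def photon_density_cm3(temperature_k):
    """N/V = (2 zeta(3) / pi^2) (kT / hbar c)^3, Eq. (4.5.17), in cm^-3."""
    return 2.0 * ZETA_3 / math.pi**2 * (K_B * temperature_k / (HBAR * C))**3


def thermal_wavelength_cm(temperature_k):
    """lambda_T = 2 pi hbar c / (2.82 kT), Eq. (4.5.20), in cm."""
    return 2.0 * math.pi * HBAR * C / (2.82 * K_B * temperature_k)


if __name__ == "__main__":
    for temperature in (300.0, 1.0e7):
        density = photon_density_cm3(temperature)
        wavelength = thermal_wavelength_cm(temperature)
        # The product N * lambda_T^3 / V is temperature independent, Eq. (4.5.21).
        print(f"T = {temperature:.0e} K: N/V = {density:.2e} cm^-3, "
              f"lambda_T = {wavelength:.2e} cm, "
              f"N lambda_T^3 / V = {density * wavelength**3:.2f}")
```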

The chemical potential: The chemical potential of the photon gas can be computed from the Gibbs–Duhem relation $E = TS - PV + \mu N$, since we are dealing with a homogeneous system:

$$\mu = \frac{1}{N}\left(E - TS + PV\right) = \frac{1}{N}\left(4 - \frac{16}{3} + \frac{4}{3}\right) \frac{\sigma V T^4}{c} \equiv 0 . \tag{4.5.22}$$

The chemical potential of the photon gas is identical to 0 for all temperatures, because the number of photons is not fixed, but rather adjusts itself to the temperature and the volume. Photons are absorbed and emitted by the surrounding matter, the walls of the cavity. In general, the chemical potential of particles and quasiparticles such as phonons, whose particle numbers are not subject to a conservation law, is zero. For example, we consider the free energy of a fictitious constant number of photons (phonons etc.), $F(T, V, N_{\mathrm{Ph}})$. Since the number of photons (phonons) is not fixed, it will adjust itself in such a way that the free energy is minimized, $\left(\frac{\partial F}{\partial N_{\mathrm{Ph}}}\right)_{T,V} = 0$. This is however just the expression for the chemical potential, which therefore vanishes: $\mu = 0$. We could have just as well started from the maximization of the entropy, $\left(\frac{\partial S}{\partial N_{\mathrm{Ph}}}\right)_{E,V} = -\frac{\mu}{T} = 0$.

∗4.5.5 Fluctuations in the Particle Number of Fermions and Bosons

Now that we have become acquainted with the statistical properties of various quantum gases, that is of fermions and bosons (including photons, whose particle-number distribution is characterized by $\mu = 0$), we now want to investigate the fluctuations of their particle numbers. For this purpose, we begin with the grand potential

$$\Phi = -\beta^{-1} \log \sum_{\{n_p\}} e^{-\beta \sum_p n_p (\varepsilon_p - \mu)} . \tag{4.5.23}$$

Taking the derivative of $\Phi$ with respect to $\varepsilon_q$ yields the mean value of $n_q$:

$$\frac{\partial \Phi}{\partial \varepsilon_q} = \frac{\sum_{\{n_p\}} n_q\, e^{-\beta \sum_p n_p (\varepsilon_p - \mu)}}{\sum_{\{n_p\}} e^{-\beta \sum_p n_p (\varepsilon_p - \mu)}} = \langle n_q \rangle . \tag{4.5.24}$$


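The second derivative of $\Phi$ yields the mean square deviation

$$\frac{\partial^2 \Phi}{\partial \varepsilon_q^2} = -\beta \left( \langle n_q^2 \rangle - \langle n_q \rangle^2 \right) \equiv -\beta (\Delta n_q)^2 . \tag{4.5.25}$$

Thus, using $\frac{e^x}{e^x \mp 1} = 1 \pm \frac{1}{e^x \mp 1}$, we obtain

$$(\Delta n_q)^2 = -\beta^{-1} \frac{\partial \langle n_q \rangle}{\partial \varepsilon_q} = \frac{e^{\beta(\varepsilon_q - \mu)}}{\left(e^{\beta(\varepsilon_q - \mu)} \mp 1\right)^2} = \langle n_q \rangle \left(1 \pm \langle n_q \rangle\right) . \tag{4.5.26}$$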

For fermions, the mean square deviation is always small. In the range of occupied states, where $\langle n_q \rangle = 1$, $\Delta n_q$ is zero; and in the region of small $\langle n_q \rangle$, $\Delta n_q \approx \langle n_q \rangle^{1/2}$.

Remark: For bosons, the fluctuations can become very large. In the case of large occupation numbers, we have $\Delta n_q \sim \langle n_q \rangle$ and the relative deviation approaches one. This is a consequence of the tendency of bosons to cluster in the same state. These strong fluctuations are also found in a spatial sense. If $N$ bosons are enclosed in a volume of $L^3$, then the mean number of bosons in a subvolume $a^3$ is given by $\bar n = N a^3/L^3$. In the case that $a \ll \lambda$, where $\lambda$ is the extent of the wavefunctions of the bosons, one finds the mean square deviation of the particle number $(\Delta N_{a^3})^2$ within the subvolume to be22

$$(\Delta N_{a^3})^2 = \bar n (\bar n + 1) .$$

For comparison, we recall the quite different behavior of classical particles, which obey a Poisson distribution (see Sect. 1.5.1). The probability of finding $n$ particles in the subvolume $a^3$ for $a/L \ll 1$ and $N \to \infty$ is then

$$P_n = e^{-\bar n}\, \frac{\bar n^{\,n}}{n!}$$

with $\bar n = N a^3/L^3$, from which it follows that

$$(\Delta n)^2 = \langle n^2 \rangle - \bar n^2 = \sum_n P_n n^2 - \bar n^2 = \bar n .$$

The deviations of the counting rates of bosons from the Poisson law have been experimentally verified using intense photon beams.23
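The fluctuation formula (4.5.26) can be checked for a single level by summing directly over its occupation numbers, with statistical weight $e^{-n x}$ and $x = \beta(\varepsilon_q - \mu)$. The sketch below does this for bosons ($n = 0, 1, 2, \ldots$, truncated at a large value chosen for this illustration) and for fermions ($n = 0, 1$); the function name and the truncation are assumptions of the example.

```python
import math


def occupation_moments(x, statistics, n_max_bose=2000):
    """Mean and variance of the occupation number of one level.

    x = beta * (eps - mu) must be positive for bosons. The Bose sum is
    truncated at n_max_bose, which is harmless as long as
    exp(-n_max_bose * x) is negligible.
    """
    n_max = 1 if statistics == "fermi" else n_max_bose
    weights = [math.exp(-n * x) for n in range(n_max + 1)]
    norm = sum(weights)
    mean = sum(n * w for n, w in enumerate(weights)) / norm
    mean_square = sum(n * n * w for n, w in enumerate(weights)) / norm
    return mean, mean_square - mean**2


if __name__ == "__main__":
    x = 0.5
    for statistics, sign in (("bose", +1), ("fermi", -1)):
        mean, variance = occupation_moments(x, statistics)
        # Compare the directly summed variance with <n>(1 +/- <n>), Eq. (4.5.26).
        print(statistics, mean, variance, mean * (1 + sign * mean))
```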

4.6 Phonons in Solids

4.6.1 The Harmonic Hamiltonian

We recall the mechanics of a linear chain consisting of $N$ particles of mass $m$ which are coupled to their nearest neighbors by springs of force constant $f$. In the harmonic approximation, its Hamilton function takes on the form

$$H = W_0 + \sum_n \left[ \frac{m}{2}\, \dot u_n^2 + \frac{f}{2} (u_n - u_{n-1})^2 \right] . \tag{4.6.1}$$

22 A detailed discussion of the tendency of bosons to cluster in regions where their wavefunctions overlap may be found in E. M. Henley and W. Thirring, Elementary Quantum Field Theory, McGraw Hill, New York 1962, p. 52ff.
23 R. Hanbury Brown and R. Q. Twiss, Nature 177, 27 (1956).

One obtains expression (4.6.1) by starting from the Hamilton function of $N$ particles whose positions are denoted by $x_n$. Their equilibrium positions are $x_n^0$, where for an infinite chain or a finite chain with periodic boundary conditions, the equilibrium positions have exact translational invariance and the distance between neighboring equilibrium positions is given by the lattice constant $a = x_{n+1}^0 - x_n^0$. One then introduces the displacements from the equilibrium positions, $u_n = x_n - x_n^0$, and expands in terms of the $u_n$. The quantity $W_0$ is given by the value of the overall potential energy $W(\{x_n\})$ of the chain in the equilibrium positions. Applying the canonical transformation

$$u_n = \frac{1}{\sqrt{Nm}} \sum_k e^{ikan} Q_k , \qquad m \dot u_n = \sqrt{\frac{m}{N}} \sum_k e^{-ikan} P_k , \tag{4.6.2}$$

we can transform $H$ into a sum of uncoupled harmonic oscillators

$$H = W_0 + \sum_k \frac{1}{2} \left( P_k P_{-k} + \omega_k^2\, Q_k Q_{-k} \right) , \tag{4.6.1$'$}$$

where the frequencies are related to the wavenumber via

$$\omega_k = 2 \sqrt{\frac{f}{m}} \left| \sin\frac{ka}{2} \right| . \tag{4.6.3}$$

The $Q_k$ are called normal coordinates and the $P_k$ normal momenta. The $Q_k$ and $P_k$ are conjugate variables, which we will take to be quantum-mechanical operators in what follows. In the quantum representation, commutation rules hold:

$$[u_n, m \dot u_{n'}] = i\hbar\, \delta_{nn'} , \qquad [u_n, u_{n'}] = [m \dot u_n, m \dot u_{n'}] = 0 ,$$

which in turn imply that

$$[Q_k, P_{k'}] = i\hbar\, \delta_{kk'} , \qquad [Q_k, Q_{k'}] = [P_k, P_{k'}] = 0 ;$$

furthermore, we have $Q_k^\dagger = Q_{-k}$ and $P_k^\dagger = P_{-k}$. Finally, by introducing the creation and annihilation operators

$$Q_k = \sqrt{\frac{\hbar}{2\omega_k}} \left( a_k + a_{-k}^\dagger \right) , \qquad P_k = -i \sqrt{\frac{\hbar\omega_k}{2}} \left( a_{-k} - a_k^\dagger \right) , \tag{4.6.4}$$

we obtain

$$H = W_0 + \sum_k \hbar\omega_k \left( \hat n_k + \frac{1}{2} \right) \tag{4.6.1$''$}$$


with the occupation (number) operator

$$\hat n_k = a_k^\dagger a_k \tag{4.6.5}$$

and $[a_k, a_{k'}^\dagger] = \delta_{kk'}$, $[a_k, a_{k'}] = [a_k^\dagger, a_{k'}^\dagger] = 0$. In this form, we can readily generalize the Hamiltonian to three dimensions. In a three-dimensional crystal with one atom per unit cell, there are three lattice vibrations for each wavenumber, one longitudinal ($l$) and two transverse ($t_1$, $t_2$) (see Fig. 4.25). If the unit cell contains $s$ atoms, there are $3s$ lattice vibrational modes. These are composed of the three acoustic modes, whose frequencies vanish at $k = 0$, and the $3(s - 1)$ optical phonon modes, whose frequencies are finite at $k = 0$.24
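Returning to the linear chain, the dispersion relation (4.6.3) can be checked without carrying out the full transformation (4.6.2): the classical equation of motion following from (4.6.1) is $m\ddot u_n = f(u_{n+1} + u_{n-1} - 2u_n)$, and a plane wave $u_n \propto e^{i(kan - \omega_k t)}$ must satisfy it for every allowed $k$. The sketch below tests this for a short periodic chain; the function names and the parameter values (all set to 1) are assumptions of the illustration.

```python
import cmath
import math


def omega_k(k, a=1.0, f=1.0, m=1.0):
    """Dispersion relation of the linear chain, Eq. (4.6.3)."""
    return 2.0 * math.sqrt(f / m) * abs(math.sin(0.5 * k * a))


def plane_wave_residual(n_sites=8, a=1.0, f=1.0, m=1.0):
    """Largest violation of m*omega_k^2*u_n = -f*(u_{n+1} + u_{n-1} - 2*u_n).

    Periodic boundary conditions are used, so the allowed wavenumbers
    are k = 2*pi*j / (n_sites * a) with j = 0, ..., n_sites - 1.
    """
    worst = 0.0
    for j in range(n_sites):
        k = 2.0 * math.pi * j / (n_sites * a)
        u = [cmath.exp(1j * k * a * n) for n in range(n_sites)]
        omega_squared = omega_k(k, a, f, m) ** 2
        for n in range(n_sites):
            force = f * (u[(n + 1) % n_sites] + u[(n - 1) % n_sites] - 2.0 * u[n])
            worst = max(worst, abs(force + m * omega_squared * u[n]))
    return worst


if __name__ == "__main__":
    print("largest residual:", plane_wave_residual())
    # For small k the dispersion is linear, omega = sqrt(f/m) * a * k (sound waves).
    print("omega(k=1e-3) / k =", omega_k(1.0e-3) / 1.0e-3)
```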

Fig. 4.25. The phonon frequencies in a crystal with one atom per unit cell

We shall limit ourselves to the simple case of a single atom per unit cell, i.e. to Bravais-lattice crystals. Then, according to our above considerations, the Hamiltonian is given by:

$$H = W_0(V) + \sum_{\mathbf k,\lambda} \hbar\omega_{\mathbf k,\lambda} \left( \hat n_{\mathbf k,\lambda} + \frac{1}{2} \right) . \tag{4.6.6}$$

Here, we have characterized the lattice vibrations in terms of their wavevector $\mathbf k$ and their polarization $\lambda$. The associated frequency is $\omega_{\mathbf k,\lambda}$ and the operator for the occupation number is $\hat n_{\mathbf k,\lambda}$. The potential energy $W_0(V)$ in the equilibrium lattice locations of the crystal depends on its lattice constant, or, equivalently when the number of particles is fixed, on the volume. For brevity, we combine the wavevector and the polarization into the form $k \equiv (\mathbf k, \lambda)$. In a lattice with a total of $N$ atoms, there are $3N$ vibrational degrees of freedom.

24 See e.g. J. M. Ziman, Principles of the Theory of Solids, 2nd edition, Cambridge University Press, 1972.


4.6.2 Thermodynamic Properties

In analogy to the calculation for photons, we find for the free energy

$$F = -kT \log Z = W_0(V) + \sum_k \left[ \frac{\hbar\omega_k}{2} + kT \log\left(1 - e^{-\hbar\omega_k/kT}\right) \right] . \tag{4.6.7}$$

The internal energy is found from

$$E = -T^2 \left( \frac{\partial}{\partial T} \frac{F}{T} \right)_V , \tag{4.6.8}$$

thus

$$E = W_0(V) + \sum_k \frac{\hbar\omega_k}{2} + \sum_k \hbar\omega_k \frac{1}{e^{\hbar\omega_k/kT} - 1} . \tag{4.6.8$'$}$$

It is again expedient for the case of phonons to introduce the normalized density of states

$$g(\omega) = \frac{1}{3N} \sum_k \delta(\omega - \omega_k) , \tag{4.6.9}$$

where the prefactor has been chosen so that

$$\int_0^\infty d\omega\, g(\omega) = 1 . \tag{4.6.10}$$

Using the density of states, the internal energy can be written in the form:

$$E = W_0(V) + E_0 + 3N \int_0^\infty d\omega\, g(\omega)\, \frac{\hbar\omega}{e^{\hbar\omega/kT} - 1} , \tag{4.6.11}$$

where we have used $E_0 = \sum_k \hbar\omega_k/2$ to denote the zero-point energy of the phonons. For the thermodynamic quantities, the precise dependence of the phonon frequencies on wavenumber is not important, but instead only their distribution, i.e. the density of states. Now, in order to determine the thermodynamic quantities such as the internal energy, we first have to calculate the density of states, $g(\omega)$. For small $k$, the frequency of the longitudinal phonons is $\omega_{\mathbf k,l} = c_l k$, and that of the transverse phonons is $\omega_{\mathbf k,t} = c_t k$, the latter doubly degenerate; here, $c_l$ and $c_t$ are the longitudinal and transverse velocities of sound. Inserting these expressions into (4.6.9), we find

$$g(\omega) = \frac{1}{3N} \frac{V}{2\pi^2} \int dk\, k^2 \left[ \delta(\omega - c_l k) + 2\delta(\omega - c_t k) \right] = \frac{V}{N} \frac{\omega^2}{6\pi^2} \left( \frac{1}{c_l^3} + \frac{2}{c_t^3} \right) . \tag{4.6.12}$$


Equation (4.6.12) applies only to low frequencies, i.e. in the range where the phonon dispersion relation is in fact linear. In this frequency range, the density of states is proportional to $\omega^2$, as was also the case for photons. Using (4.6.12), we can now compute the thermodynamic quantities for low temperatures, since in this temperature range, only low-frequency phonons are thermally excited. In the high-temperature limit, as we shall see, the detailed shape of the phonon spectrum is unimportant; instead, only the total number of vibrational modes is relevant. We can therefore treat this case immediately, also (Eq. 4.6.14). At low temperatures only low frequencies contribute, since frequencies $\omega \gg kT/\hbar$ are suppressed by the exponential function in the integral (4.6.11). Thus the low-frequency result (4.6.12) for $g(\omega)$ can be used. Corresponding to the calculation for photons, we find

$$E = W_0(V) + E_0 + \frac{V \pi^2 k^4}{30 \hbar^3} \left( \frac{1}{c_l^3} + \frac{2}{c_t^3} \right) T^4 . \tag{4.6.13}$$

At high temperatures, i.e. temperatures which are much higher than $\hbar\omega_{\max}/k$, where $\omega_{\max}$ is the maximum frequency of the phonons, we find for all frequencies at which $g(\omega)$ is nonvanishing that $\left(e^{\hbar\omega/kT} - 1\right)^{-1} \approx \frac{kT}{\hbar\omega}$, and therefore, it follows from (4.6.11) and (4.6.10) that

$$E = W_0(V) + E_0 + 3NkT . \tag{4.6.14}$$

Taking the derivative with respect to temperature, we obtain from (4.6.13) and (4.6.14) in the low-temperature limit

$$C_V \sim T^3 ; \tag{4.6.15}$$

this is Debye's law. In the high-temperature limit, we have

$$C_V \approx 3Nk , \tag{4.6.16}$$

the law of Dulong–Petit. At low temperatures, the specific heat is proportional to $T^3$, while at high temperatures, it is equal to the number of degrees of freedom times the Boltzmann constant. In order to determine the specific heat over the whole range of temperatures, we require the normalized density of states $g(\omega)$ for the whole frequency range. The typical shape of $g(\omega)$ for a Bravais crystal24 is shown in Fig. 4.26. At small values of $\omega$, the $\omega^2$ behavior is clearly visible. Above the maximum frequency, $g(\omega)$ becomes zero. In intermediate regions, the density of states exhibits characteristic structures, so called van Hove singularities24 which result from the maxima, minima, and saddle points of the phonon dispersion relation; their typical form is shown in Fig. 4.27. An interpolation formula which is adequate for many purposes can be obtained by approximating the density of states using the Debye approximation:

$$g_D(\omega) = \frac{3\omega^2}{\omega_D^3}\, \Theta(\omega_D - \omega) , \tag{4.6.17a}$$

with

$$\frac{1}{\omega_D^3} = \frac{1}{18\pi^2} \frac{V}{N} \left( \frac{1}{c_l^3} + \frac{2}{c_t^3} \right) . \tag{4.6.17b}$$

Fig. 4.26. The phonon density of states g(ω). Solid curve: a realistic density of states; dashed curve: the Debye approximation

Fig. 4.27. A phonon dispersion relation with maxima, minima, and saddle points, which express themselves in the density of states as van Hove singularities

With the aid of (4.6.17a), the low-frequency expression (4.6.12) is extended to cover the whole range of frequencies and is cut off at the so called Debye frequency $\omega_D$, which is chosen in such a way that (4.6.10) is obeyed. The Debye approximation is also shown in Fig. 4.26. Inserting (4.6.17a) into (4.6.11), we obtain

$$E = W_0(V) + E_0 + 3NkT\, D\!\left( \frac{\hbar\omega_D}{kT} \right) \tag{4.6.18}$$

with

$$D(x) = \frac{3}{x^3} \int_0^x \frac{dy\, y^3}{e^y - 1} . \tag{4.6.19}$$
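The Debye function (4.6.19) has no closed form, but it is easy to evaluate numerically. The sketch below integrates it with a midpoint rule and obtains the heat capacity from a centered finite difference of the phonon energy in (4.6.18). The function names, the step counts and the reduced temperature $t = kT/\hbar\omega_D$ are choices made for this illustration.

```python
import math


def debye_function(x, steps=2000):
    """D(x) = (3/x^3) * integral_0^x y^3/(e^y - 1) dy, Eq. (4.6.19), midpoint rule."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    h = x / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * h
        total += y**3 / math.expm1(y)
    return 3.0 * total * h / x**3


def reduced_energy(t):
    """Phonon energy (E - W0 - E0) / (3 N hbar omega_D) at t = kT / (hbar omega_D)."""
    return t * debye_function(1.0 / t)


def reduced_heat_capacity(t, dt=1.0e-4):
    """C_V / (3 N k) from a centered finite difference of reduced_energy."""
    return (reduced_energy(t + dt) - reduced_energy(t - dt)) / (2.0 * dt)


if __name__ == "__main__":
    for t in (0.02, 0.1, 0.5, 1.0, 10.0):
        print(f"kT / (hbar omega_D) = {t:5.2f}:  C_V / (3Nk) = {reduced_heat_capacity(t):.5f}")
```

For $t \gg 1$ the result approaches 1 (Dulong–Petit), and for $t \ll 1$ it approaches $\frac{4\pi^4}{5} t^3$, which is the $T^3$ law (4.6.15) within the Debye approximation.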

Taking the temperature derivative of (4.6.18), we obtain an expression for the specific heat, which interpolates between the two limiting cases of the Debye and the Dulong–Petit values (see Fig. 4.28).

∗4.6.3 Anharmonic Effects and the Mie–Grüneisen Equation of State

So far, we have treated only the harmonic approximation. In fact, the Hamiltonian for phonons in a crystal also contains anharmonic terms, e.g.


Fig. 4.28. The heat capacity of a monatomic insulator. At low temperatures, CV ∼ T 3 ; at high temperatures, it is constant

$$H_{\mathrm{int}} = \sum_{k_1, k_2} c(k_1, k_2)\, Q_{k_1} Q_{k_2} Q_{-k_1 - k_2}$$

with coeﬃcients c(k1 , k2 ). Terms of this type and higher powers arise from the expansion of the interaction potential in terms of the displacements of the lattice components. These nonlinear terms are responsible for (i) the thermal expansion of crystals, (ii) the occurrence of a linear term in the speciﬁc heat at high T , (iii) phonon damping, and (iv) a ﬁnite thermal conductivity. These terms are also decisive for structural phase transitions. A systematic treatment of these phenomena requires perturbation-theory methods. The anharmonic terms have the eﬀect that the frequencies ωk depend on the lattice constants, i.e. on the volume V of the crystal. This eﬀect of the anharmonicity can be taken into account approximately by introducing a minor extension to the harmonic theory of the preceding subsection for the derivation of the equation of state. We take the volume derivative of the free energy F . In addition to the potential energy W0 of the equilibrium conﬁguration, also ωk , owing to the anharmonicities, depends on the volume V ; therefore, we ﬁnd for the pressure

$$P = -\left( \frac{\partial F}{\partial V} \right)_T = -\frac{\partial W_0}{\partial V} - \sum_k \hbar\omega_k \left( \frac{1}{2} + \frac{1}{e^{\hbar\omega_k/kT} - 1} \right) \frac{\partial \log \omega_k}{\partial V} . \tag{4.6.20}$$

For simplicity, we assume that the logarithmic derivative of $\omega_k$ with respect to the volume is the same for all wavenumbers (the Grüneisen assumption):

$$\frac{\partial \log \omega_k}{\partial V} = \frac{1}{V} \frac{\partial \log \omega_k}{\partial \log V} = -\gamma\, \frac{1}{V} . \tag{4.6.21}$$

The material constant $\gamma$ which occurs here is called the Grüneisen constant. The negative sign indicates that the frequencies become smaller on expansion of the lattice. We now insert (4.6.21) into (4.6.20) and compare with (4.6.8$'$), thus obtaining, with $E_{\mathrm{Ph}} = E - W_0$, the Mie–Grüneisen equation of state:

$$P = -\frac{\partial W_0}{\partial V} + \gamma\, \frac{E_{\mathrm{Ph}}}{V} . \tag{4.6.22}$$

This formula applies to insulating crystals in which there are no electronic excitations and the thermal behavior is determined solely by the phonons.


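From the Mie–Grüneisen equation of state, the various thermodynamic derivatives can be obtained, such as the thermal pressure coefficient (3.2.5)

$$\beta = \left( \frac{\partial P}{\partial T} \right)_V = \gamma\, C_V(T)/V \tag{4.6.23}$$

and the linear expansion coefficient (cf. Appendix I., Table I.3)

$$\alpha_l = \frac{1}{3V} \left( \frac{\partial V}{\partial T} \right)_P , \tag{4.6.24}$$

which, owing to $\left( \frac{\partial P}{\partial T} \right)_V = -\frac{\left( \frac{\partial V}{\partial T} \right)_P}{\left( \frac{\partial V}{\partial P} \right)_T} \equiv \frac{1}{V} \frac{\left( \frac{\partial V}{\partial T} \right)_P}{\kappa_T}$, can also be given in the form

$$\alpha_l = \frac{1}{3}\, \beta \kappa_T . \tag{4.6.25}$$

In this last relation, at low temperatures the compressibility can be replaced by

$$\kappa_T(0) = -\frac{1}{V} \left( \frac{\partial V}{\partial P} \right)_{T=0} = \left( V \frac{\partial^2 W_0}{\partial V^2} \right)^{-1} . \tag{4.6.26}$$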

At low temperatures, from Eqns. (4.6.23) and (4.6.25), the coefficient of thermal expansion and the thermal pressure coefficient of an insulator, as well as the specific heat, are proportional to the third power of the temperature: $\alpha \propto \beta \propto T^3$. As a result of the thermodynamic relationship of the specific heats (3.2.24), we find $C_P - C_V \propto T^7$. Therefore, at temperatures below the Debye temperature, the isobaric and the isochoric specific heats are practically equal. In analogy to the phonons, one can determine the thermodynamic properties of other quasiparticles. Magnons in antiferromagnetic materials likewise have a linear dispersion relation at small values of $k$ and therefore, their contribution to the specific heat is also proportional to $T^3$. Magnons in ferromagnets have a quadratic dispersion relation $\sim k^2$, leading to a specific heat $\sim T^{3/2}$.

4.7 Phonons and Rotons in He II

4.7.1 The Excitations (Quasiparticles) of He II

At the conclusion of our treatment of the Bose–Einstein condensation in Sect. 4.4, we discussed the phase diagram of $^4$He. In the He II phase below $T_\lambda = 2.18$ K, $^4$He undergoes a condensation. States with the wavenumber 0 are occupied macroscopically. In the language of second quantization, this means that the expectation value of the field operator $\psi(\mathbf x)$ is finite. The order parameter here is $\langle \psi(\mathbf x) \rangle$.25 The excitation spectrum is then quite different from that of a system of free bosons. We shall not enter into the quantum-mechanical theory here, but instead use the experimental results as starting point. At low temperatures, only the lowest excitations are relevant. In Fig. 4.29, we show the excitations as determined by neutron scattering.

Fig. 4.29. The quasiparticle excitations in superfluid $^4$He: phonons and rotons after Henshaw and Woods.26

The excitation spectrum exhibits the following characteristics: for small values of $p$, the excitation energy depends linearly on the momentum

$$\varepsilon_p = cp . \tag{4.7.1a}$$

In this region, the excitations are called phonons, whose velocity of sound is $c = 238$ m/sec. A second characteristic of the excitation spectrum is a minimum at $p_0 = 1.91\ \text{Å}^{-1}\hbar$. In this range, the excitations are called rotons, and they can be represented by

$$\varepsilon_p = \Delta + \frac{(|\mathbf p| - p_0)^2}{2\mu} , \tag{4.7.1b}$$

with an effective mass $\mu = 0.16\, m_{\mathrm{He}}$ and an energy gap $\Delta/k = 8.6$ K. These properties of the dispersion relations will make themselves apparent in the thermodynamic properties.

25 We have $a_0 |\phi_0(N)\rangle = \sqrt{N}\, |\phi_0(N-1)\rangle \approx \sqrt{N}\, |\phi_0(N)\rangle$ and $a_0^\dagger |\phi_0(N)\rangle = \sqrt{N+1}\, |\phi_0(N+1)\rangle \approx \sqrt{N}\, |\phi_0(N)\rangle$, since due to the macroscopic occupation of the ground state, $N \gg 1$. See for example QM II, Sect. 3.2.2.
26 D. G. Henshaw and A. D. Woods, Phys. Rev. 121, 1266 (1961)

4.7.2 Thermal Properties

At low temperatures, the number of excitations is small, and their interactions can be neglected. Since the $^4$He atoms are bosons, the quasiparticles in this system are also bosons.27 We emphasize that the quasiparticles in Eqns. (4.7.1a) and (4.7.1b) are collective density excitations, which have nothing to do with the motions of individual helium atoms. As a result of the Bose character and due to the fact that the number of quasiparticles is not conserved, i.e. the chemical potential is zero, we find for the mean occupation number

$$n(\varepsilon_p) = \left( e^{\beta\varepsilon_p} - 1 \right)^{-1} . \tag{4.7.2}$$

From this, the free energy follows:

$$F(T,V) = \frac{kTV}{(2\pi\hbar)^3} \int d^3p\, \log\left( 1 - e^{-\beta\varepsilon_p} \right) , \tag{4.7.3a}$$

and for the average number of quasiparticles

$$N_{\mathrm{QP}}(T,V) = \frac{V}{(2\pi\hbar)^3} \int d^3p\, n(\varepsilon_p) \tag{4.7.3b}$$

and the internal energy

$$E(T,V) = \frac{V}{(2\pi\hbar)^3} \int d^3p\, \varepsilon_p\, n(\varepsilon_p) . \tag{4.7.3c}$$

At low temperatures, only the phonons and rotons contribute in (4.7.3a) through (4.7.3c), since only they are thermally excited. The contribution of the phonons in this limit is given by

$$F_{\mathrm{ph}} = -\frac{\pi^2 V (kT)^4}{90 (\hbar c)^3} , \qquad \text{or} \qquad E_{\mathrm{ph}} = \frac{\pi^2 V (kT)^4}{30 (\hbar c)^3} . \tag{4.7.4a,b}$$

From this, we find for the heat capacity at constant volume:

$$C_V = \frac{2\pi^2 V k^4 T^3}{15 (\hbar c)^3} . \tag{4.7.4c}$$

27 In contrast, in interacting fermion systems there can be both Fermi and Bose quasiparticles. The particle number of bosonic quasiparticles is in general not fixed. Additional quasiparticles can be created; since the changes in the angular momentum of every quantum-mechanical system must be integral, these excitations must have integral spins.


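Due to the gap in the roton energy (4.7.1b), the roton occupation number at low temperatures $T \le 2$ K can be approximated by $n(\varepsilon_p) \approx e^{-\beta\varepsilon_p}$, and we find for the average number of rotons

$$N_r \approx \frac{V}{(2\pi\hbar)^3} \int d^3p\, e^{-\beta\varepsilon_p} = \frac{V}{2\pi^2\hbar^3} \int_0^\infty dp\, p^2\, e^{-\beta\varepsilon_p} = \frac{V e^{-\beta\Delta}}{2\pi^2\hbar^3} \int_0^\infty dp\, p^2\, e^{-\beta(p - p_0)^2/2\mu}$$

$$\approx \frac{V}{2\pi^2\hbar^3}\, e^{-\beta\Delta}\, p_0^2 \int_{-\infty}^{\infty} dp\, e^{-\beta(p - p_0)^2/2\mu} = \frac{V p_0^2}{2\pi^2\hbar^3} \left( 2\pi\mu kT \right)^{1/2} e^{-\beta\Delta} . \tag{4.7.5a}$$

The contribution of the rotons to the internal energy is

$$E_r \approx \frac{V}{(2\pi\hbar)^3} \int d^3p\, \varepsilon_p\, e^{-\beta\varepsilon_p} = -\frac{\partial}{\partial\beta} N_r = \left( \Delta + \frac{kT}{2} \right) N_r , \tag{4.7.5b}$$

from which we obtain the specific heat

$$C_r = k \left[ \frac{3}{4} + \frac{\Delta}{kT} + \left( \frac{\Delta}{kT} \right)^{2} \right] N_r , \tag{4.7.5c}$$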

where from (4.7.5a), $N_r$ goes exponentially to zero for $T \to 0$. In Fig. 4.30, the specific heat is drawn in a log-log plot as a function of the temperature. The straight line follows the $T^3$ law from Eq. (4.7.4c). Above 0.6 K, the roton contribution (4.7.5c) becomes apparent.
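The relative size of the phonon and roton contributions can be estimated from (4.7.4c), (4.7.5a) and (4.7.5c) with the parameters quoted in Sect. 4.7.1. The sketch below evaluates both specific heats per unit volume in SI units. The function names are hypothetical, and the mass of the $^4$He atom (about $6.646 \times 10^{-27}$ kg) is a value supplied for this illustration, not taken from the text.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
HBAR = 1.054571817e-34   # reduced Planck constant, J s

# Parameters of the He II excitation spectrum quoted in Sect. 4.7.1.
SOUND_VELOCITY = 238.0            # c, m/s
GAP_KELVIN = 8.6                  # Delta / k, K
P0 = 1.91e10 * HBAR               # roton momentum p0 = 1.91 A^-1 * hbar, kg m/s
M_HELIUM = 6.6464731e-27          # mass of a 4He atom in kg (assumed value)
MU = 0.16 * M_HELIUM              # effective roton mass, kg


def phonon_heat_capacity(temperature_k):
    """C_V / V of the phonons, Eq. (4.7.4c), in J K^-1 m^-3."""
    return (2.0 * math.pi**2 * K_B**4 * temperature_k**3
            / (15.0 * (HBAR * SOUND_VELOCITY)**3))


def roton_density(temperature_k):
    """N_r / V, Eq. (4.7.5a), in m^-3."""
    return (P0**2 * math.sqrt(2.0 * math.pi * MU * K_B * temperature_k)
            * math.exp(-GAP_KELVIN / temperature_k)
            / (2.0 * math.pi**2 * HBAR**3))


def roton_heat_capacity(temperature_k):
    """C_r / V, Eq. (4.7.5c), in J K^-1 m^-3."""
    ratio = GAP_KELVIN / temperature_k  # Delta / kT
    return K_B * (0.75 + ratio + ratio**2) * roton_density(temperature_k)


if __name__ == "__main__":
    for temperature in (0.4, 0.6, 0.8, 1.0, 1.2):
        c_ph = phonon_heat_capacity(temperature)
        c_r = roton_heat_capacity(temperature)
        print(f"T = {temperature:.1f} K: C_r / C_ph = {c_r / c_ph:.3g}")
```

With these parameters the roton term is a small correction at 0.5 K and exceeds the phonon term at 1 K, in line with the statement that it becomes apparent above about 0.6 K.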

Fig. 4.30. The specific heat of helium II under the saturated vapor pressure (Wiebes, Niels-Hakkenberg and Kramers).

∗4.7.3 Superfluidity and the Two-Fluid Model

The condensation of helium and the resulting quasiparticle dispersion relation (Eq. 4.7.1a,b, Fig. 4.29) have important consequences for the dynamic behavior of 4 He in its He II phase. Superﬂuidity and its description in terms of the two-ﬂuid model are among them. To see this, we consider the ﬂow of helium through a tube in two diﬀerent inertial frames. In frame K, the tube is at rest and the liquid is ﬂowing at the velocity −v. In frame K0 , we suppose the helium to be at rest, while the tube moves with the velocity v (see Fig. 4.31).

Fig. 4.31. Superﬂuid helium in the rest frame of the tube, K, and in the rest frame of the liquid, K0

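The total energies $(E, E_0)$ and the total momenta $(\mathbf{P}, \mathbf{P}_0)$ of the liquid in the two frames $(K, K_0)$ are related by a Galilei transformation:

$$ \mathbf{P} = \mathbf{P}_0 - M\mathbf{v} \tag{4.7.6a} $$

$$ E = E_0 - \mathbf{P}_0 \cdot \mathbf{v} + \frac{M v^2}{2} . \tag{4.7.6b} $$

Here, we have used the notation

$$ \sum_i \mathbf{p}_i = \mathbf{P} , \quad \sum_i \mathbf{p}_{i0} = \mathbf{P}_0 , \quad \sum_i m_i = M . \tag{4.7.6c} $$

One can derive (4.7.6a,b) by applying the Galilei transformation for the individual particles

$$ \mathbf{x}_i = \mathbf{x}_{i0} - \mathbf{v}t , \quad \mathbf{p}_i = \mathbf{p}_{i0} - m\mathbf{v} . $$

This gives for the total momentum

$$ \mathbf{P} = \sum_i \mathbf{p}_i = \sum_i \left( \mathbf{p}_{i0} - m\mathbf{v} \right) = \mathbf{P}_0 - M\mathbf{v} $$

and for the total energy

$$
\begin{aligned}
E &= \sum_i \frac{\mathbf{p}_i^2}{2m} + \sum_{i,j} V(\mathbf{x}_i - \mathbf{x}_j)
 = \sum_i \frac{m}{2} \left( \frac{\mathbf{p}_{i0}}{m} - \mathbf{v} \right)^2 + \sum_{i,j} V(\mathbf{x}_{i0} - \mathbf{x}_{j0}) \\
 &= \sum_i \frac{\mathbf{p}_{i0}^2}{2m} - \mathbf{P}_0 \cdot \mathbf{v} + \frac{M}{2} v^2 + \sum_{i,j} V(\mathbf{x}_{i0} - \mathbf{x}_{j0})
 = E_0 - \mathbf{P}_0 \cdot \mathbf{v} + \frac{M}{2} v^2 .
\end{aligned}
$$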
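The transformation laws (4.7.6a,b) can be checked on a random set of particle momenta. The sketch below is an illustration only: it assumes equal masses in arbitrary units and compares kinetic energies, since the interaction energy depends on coordinate differences and is the same in both frames. The function name and random sample are hypothetical.

```python
import random

def galilei_check(n_particles=50, seed=1):
    """Return (P, P0 - M v, E_kin, E_kin0 - P0.v + M v^2 / 2) for random momenta."""
    rng = random.Random(seed)
    m = 1.0  # common particle mass, arbitrary units (assumption)
    v = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    p0 = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(n_particles)]
    # Transform each particle momentum: p_i = p_i0 - m v
    p = [[pi[a] - m * v[a] for a in range(3)] for pi in p0]

    M = n_particles * m
    P0 = [sum(pi[a] for pi in p0) for a in range(3)]
    P = [sum(pi[a] for pi in p) for a in range(3)]
    kin0 = sum(c * c for pi in p0 for c in pi) / (2.0 * m)
    kin = sum(c * c for pi in p for c in pi) / (2.0 * m)

    # Right-hand sides of (4.7.6a) and (4.7.6b)
    P_pred = [P0[a] - M * v[a] for a in range(3)]
    kin_pred = (kin0 - sum(P0[a] * v[a] for a in range(3))
                + 0.5 * M * sum(c * c for c in v))
    return P, P_pred, kin, kin_pred

P, P_pred, kin, kin_pred = galilei_check()
print(P, P_pred)
print(kin, kin_pred)
```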


In an ordinary liquid, any flow which might initially be present is damped by friction. Seen from the frame $K_0$, this means that in the liquid, excitations are created which move along with the walls of the tube, so that in the course of time more and more of the liquid is pulled along with the moving tube. Seen from the tube frame $K$, this process implies that the flowing liquid is slowed down. The energy of the liquid must simultaneously decrease in order for such excitations to occur at all. We now need to investigate whether for the particular excitation spectrum of He II, Fig. 4.29, the flowing liquid can reduce its energy by the creation of excitations. Is it energetically favorable to excite quasiparticles? We first consider helium at the temperature $T = 0$, i.e. in its ground state. In the ground state, energy and momentum in the frame $K_0$ are given by

$$ E_0^g \quad \text{and} \quad \mathbf{P}_0 = 0 . \tag{4.7.7a} $$

It follows for these quantities in the frame $K$:

$$ E^g = E_0^g + \frac{M v^2}{2} \quad \text{and} \quad \mathbf{P} = -M\mathbf{v} . \tag{4.7.7b} $$

If a quasiparticle with momentum $\mathbf{p}$ and energy $\varepsilon_p$ is created, the energy and the momentum in the frame $K_0$ have the values

$$ E_0 = E_0^g + \varepsilon_p \quad \text{and} \quad \mathbf{P}_0 = \mathbf{p} , \tag{4.7.7c} $$

and from (4.7.6a,b) we find for the energy and the momentum in the frame $K$:

$$ E = E_0^g + \varepsilon_p - \mathbf{p} \cdot \mathbf{v} + \frac{M v^2}{2} \quad \text{and} \quad \mathbf{P} = \mathbf{p} - M\mathbf{v} . \tag{4.7.7d} $$

The excitation energy in $K$ (the tube frame) is thus

$$ \Delta E = \varepsilon_p - \mathbf{p} \cdot \mathbf{v} . \tag{4.7.8} $$

$\Delta E$ is the energy change of the liquid due to the appearance of an excitation in frame $K$. Only when $\Delta E < 0$ does the flowing liquid reduce its energy. Since $\varepsilon_p - \mathbf{p} \cdot \mathbf{v}$ is a minimum when $\mathbf{p}$ is parallel to $\mathbf{v}$, the inequality

$$ v > \frac{\varepsilon_p}{p} \tag{4.7.9a} $$

must be obeyed for an excitation to occur. From (4.7.9a) we find the critical velocity (Fig. 4.32)

$$ v_c = \left( \frac{\varepsilon_p}{p} \right)_{\min} \approx 60 \ \text{m/sec} . \tag{4.7.9b} $$

If the flow velocity is smaller than $v_c$, no quasiparticles will be excited and the liquid flows unimpeded and loss-free through the tube. This phenomenon


Fig. 4.32. Quasiparticles and the critical velocity

is called superfluidity. The occurrence of a finite critical velocity is closely connected to the shape of the excitation spectrum, which has a finite group velocity at $p = 0$ and is everywhere greater than zero (Fig. 4.32). The value (4.7.9b) of the critical velocity is observed for the motion of ions in He II. The critical velocity for flow in capillaries is much smaller than $v_c$, since vortices occur already at lower velocities; we have not considered these excitations here.

A corresponding argument holds also for the formation of additional excitations at nonzero temperatures. At finite temperatures, thermal excitations of quasiparticles are present. What effect do they have? The quasiparticles will be in equilibrium with the moving tube and will have, in the frame $K_0$, the average velocity $\mathbf{v}$. The condensate, i.e. the superfluid component, is at rest in $K_0$. The quasiparticles have momentum $\mathbf{p}$ and an excitation energy of $\varepsilon_p$ in $K_0$. The mean number of these quasiparticles is $n(\varepsilon_p - \mathbf{p} \cdot \mathbf{v})$. (One has to apply the equilibrium distribution functions in the frame in which the quasiparticle gas is at rest, and there the excitation energy is $\varepsilon_p - \mathbf{p} \cdot \mathbf{v}$.) The momentum of the quasiparticle gas in $K_0$ is given by

$$ \mathbf{P}_0 = \frac{V}{(2\pi\hbar)^3} \int d^3p \, \mathbf{p} \, n(\varepsilon_p - \mathbf{p} \cdot \mathbf{v}) . \tag{4.7.10} $$

For low velocities, we can expand (4.7.10) in terms of $\mathbf{v}$. Using $\int d^3p \, \mathbf{p} \, n(\varepsilon_p) = 0$ and terminating the expansion at first order in $\mathbf{v}$, we find

$$ \mathbf{P}_0 \approx \frac{-V}{(2\pi\hbar)^3} \int d^3p \, \mathbf{p} \, (\mathbf{p} \cdot \mathbf{v}) \frac{\partial n}{\partial \varepsilon_p} = \frac{-V}{(2\pi\hbar)^3} \frac{1}{3} \mathbf{v} \int d^3p \, p^2 \frac{\partial n}{\partial \varepsilon_p} , $$

where $\int d^3p \, p_i p_j f(|\mathbf{p}|) = \frac{1}{3} \delta_{ij} \int d^3p \, p^2 f(|\mathbf{p}|)$ was used. At low $T$, it suffices to take the phonon contribution in this equation into account, i.e.

$$ \mathbf{P}_{0,ph} = -\frac{4\pi V}{(2\pi\hbar)^3} \frac{1}{3c^5} \mathbf{v} \int_0^\infty d\varepsilon \, \varepsilon^4 \frac{\partial n}{\partial \varepsilon} . \tag{4.7.11} $$

After integration by parts and replacement of $4\pi \int d\varepsilon \, \varepsilon^2 / c^3$ by $\int d^3p$, one obtains

$$ \mathbf{P}_{0,ph} = \frac{V}{(2\pi\hbar)^3} \frac{4}{3c^2} \mathbf{v} \int d^3p \, \varepsilon_p \, n(\varepsilon_p) . $$

We write this result in the form

$$ \mathbf{P}_{0,ph} = V \rho_{n,ph} \mathbf{v} , \tag{4.7.12} $$

where we have defined the normal fluid density by

$$ \rho_{n,ph} = \frac{4}{3} \frac{E_{ph}}{V c^2} = \frac{2\pi^2}{45} \frac{(kT)^4}{\hbar^3 c^5} ; \tag{4.7.13} $$

compare (4.7.4b). In (4.7.13), the phonon contribution to $\rho_n$ is evaluated. The contribution of the rotons is given by

$$ \rho_{n,r} = \frac{p_0^2}{3kT} \frac{N_r}{V} . \tag{4.7.14} $$

Eq. (4.7.14) follows from (4.7.10) using similar approximations as in the determination of $N_r$ in Eq. (4.7.5a). One calls $\rho_n = \rho_{n,ph} + \rho_{n,r}$ the mass density of the normal component. Only this portion of the density reaches equilibrium with the walls. Using (4.7.10) and (4.7.12), the total momentum per unit volume, $\mathbf{P}_0/V$, is found to be given by

$$ \mathbf{P}_0/V = \rho_n \mathbf{v} . \tag{4.7.15} $$
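To get a feeling for the size of the phonon part (4.7.13), the sketch below evaluates it in SI units and checks that it equals $4E_{ph}/(3Vc^2)$ with $E_{ph}$ from (4.7.4b). The sound velocity `C_SOUND = 238` m/s is an assumed value for He II that is not quoted in this section; the function names are illustrative.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
HBAR = 1.054571817e-34   # reduced Planck constant, J s
C_SOUND = 238.0          # m/s; ASSUMED sound velocity of He II (not given in this section)

def e_ph_per_volume(T):
    """Phonon energy density E_ph/V from Eq. (4.7.4b), in J/m^3."""
    return math.pi**2 * (K_B * T)**4 / (30.0 * (HBAR * C_SOUND)**3)

def rho_n_ph(T):
    """Phonon contribution to the normal-fluid density, Eq. (4.7.13), in kg/m^3."""
    return 2.0 * math.pi**2 * (K_B * T)**4 / (45.0 * HBAR**3 * C_SOUND**5)

for T in (0.25, 0.5, 1.0):
    print(T, rho_n_ph(T), 4.0 * e_ph_per_volume(T) / (3.0 * C_SOUND**2))
```

Under this assumption the phonon part at 1 K comes out at roughly $10^{-2}$ kg/m$^3$, and it falls off as $T^4$ on cooling.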

We now carry out a Galilei transformation from the frame $K_0$, in which the condensate is at rest, to a frame in which the condensate is moving at the velocity $\mathbf{v}_s$. The quasiparticle gas, i.e. the normal component, has the velocity $\mathbf{v}_n = \mathbf{v} + \mathbf{v}_s$ in this reference frame. The momentum is found from (4.7.15) by adding $\rho \mathbf{v}_s$ due to the Galilei transformation:

$$ \mathbf{P}/V = \rho \mathbf{v}_s + \rho_n \mathbf{v} . $$

If we substitute $\mathbf{v} = \mathbf{v}_n - \mathbf{v}_s$, we can write the momentum in the form

$$ \mathbf{P}/V = \rho_s \mathbf{v}_s + \rho_n \mathbf{v}_n , \tag{4.7.16} $$

where the superfluid density is defined by

$$ \rho_s = \rho - \rho_n . \tag{4.7.17} $$

Similarly, the free energy in the frame $K_0$ can be calculated, and from it, by means of a Galilei transformation, the free energy per unit volume of the flowing liquid in the frame in which the superfluid component is moving at $\mathbf{v}_s$ (problem 4.23):

$$ F(T, V, \mathbf{v}_s, \mathbf{v}_n)/V = F(T,V)/V + \frac{1}{2} \rho_s v_s^2 + \frac{1}{2} \rho_n v_n^2 , \tag{4.7.18} $$

where the free energy of the liquid at rest, $F(T,V)$, is given by (4.7.3a) and the relations which follow it.


Fig. 4.33. The superﬂuid and the normal density ρs and ρn in He II as functions of the temperature, measured using the motion of a torsional oscillator by Andronikaschvili.

The hydrodynamic behavior of the helium in the He II phase is as would be expected if the helium consisted of two ﬂuids, a normal ﬂuid with the density ρn , which reaches equilibrium with obstacles such as the inner wall of a tube in which it is ﬂowing, and a superﬂuid with the density ρs , which ﬂows without resistance. When T → 0, ρs → ρ and ρn → 0; for T → Tλ ρs → 0 and ρn → ρ. This theoretical picture, the two–ﬂuid model of Tisza and Landau, was experimentally conﬁrmed by Andronikaschvili, among others (Fig. 4.33). It provides the theoretical basis for the fascinating macroscopic properties of superﬂuid helium.
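The Landau criterion (4.7.9a,b) can be made concrete with the roton parameters. The sketch below minimizes $\varepsilon_p/p$ on the roton branch $\varepsilon_p = \Delta + (p - p_0)^2/2\mu$, using $\Delta/k = 8.6$ K and $\mu = 0.16\, m_{\mathrm{He}}$ from Sect. 4.7.1. The roton momentum $p_0/\hbar = 1.91$ Å$^{-1}$ and the sound velocity 238 m/s are assumed values that are not quoted in this section, and the function names are illustrative.

```python
import math

K_B = 1.380649e-23                 # J/K
HBAR = 1.054571817e-34             # J s
M_HE = 4.002602 * 1.66053907e-27   # mass of a 4He atom, kg

DELTA = 8.6 * K_B        # roton gap, from the text
MU = 0.16 * M_HE         # roton effective mass, from the text
P0 = HBAR * 1.91e10      # ASSUMED roton momentum, p0/hbar = 1.91 per angstrom
C_SOUND = 238.0          # ASSUMED sound velocity in m/s

def eps_roton(p):
    """Roton branch of the dispersion relation."""
    return DELTA + (p - P0)**2 / (2.0 * MU)

def landau_velocity_numeric(n=20001):
    """Minimum of eps/p over the phonon branch (eps/p = c) and a grid on the roton branch."""
    best = C_SOUND
    for i in range(n):
        p = P0 * (0.5 + i / (n - 1))   # scan 0.5 p0 ... 1.5 p0
        best = min(best, eps_roton(p) / p)
    return best

def landau_velocity_analytic():
    """Tangent from the origin: p*^2 = p0^2 + 2 mu Delta, v_c = (p* - p0) / mu."""
    p_star = math.sqrt(P0**2 + 2.0 * MU * DELTA)
    return (p_star - P0) / MU

print(landau_velocity_numeric(), landau_velocity_analytic(), DELTA / P0)
```

With these inputs the minimum lies on the roton branch slightly beyond $p_0$, and the simple estimate $\Delta/p_0$ differs from it only by a few percent, in line with the value of about 60 m/sec in (4.7.9b).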

Problems for Chapter 4

4.1 Demonstrate the validity of equations (4.3.24a) and (4.3.24b).

4.2 Show that the entropy of an ideal Bose (Fermi) gas can be formulated as follows:

$$ S = k \sum_p \left[ -\langle n_p \rangle \log \langle n_p \rangle \pm \left( 1 \pm \langle n_p \rangle \right) \log \left( 1 \pm \langle n_p \rangle \right) \right] . $$

Consider this expression in the classical limit, also, as well as in the limit $T \to 0$.

4.3 Calculate $C_V$, $C_P$, $\kappa_T$, and $\alpha$ for ideal Bose and Fermi gases in the limit of extreme dilution up to the order $\hbar^3$.

4.4 Estimate the Fermi energies (in eV) and the Fermi temperatures (in K) for the following systems (in the free-particle approximation: $\varepsilon_F = \frac{\hbar^2}{2m} \left( \frac{N}{V} \right)^{2/3} \left( \frac{6\pi^2}{g} \right)^{2/3}$):
(a) Electrons in metal
(b) Neutrons in a heavy nucleus
(c) $^3$He in liquid $^3$He ($V/N = 46.2$ Å$^3$).


4.5 Consider a one-dimensional electron gas ($S = 1/2$), consisting of $N$ particles confined to the interval $(0, L)$.
(a) What are the values of the Fermi momentum $p_F$ and the Fermi energy $\varepsilon_F$?
(b) Calculate, in analogy to Sect. 4.3, $\mu = \mu(T, N/L)$.
Result: $p_F = \frac{\pi\hbar N}{L}$, $\mu = \varepsilon_F \left[ 1 + \frac{\pi^2}{12} \left( \frac{kT}{\varepsilon_F} \right)^2 + O(T^4) \right]$.
Give a qualitative explanation of the different sign of the temperature dependence when compared to the three-dimensional case.

4.6 Calculate the chemical potential $\mu(T, N/V)$ for a two-dimensional Fermi gas.

4.7 Determine the mean square deviation $(\Delta N)^2 = \langle N^2 \rangle - \langle N \rangle^2$ of the number of electrons for an electron gas in the limit of zero temperature.

4.8 Calculate the isothermal compressibility (Eq. (4.3.18)) of the electron gas at low temperatures, starting from the formula (4.3.14') for the pressure, $P = \frac{2}{5} \frac{\varepsilon_F N}{V} + \frac{\pi^2}{6} \frac{(kT)^2}{\varepsilon_F} \frac{N}{V}$. Compare with the mean square deviation of the particle number found in problem 4.7.

4.9 Compute the free energy of the nearly degenerate Fermi gas, as well as $\alpha$ and $C_P$.

4.10 Calculate for a completely relativistic Fermi gas ($\varepsilon_p = pc$)
(a) the grand potential $\Phi$
(b) the thermal equation of state
(c) the specific heat $C_V$.
Consider also the limiting case of very low temperatures.

4.11 (a) Calculate the ground state energy of a relativistic electron gas, $E_p = \sqrt{(m_e c^2)^2 + (pc)^2}$, in a white dwarf star, which contains $N$ electrons and $N/2$ helium nuclei (at rest), and give the zero-point pressure for the two limiting cases

$$ x_F \ll 1 : \ P_0 = \frac{1}{5} \frac{m_e c^2}{v} x_F^2 ; \qquad x_F \gg 1 : \ P_0 = \frac{1}{4} \frac{m_e c^2}{v} x_F \left( 1 - \frac{1}{x_F^2} \right) ; \qquad x_F = \frac{p_F}{m_e c} . $$

How does the pressure depend on the radius $R$ of the star?

(b) Derive the relation between the mass $M$ of the star and its radius $R$ for the two cases $x_F \ll 1$ and $x_F \gg 1$, and show that a white dwarf can have no greater mass than

$$ M_0 = \frac{9 m_p}{64} \sqrt{\frac{3\pi}{\alpha^3}} \left( \frac{\hbar c}{G m_p^2} \right)^{3/2} . $$

$\alpha \sim 1$
$G = 6.7 \times 10^{-8}$ dyn cm$^2$ g$^{-2}$ (gravitational constant)
$m_p = 1.7 \times 10^{-24}$ g (proton mass)

(c) If a star of a given mass $M = 2 m_p N$ is compressed to a (finite) radius $R$, then its energy is reduced by the self-energy $E_g$ of gravitation, which for a homogeneous mass distribution has the form $E_g = -\alpha G M^2 / R$, where $\alpha$ is a number of the order of 1. From

$$ \frac{dE_0}{dV} + \frac{dE_g}{dR} \frac{dR}{dV} = 0 $$

you can determine the equilibrium radius, with $dE_0 = -P_0(R)\, 4\pi R^2\, dR$ as the differential of the ground-state energy.

4.12 Show that in a two-dimensional ideal Bose gas, there can be no Bose-Einstein condensation.

4.13 Prove the formulas (4.4.10) and (4.4.11) for the entropy and the specific heat of an ideal Bose gas.

4.14 Compute the internal energy of the ideal Bose gas for $T < T_c(v)$. From the result, determine the specific heat (heat capacity) and compare it with Eq. (4.4.11).

4.15 Show for bosons with $\varepsilon_p = a p^s$ and $\mu = 0$ that the specific heat at low temperatures varies as $T^{3/s}$ in three dimensions. In the special case of $s = 2$, this yields the specific heat of a ferromagnet where these bosons are spin waves.

4.16 Show that the maximum in Planck's formula for the energy distribution $u(\omega)$ is at $\omega_{\max} = 2.82\, \frac{kT}{\hbar}$; see (4.5.10).

4.17 Confirm that the energy flux $I_E(T)$ which is emitted by a black body of temperature $T$ into one hemisphere is given by (Eq. (4.5.16)),

$$ I_E(T) \equiv \frac{\text{energy emitted}}{\text{cm}^2\ \text{sec}} = \frac{cE}{4V} = \sigma T^4 , $$

starting from the energy current density

$$ \mathbf{j}_E = \frac{1}{V} \sum_{\mathbf{p},\lambda} c\, \frac{\mathbf{p}}{p}\, \varepsilon_p\, n_{\mathbf{p},\lambda} . $$

The energy flux $I_E$ per unit area through a surface element $d\mathbf{f}$ is $\mathbf{j}_E \cdot \frac{d\mathbf{f}}{|d\mathbf{f}|}$.

4.18 The energy flux which reaches the Earth from the Sun is equal to $b = 0.136$ Joule sec$^{-1}$ cm$^{-2}$ (without absorption losses, for perpendicular incidence). $b$ is called the solar constant.
(a) Show that the total emission from the Sun is equal to $4 \times 10^{26}$ Joule sec$^{-1}$.
(b) Calculate the surface temperature of the Sun under the assumption that it radiates as a black body ($T \sim 6000$ K).
$R_S = 7 \times 10^{10}$ cm, $R_{SE} = 1$ AU $= 1.5 \times 10^{13}$ cm

4.19 Phonons in a solid: calculate the contribution of the so-called optical phonons to the specific heat of a solid, taking the dispersion relation of the vibrations to be $\varepsilon(\mathbf{k}) = \hbar\omega_E$ (Einstein model).

4.20 Calculate the frequency distribution corresponding to Equation (4.6.17a) for a one- or a two-dimensional lattice. How does the specific heat behave at low temperatures in these cases? (Examples of low-dimensional systems are selenium (one-dimensional chains) and graphite (layered structure).)


4.21 The pressure of a solid is given by $P = -\frac{\partial W_0}{\partial V} + \gamma \frac{E_{ph}}{V}$ (see (4.6.22)). Show, under the assumption that $W_0(V) = (V - V_0)^2 / 2\chi_0 V_0$ for $V \sim V_0$ and $\chi_0 C_V T \ll V_0$, that the thermal expansion (at constant $P \sim 0$) can be expressed as

$$ \alpha \equiv \frac{1}{V} \frac{\partial V}{\partial T} = \frac{\gamma \chi_0 C_V}{V_0} \quad \text{and} \quad C_P - C_V = \frac{\gamma^2 \chi_0 C_V^2 T}{V_0} . $$

4.22 Specific heat of metals: compare the contributions of phonons and electrons. Show that the linear contribution to the specific heat becomes predominant only at $T < T^* = 0.14\, \theta_D \sqrt{\theta_D / T_F}$. Estimate $T^*$ for typical values of $\theta_D$ and $T_F$.

4.23 Superfluid helium: show that in a coordinate frame in which the superfluid component is at rest, the free energy $F = E - TS$ is given by

$$ \Phi_v + \rho_n v^2 , \quad \text{where} \quad \Phi_v = \frac{1}{\beta} \sum_p \log \left[ 1 - e^{-\beta(\varepsilon_p - \mathbf{p} \cdot \mathbf{v})} \right] . $$

Expand $\Phi_v$ and show also that in the system in which the superfluid component is moving at a velocity $\mathbf{v}_s$,

$$ F = \Phi_0 + \frac{\rho_s v_s^2}{2} + \frac{\rho_n v_n^2}{2} ; \quad \mathbf{v}_n = \mathbf{v} + \mathbf{v}_s . $$

Hint: In determining the free energy $F$, note that the distribution function $n$ for the quasiparticles with energy $\varepsilon_p$ is equal to $n(\varepsilon_p - \mathbf{p} \cdot \mathbf{v})$.

4.24 Ideal Bose and Fermi gases in the canonical ensemble:
(a) Calculate the canonical partition function for ideal Bose and Fermi gases.
(b) Calculate the average occupation number in the canonical ensemble.
Suggestion: instead of $Z_N$, compute the quantity

$$ Z(x) = \sum_{N=0}^{\infty} x^N Z_N $$

and determine $Z_N$ using $Z_N = \frac{1}{2\pi i} \oint \frac{Z(x)}{x^{N+1}}\, dx$, where the path in the complex $x$ plane encircles the origin, but includes no singularities of $Z(x)$. Use the saddle-point method for evaluating the integral.

4.25 Calculate the chemical potential $\mu$ for the atomic limit of the Hubbard model,

$$ H = U \sum_{i=1}^{N} n_{i\uparrow} n_{i\downarrow} , $$

where $n_{i\uparrow} = c^\dagger_{i\uparrow} c_{i\uparrow}$ is the number operator for electrons in the state $i$ (at lattice site $i$) and $\sigma = +\frac{1}{2}$. (In the general case, which is not under consideration here, the Hubbard model is given by:

$$ H = \sum_{ij\sigma} t_{ij}\, c^\dagger_{i\sigma} c_{j\sigma} + U \sum_i n_{i\uparrow} n_{i\downarrow} . \ ) $$

5. Real Gases, Liquids, and Solutions

In this chapter, we consider real gases, that is we take the interactions of the atoms or molecules and their structures into account. In the ﬁrst section, the extension from the classical ideal gas will involve only including the internal degrees of freedom. In the second section, we consider mixtures of such ideal gases. The following sections take the interactions of the molecules into account, leading to the virial expansion and the van der Waals theory of the liquid and the gaseous phases. We will pay special attention to the transition between these two phases. In the ﬁnal section, we investigate mixtures. This chapter also contains references to every-day physics. It touches on bordering areas with applications in physical chemistry, biology, and technology.

5.1 The Ideal Molecular Gas

5.1.1 The Hamiltonian and the Partition Function

We consider a gas consisting of $N$ molecules, enumerated by the index $n$. In addition to their translational degrees of freedom, which we take to be classical as before, we now must consider the internal degrees of freedom (rotation, vibration, electronic excitation). The mutual interactions of the molecules will be neglected. The overall Hamiltonian contains the translational energy (kinetic energy of the molecular motion) and the Hamiltonian for the internal degrees of freedom $H_{i,n}$, summed over all the molecules:

$$ H = \sum_{n=1}^{N} \left( \frac{\mathbf{p}_n^2}{2m} + H_{i,n} \right) . \tag{5.1.1} $$

The eigenvalues of $H_{i,n}$ are the internal energy levels $\varepsilon_{i,n}$. The partition function is given by

$$ Z(T,V,N) = \frac{V^N}{(2\pi\hbar)^{3N} N!} \int d^3p_1 \ldots d^3p_N \, e^{-\sum_n \mathbf{p}_n^2 / 2mkT} \prod_n \sum_{\varepsilon_{i,n}} e^{-\varepsilon_{i,n}/kT} . $$

The classical treatment of the translational degrees of freedom, represented by the partition integral over momenta, is justified when the specific volume is much larger than the cube of the thermal wavelength $\lambda = 2\pi\hbar / \sqrt{2\pi mkT}$ (Chap. 4). Since the internal energy levels $\varepsilon_{i,n} \equiv \varepsilon_i$ are identical for all of the molecules, it follows that

$$ Z(T,V,N) = \frac{1}{N!} \left[ Z_{tr}(1)\, Z_i \right]^N = \frac{1}{N!} \left[ \frac{V}{\lambda^3} Z_i \right]^N , \tag{5.1.2} $$

where $Z_i = \sum_{\varepsilon_i} e^{-\varepsilon_i/kT}$ is the partition function over the internal degrees of freedom and $Z_{tr}(1)$ is the translational partition integral for a single molecule. From (5.1.2), we find the free energy, using the Stirling approximation for large $N$:

$$ F = -kT \log Z \approx -NkT \left[ 1 + \log \frac{V}{N\lambda^3} + \log Z_i \right] . \tag{5.1.3} $$

From (5.1.3), we obtain the equation of state

$$ P = -\left( \frac{\partial F}{\partial V} \right)_{T,N} = \frac{NkT}{V} , \tag{5.1.4} $$

which is the same as that of a monatomic gas, since the internal degrees of freedom do not depend on $V$. For the entropy, we have

$$ S = -\left( \frac{\partial F}{\partial T} \right)_{V,N} = Nk \left[ \frac{5}{2} + \log \frac{V}{N\lambda^3} + \log Z_i + T \frac{\partial \log Z_i}{\partial T} \right] , \tag{5.1.5a} $$

and from it, we obtain the internal energy,

$$ E = F + TS = NkT \left[ \frac{3}{2} + T \frac{\partial \log Z_i}{\partial T} \right] . \tag{5.1.5b} $$

The caloric equation of state (5.1.5b) is altered by the internal degrees of freedom compared to that of a monatomic gas. Likewise, the internal degrees of freedom express themselves in the heat capacity at constant volume,

$$ C_V = \left( \frac{\partial E}{\partial T} \right)_{V,N} = Nk \left[ \frac{3}{2} + \frac{\partial}{\partial T} \left( T^2 \frac{\partial \log Z_i}{\partial T} \right) \right] . \tag{5.1.6} $$

Finally, we give also the chemical potential for later applications:

$$ \mu = \left( \frac{\partial F}{\partial N} \right)_{T,V} = -kT \log \left( \frac{V Z_i}{N\lambda^3} \right) ; \tag{5.1.5c} $$

it agrees with $\mu = \frac{1}{N}(F + PV)$, since we are dealing with a homogeneous system. To continue the evaluation, we need to investigate the contributions due to the internal degrees of freedom. The energy levels of the internal degrees of freedom are composed of three contributions:

$$ \varepsilon_i = \varepsilon_{el} + \varepsilon_{rot} + \varepsilon_{vib} . \tag{5.1.7} $$

Here, $\varepsilon_{el}$ refers to the electronic energy including the Coulomb repulsion of the nuclei relative to the energy of widely separated atoms. $\varepsilon_{rot}$ is the rotational energy and $\varepsilon_{vib}$ is the vibrational energy of the molecules. We consider diatomic molecules containing two different atoms (e.g. HCl; for identical atoms, cf. Sect. 5.1.4). Then the rotational energy has the form$^1$

$$ \varepsilon_{rot} = \frac{\hbar^2\, l(l+1)}{2I} , \tag{5.1.8a} $$

where $l$ is the angular momentum quantum number and $I = m_{red} R_0^2$ the moment of inertia, depending on the reduced mass $m_{red}$ and the distance between the atomic nuclei, $R_0$.$^2$ The vibrational energy $\varepsilon_{vib}$ takes the form$^1$

$$ \varepsilon_{vib} = \hbar\omega \left( n + \frac{1}{2} \right) , \tag{5.1.8b} $$

where $\omega$ is the frequency of the molecular vibration and $n = 0, 1, 2, \ldots$. The electronic energy levels $\varepsilon_{el}$ can be compared to the dissociation energy $\varepsilon_{Diss}$. Since we want to consider non-dissociated molecules, i.e. we require that $kT \ll \varepsilon_{Diss}$, and on the other hand the excitation energies of the lowest electronic levels are of the same order of magnitude as $\varepsilon_{Diss}$, it follows from the condition $kT \ll \varepsilon_{Diss}$ that the electrons must be in their ground state, whose energy we denote by $\varepsilon_{el}^0$. Then we have

$$ Z_i = \exp \left( -\frac{\varepsilon_{el}^0}{kT} \right) Z_{rot}\, Z_{vib} . \tag{5.1.9} $$

We now consider in that order the rotational part $Z_{rot}$ and the vibrational part $Z_{vib}$ of the partition function.

5.1.2 The Rotational Contribution

Since the rotational energy $\varepsilon_{rot}$ (5.1.8a) does not depend on the quantum number $m$ (the $z$ component of the angular momentum), the sum over $m$ just yields a factor $(2l + 1)$, and only the sum over $l$ remains, which runs over all the natural numbers:

$$ Z_{rot} = \sum_{l=0}^{\infty} (2l+1) \exp \left( -\frac{l(l+1)\, \Theta_r}{2T} \right) . \tag{5.1.10} $$

$^1$ In general, the moment of inertia $I$ and the vibration frequency $\omega$ depend on $l$. The latter dependence leads to a coupling of the rotational and the vibrational degrees of freedom. For the following evaluation we have assumed that these dependences are weak and can be neglected.
$^2$ See e.g. QM I


Here, we have introduced the characteristic temperature

$$ \Theta_r = \frac{\hbar^2}{Ik} . \tag{5.1.11} $$

We next consider two limiting cases:

$T \ll \Theta_r$: At low temperatures, only the smallest values of $l$ contribute in (5.1.10):

$$ Z_{rot} = 1 + 3\, e^{-\Theta_r/T} + 5\, e^{-3\Theta_r/T} + O\left( e^{-6\Theta_r/T} \right) . \tag{5.1.12} $$

$T \gg \Theta_r$: At high temperatures, the sum must be carried out over all $l$ values, leading to

$$ Z_{rot} = 2 \frac{T}{\Theta_r} + \frac{1}{3} + \frac{1}{30} \frac{\Theta_r}{T} + O\left( \left( \frac{\Theta_r}{T} \right)^2 \right) . \tag{5.1.13} $$

To prove (5.1.13), one uses the Euler–MacLaurin summation formula$^3$

$$ \sum_{l=0}^{\infty} f(l) = \int_0^\infty dl\, f(l) + \frac{1}{2} f(0) + \sum_{k=1}^{n-1} \frac{(-1)^k B_k}{(2k)!} f^{(2k-1)}(0) + \text{Rest}_n , \tag{5.1.14} $$

for the special case that $f(\infty) = f'(\infty) = \ldots = 0$. The first Bernoulli numbers $B_n$ are given by $B_1 = \frac{1}{6}$, $B_2 = \frac{1}{30}$. The first term in (5.1.14) yields just the classical result

$$ \int_0^\infty dl\, f(l) = \int_0^\infty dl\, (2l+1) \exp \left( -\frac{l(l+1)}{2} \frac{\Theta_r}{T} \right) = 2 \frac{T}{\Theta_r} \int_0^\infty dx\, e^{-x} = 2 \frac{T}{\Theta_r} , \tag{5.1.15} $$

which one would also obtain by treating the rotational energy classically instead of quantum-mechanically.$^4$ The further terms are found via

$$ f(0) = 1 , \quad f'(0) = 2 - \frac{\Theta_r}{2T} , \quad f'''(0) = -6 \frac{\Theta_r}{T} + 3 \left( \frac{\Theta_r}{T} \right)^2 - \frac{1}{8} \left( \frac{\Theta_r}{T} \right)^3 , $$

from which, using (5.1.14), we obtain the expansion (5.1.13).
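The two limiting forms (5.1.12) and (5.1.13) can be compared directly with the sum (5.1.10). The following sketch does this with $t = T/\Theta_r$ as the only variable; the function names and the cutoff `l_max` are illustrative choices.

```python
import math

def z_rot_sum(t, l_max=2000):
    """Direct evaluation of the sum (5.1.10); t = T / Theta_r."""
    return sum((2 * l + 1) * math.exp(-l * (l + 1) / (2.0 * t))
               for l in range(l_max + 1))

def z_rot_high_t(t):
    """High-temperature expansion (5.1.13), truncated after the Theta_r/T term."""
    return 2.0 * t + 1.0 / 3.0 + 1.0 / (30.0 * t)

def z_rot_low_t(t):
    """Low-temperature form (5.1.12): contributions of l = 0, 1, 2 only."""
    return 1.0 + 3.0 * math.exp(-1.0 / t) + 5.0 * math.exp(-3.0 / t)

for t in (0.2, 5.0, 20.0):
    print(t, z_rot_sum(t), z_rot_high_t(t), z_rot_low_t(t))
```

At $t = 5$ the truncated expansion should already agree with the direct sum to about four decimal places, with the remainder shrinking as $(\Theta_r/T)^2$.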

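From (5.1.12) and (5.1.13), we find for the logarithm of the partition function after expanding:

$$
\log Z_{rot} =
\begin{cases}
3\, e^{-\Theta_r/T} - \dfrac{9}{2}\, e^{-2\Theta_r/T} + O\left( e^{-3\Theta_r/T} \right) & T \ll \Theta_r \\[2ex]
\log \dfrac{2T}{\Theta_r} + \dfrac{\Theta_r}{6T} + \dfrac{1}{360} \left( \dfrac{\Theta_r}{T} \right)^2 + O\left( \left( \dfrac{\Theta_r}{T} \right)^3 \right) & T \gg \Theta_r .
\end{cases}
\tag{5.1.16a}
$$

From this result, the contribution of the rotational degrees of freedom to the internal energy can be calculated:

$$
E_{rot} = NkT^2 \frac{\partial}{\partial T} \log Z_{rot} =
\begin{cases}
3Nk\, \Theta_r \left( e^{-\Theta_r/T} - 3\, e^{-2\Theta_r/T} + \ldots \right) & T \ll \Theta_r \\[2ex]
NkT \left( 1 - \dfrac{\Theta_r}{6T} - \dfrac{1}{180} \left( \dfrac{\Theta_r}{T} \right)^2 + \ldots \right) & T \gg \Theta_r .
\end{cases}
\tag{5.1.16b}
$$

The contribution to the heat capacity at constant volume is then

$$
C_V^{rot} = Nk
\begin{cases}
3 \left( \dfrac{\Theta_r}{T} \right)^2 e^{-\Theta_r/T} \left( 1 - 6\, e^{-\Theta_r/T} + \ldots \right) & T \ll \Theta_r \\[2ex]
1 + \dfrac{1}{180} \left( \dfrac{\Theta_r}{T} \right)^2 + \ldots & T \gg \Theta_r .
\end{cases}
\tag{5.1.16c}
$$

In Fig. 5.1, we show the rotational contribution to the specific heat.

$^3$ Whittaker, Watson, A Course of Modern Analysis, Cambridge University Press; V. I. Smirnow, A Course of Higher Mathematics, Pergamon Press, Oxford 1964: Vol. III, Part 2, p. 290.
$^4$ See e.g. A. Sommerfeld, Thermodynamics and Statistical Physics, Academic Press, NY 1950:
$$ Z_{rot} = \frac{4\pi I^2}{(2\pi\hbar)^2} \int d\omega_1 \int d\omega_2 \, e^{-\frac{\beta I}{2} (\omega_1^2 + \omega_2^2)} = \frac{2IkT}{\hbar^2} . $$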

Fig. 5.1. The rotational contribution to the speciﬁc heat

At low temperatures, the rotational degrees of freedom are not thermally excited. Only at T ≈ Θr /2 do the rotational levels contribute. At high temperatures, i.e. in the classical region, the two rotational degrees of freedom make a contribution of 2kT /2 to the internal energy. Only with the aid of quantum mechanics did it become possible to understand why, in contradiction to the equipartition theorem of classical physics, the speciﬁc heat per molecule can diﬀer from the number of degrees of freedom multiplied by k/2. The rotational contribution to the speciﬁc heat has a maximum of 1.1 at the temperature 0.81 Θr /2 . For HCl, Θr /2 is found to be 15.02 K.
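The position and height of the maximum quoted above can be reproduced from the level sum (5.1.10). The sketch below computes $C_V^{rot}/Nk$ from the energy fluctuations, $(\langle \varepsilon^2 \rangle - \langle \varepsilon \rangle^2)/(kT)^2$, with $\tau = T/(\Theta_r/2)$ as the temperature variable. The function names, the scan range and the cutoff `l_max` are illustrative choices.

```python
import math

def c_rot(tau, l_max=200):
    """C_V^rot / (N k) for a heteronuclear rotor; tau = T / (Theta_r / 2)."""
    z = e1 = e2 = 0.0
    for l in range(l_max + 1):
        x = l * (l + 1)                       # level energy in units of k * Theta_r / 2
        w = (2 * l + 1) * math.exp(-x / tau)  # Boltzmann weight including degeneracy
        z += w
        e1 += w * x
        e2 += w * x * x
    mean, mean_sq = e1 / z, e2 / z
    return (mean_sq - mean * mean) / tau**2   # energy variance over (kT)^2

def find_maximum(lo=0.3, hi=2.0, n=1701):
    """Grid search for the maximum of c_rot on [lo, hi]."""
    best_tau, best_c = lo, c_rot(lo)
    for i in range(n):
        tau = lo + (hi - lo) * i / (n - 1)
        c = c_rot(tau)
        if c > best_c:
            best_tau, best_c = tau, c
    return best_tau, best_c

tau_max, c_max = find_maximum()
print(tau_max, c_max)   # location in units of Theta_r/2 and height in units of N k
print(c_rot(50.0))      # approaches the classical value 1 at high temperature
```

For HCl, with $\Theta_r/2 \approx 15$ K, the maximum therefore lies near 12 K, far below room temperature, where the rotation behaves classically.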


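5.1.3 The Vibrational Contribution

We now come to the vibrational contribution, for which we introduce a characteristic temperature defined by

$$ \hbar\omega = k\Theta_v . \tag{5.1.17} $$

We obtain the well-known partition function of a harmonic oscillator

$$ Z_{vib} = \sum_{n=0}^{\infty} e^{-\left( n + \frac{1}{2} \right) \frac{\Theta_v}{T}} = \frac{e^{-\Theta_v/2T}}{1 - e^{-\Theta_v/T}} , \tag{5.1.18} $$

whose logarithm is given by $\log Z_{vib} = -\frac{\Theta_v}{2T} - \log \left( 1 - e^{-\Theta_v/T} \right)$. From it, we find for the internal energy:

$$ E_{vib} = NkT^2 \frac{\partial}{\partial T} \log Z_{vib} = Nk\, \Theta_v \left[ \frac{1}{2} + \frac{1}{e^{\Theta_v/T} - 1} \right] , \tag{5.1.19a} $$

and for the vibrational contribution to the heat capacity at constant volume

$$ C_V^{vib} = Nk\, \frac{\Theta_v^2}{T^2} \frac{e^{\Theta_v/T}}{\left( e^{\Theta_v/T} - 1 \right)^2} = Nk\, \frac{\Theta_v^2}{T^2} \frac{1}{\left[ 2 \sinh (\Theta_v/2T) \right]^2} . \tag{5.1.19b} $$

At low and high temperatures, from (5.1.19b) we obtain the limiting cases

$$
\frac{C_V^{vib}}{Nk} =
\begin{cases}
\left( \dfrac{\Theta_v}{T} \right)^2 e^{-\Theta_v/T} + \ldots & T \ll \Theta_v \\[2ex]
1 - \dfrac{1}{12} \left( \dfrac{\Theta_v}{T} \right)^2 + \ldots & T \gg \Theta_v .
\end{cases}
\tag{5.1.19c}
$$

The excited vibrational energy levels are noticeably populated only at temperatures above $\Theta_v$. The specific heat (5.1.19b) is shown in Fig. 5.2.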
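The two equivalent forms in (5.1.19b) and the limits (5.1.19c) are easy to check numerically. The sketch below works with $t = T/\Theta_v$; the function names are illustrative.

```python
import math

def c_vib(t):
    """C_V^vib / (N k) from the exponential form of (5.1.19b); t = T / Theta_v."""
    x = 1.0 / t
    return x * x * math.exp(x) / math.expm1(x)**2

def c_vib_sinh(t):
    """The same quantity written with the hyperbolic sine, as in (5.1.19b)."""
    x = 1.0 / t
    return x * x / (2.0 * math.sinh(x / 2.0))**2

def c_vib_low_t(t):
    """Leading low-temperature term of (5.1.19c)."""
    x = 1.0 / t
    return x * x * math.exp(-x)

def c_vib_high_t(t):
    """High-temperature expansion of (5.1.19c) up to order (Theta_v/T)^2."""
    return 1.0 - 1.0 / (12.0 * t * t)

for t in (0.05, 0.5, 1.0, 10.0):
    print(t, c_vib(t), c_vib_sinh(t), c_vib_low_t(t), c_vib_high_t(t))
```

At $T = \Theta_v$ the function has reached about 0.92 of its classical value, which illustrates how slowly the vibrational degree of freedom unfreezes.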

Fig. 5.2. The vibrational part of the speciﬁc heat (Eq. (5.1.19b))

The contribution of the electronic energy $\varepsilon_{el}^0$ to the partition function, free energy, internal energy, entropy, and to the chemical potential is, from (5.1.9):

$$ Z_{el} = e^{-\varepsilon_{el}^0/kT} , \quad F_{el} = N \varepsilon_{el}^0 , \quad E_{el} = N \varepsilon_{el}^0 , \quad S_{el} = 0 , \quad \mu_{el} = \varepsilon_{el}^0 . \tag{5.1.20} $$

These contributions play a role in chemical reactions, where the (outer) electronic shells of the atoms undergo complete restructuring.

In a diatomic molecular gas, there are three degrees of freedom due to translation, two degrees of freedom of rotation, and one vibrational degree of freedom, which counts double ($E = \frac{p^2}{2m} + \frac{m}{2} \omega^2 x^2$; kinetic and potential energy each contribute $\frac{1}{2} kT$). The classical specific heat is therefore $7k/2$, as is observed experimentally at high temperatures. All together, this gives the temperature dependence of the specific heat as shown in Fig. 5.3. The curve is not continued down to a temperature of $T = 0$, since there the approximation of a classical ideal gas is certainly no longer valid.

Fig. 5.3. The speciﬁc heat of a molecular gas at constant volume (schematic)

The rotational levels correspond to a wavelength of $\lambda = 0.1 - 1$ cm and lie in the far infrared and microwave regions, while the vibrational levels at wavelengths of $\lambda = 2 \times 10^{-3} - 3 \times 10^{-3}$ cm are in the infrared. The corresponding energies are $10^{-3} - 10^{-4}$ eV and $0.06 - 0.04$ eV, resp. (Fig. 5.4). One electron volt corresponds to about 11000 K (1 K $\mathrel{\hat{=}}$ $0.86171 \times 10^{-4}$ eV). Some values of $\Theta_r$ and $\Theta_v$ are collected in Table 5.1. In more complicated molecules, there are three rotational degrees of freedom and more vibrational degrees of freedom (for $n$ atoms, in general $3n - 6$ vibrational degrees of freedom, and for linear molecules, $3n - 5$). In precise experiments, the coupling between the vibrational and rotational degrees of freedom and the anharmonicities in the vibrational degrees of freedom are also detected.

                 H2      HD      D2      HCl     O2
Θr/2 [K]         85      64      43      15      2
Θv [K]           6100    5300    4300    4100    2200

Table 5.1. The values of $\Theta_r/2$ and $\Theta_v$ for several molecules

Fig. 5.4. The structure of the rotational and vibrational levels (schematic)

∗5.1.4 The Influence of the Nuclear Spin

We emphasize from the outset that here, we make the assumption that the electronic ground state has zero orbital and spin angular momenta. For nuclei A and B, which have different nuclear spins $S_A$ and $S_B$, one obtains an additional factor in the partition function, $(2S_A + 1)(2S_B + 1)$, i.e. $Z_i \to (2S_A + 1)(2S_B + 1) Z_i$. This leads to an additional term in the free energy per molecule of $-kT \log (2S_A + 1)(2S_B + 1)$, and to a contribution of $k \log (2S_A + 1)(2S_B + 1)$ to the entropy, i.e. a change of the chemical constants by $\log (2S_A + 1)(2S_B + 1)$ (see Eq. (3.9.29) and (5.2.5')). As a result, the internal energy and the specific heats remain unchanged.

For molecules such as H$_2$, D$_2$, O$_2$ which contain identical atoms, one must observe the Pauli principle. We consider the case of H$_2$, where the spin of the individual nuclei is $S_N = 1/2$.

Ortho hydrogen molecule: Nuclear spin triplet ($S_{tot} = 1$); the spatial wavefunction of the nuclei is antisymmetric ($l$ = odd (u))

Para hydrogen molecule: Nuclear spin singlet ($S_{tot} = 0$); the spatial wavefunction of the nuclei is symmetric ($l$ = even (g))

$$ Z_u = \sum_{l\ \mathrm{odd\,(u)}} (2l+1) \exp \left( -\frac{l(l+1)}{2} \frac{\Theta_r}{T} \right) \tag{5.1.21a} $$

$$ Z_g = \sum_{l\ \mathrm{even\,(g)}} (2l+1) \exp \left( -\frac{l(l+1)}{2} \frac{\Theta_r}{T} \right) . \tag{5.1.21b} $$


In complete equilibrium, we have $Z = 3Z_u + Z_g$. At $T = 0$, the equilibrium state is the ground state $l = 0$, i.e. a para state. In fact, owing to the slowness of the transition between the two spin states at $T = 0$, a mixture of ortho and para hydrogen will be obtained. At high temperatures, $Z_u \approx Z_g \approx \frac{1}{2} Z_{rot} = \frac{T}{\Theta_r}$ holds and the mixing ratio of ortho to para hydrogen is 3:1. If we start from this state and cool the sample, then, leaving ortho-para conversion out of consideration, H$_2$ consists of a mixture of two types of molecules: $\frac{3}{4} N$ ortho and $\frac{1}{4} N$ para hydrogen, and the partition function of this (metastable) non-equilibrium state is

$$ Z = (Z_u)^{3/4} (Z_g)^{1/4} . \tag{5.1.22} $$

Then for the specific heat, we obtain

$$ C_V^{rot} = \frac{3}{4} C_{Vo}^{rot} + \frac{1}{4} C_{Vp}^{rot} . \tag{5.1.23} $$

In Fig. 5.5, the rotational parts of the specific heat in the metastable state ($\frac{3}{4}$ ortho and $\frac{1}{4}$ para), as well as for the case of complete equilibrium, are shown. The establishment of equilibrium can be accelerated by using catalysts.
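The statements about the equilibrium ortho-para ratio follow from the restricted sums (5.1.21a,b). The sketch below evaluates the equilibrium ortho fraction $3Z_u/(3Z_u + Z_g)$ as a function of $t = T/\Theta_r$; the function names and the cutoff `l_max` are illustrative choices.

```python
import math

def z_parity(t, parity, l_max=400):
    """Restricted sums (5.1.21a,b): parity=1 gives Z_u (odd l), parity=0 gives Z_g (even l)."""
    return sum((2 * l + 1) * math.exp(-l * (l + 1) / (2.0 * t))
               for l in range(parity, l_max + 1, 2))

def ortho_fraction_equilibrium(t):
    """Equilibrium fraction of ortho hydrogen for Z = 3 Z_u + Z_g; t = T / Theta_r."""
    z_u, z_g = z_parity(t, 1), z_parity(t, 0)
    return 3.0 * z_u / (3.0 * z_u + z_g)

for t in (0.1, 0.5, 1.0, 2.0, 10.0):
    print(t, ortho_fraction_equilibrium(t), z_parity(t, 1), z_parity(t, 0))
```

The fraction tends to 3/4 at high temperature and to zero for $T \to 0$, which is why a sample cooled without a catalyst stays far from equilibrium.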

Fig. 5.5. The rotational part of the speciﬁc heat of diatomic molecules such as H2 : equilibrium (solid curve), metastable mixture (dashed)

In deuterium molecules, D$_2$, the nuclear spin per atom is $S = 1$,$^5$ which can couple in the molecule to ortho deuterium with a total spin of 2 or 0 and para deuterium with a total spin of 1. The degeneracy of these states is 6 and 3. The associated orbital angular momenta are even (g) and odd (u). The partition function, corresponding to Eq. (5.1.21a-b), is given by $Z = 6Z_g + 3Z_u$.

$^5$ QM I, page 187

∗5.2 Mixtures of Ideal Molecular Gases

In this section, we investigate the thermodynamic properties of mixtures of molecular gases. The different types of particles (elements), of which there are supposed to be $n$, are enumerated by the index $j$. Then $N_j$ refers to the particle number, $\lambda_j = \frac{h}{(2\pi m_j kT)^{1/2}}$ is the thermal wavelength, $c_j = \frac{N_j}{N}$ the concentration, $\varepsilon_{el,j}^0$ the electronic ground state energy, $Z_j$ the overall partition function (see (5.1.2)), and $Z_{i,j}$ the partition function for the internal degrees of freedom of the particles of type $j$. Here, in contrast to (5.1.9), the electronic part is separated out. The total number of particles is $N = \sum_j N_j$. The overall partition function of this non-interacting system is

$$ Z = \prod_{j=1}^{n} Z_j , \tag{5.2.1} $$

and from it we find the free energy

$$ F = -kT \sum_j N_j \left[ 1 + \log \frac{V Z_{i,j}}{N_j \lambda_j^3} \right] + \sum_j \varepsilon_{el,j}^0 N_j . \tag{5.2.2} $$

From (5.2.2), we obtain the pressure, $P = -\left( \frac{\partial F}{\partial V} \right)_{T,\{N_j\}}$,

$$ P = \frac{kT}{V} \sum_j N_j = \frac{kTN}{V} . \tag{5.2.3} $$

The equation of state (5.2.3) is identical to that of the monatomic ideal gas, since the pressure is due to the translational degrees of freedom. For the chemical potential $\mu_j$ of the component $j$ (Sect. 3.9.1), we find

$$ \mu_j = \left( \frac{\partial F}{\partial N_j} \right)_{T,V} = -kT \log \frac{V Z_{i,j}}{N_j \lambda_j^3} + \varepsilon_{el,j}^0 ; \tag{5.2.4} $$

or, if we use the pressure from (5.2.3) instead of the volume,

$$ \mu_j = -kT \log \frac{kT\, Z_{i,j}}{c_j P \lambda_j^3} + \varepsilon_{el,j}^0 . \tag{5.2.4'} $$

We now assume that the rotational degrees of freedom are completely unfrozen, but not the vibrational degrees of freedom ($\Theta_r \ll T \ll \Theta_v$). Then inserting $Z_{i,j} = Z_{rot,j} = \frac{2T}{\Theta_{r,j}}$ (see Eq. (5.1.13)) into (5.2.4') yields

$$ \mu_j = \varepsilon_{el,j}^0 - \frac{7}{2} kT \log kT - kT \log \frac{m_j^{3/2}}{2^{1/2} \pi^{3/2} \hbar^3 k \Theta_{r,j}} + kT \log c_j P . \tag{5.2.5} $$

We have taken the fact that the masses and the characteristic temperatures depend on the type of particle $j$ into consideration here. The pressure enters the chemical potential of the component $j$ in the combination $c_j P = P_j$ (partial pressure). The chemical potential (5.2.5) is a special case of the general form

$$ \mu_j = \varepsilon_{el,j}^0 - c_{P,j} T \log kT - kT \zeta_j + kT \log c_j P . \tag{5.2.5'} $$
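As a consistency check of (5.2.5) against (5.2.4'), the sketch below evaluates both expressions, measured from $\varepsilon_{el,j}^0$, for a single-component gas in SI units. The example uses HCl with $\Theta_r = 30$ K (twice the value of $\Theta_r/2$ in Table 5.1); the molecular mass of 36.46 u, the temperature and the pressure are assumed inputs chosen for illustration. The logarithms of dimensional quantities are evaluated with SI numerical values, so only their sum is meaningful.

```python
import math

K_B = 1.380649e-23        # J/K
HBAR = 1.054571817e-34    # J s
H_PLANCK = 2.0 * math.pi * HBAR
AMU = 1.66053907e-27      # kg

def mu_general(T, P_j, m, theta_r):
    """Eq. (5.2.4') minus eps_el^0, with Z_i = 2T/Theta_r; P_j = c_j P is the partial pressure."""
    lam = H_PLANCK / math.sqrt(2.0 * math.pi * m * K_B * T)   # thermal wavelength
    z_int = 2.0 * T / theta_r
    return -K_B * T * math.log(K_B * T * z_int / (P_j * lam**3))

def mu_explicit(T, P_j, m, theta_r):
    """Eq. (5.2.5) minus eps_el^0, term by term."""
    kT = K_B * T
    const = m**1.5 / (math.sqrt(2.0) * math.pi**1.5 * HBAR**3 * K_B * theta_r)
    return -3.5 * kT * math.log(kT) - kT * math.log(const) + kT * math.log(P_j)

# Illustrative inputs (assumptions): HCl at 300 K and atmospheric pressure
T, P, M_HCL, THETA_R = 300.0, 101325.0, 36.46 * AMU, 30.0
print(mu_general(T, P, M_HCL, THETA_R) / (K_B * T))
print(mu_explicit(T, P, M_HCL, THETA_R) / (K_B * T))
```

Both lines should print the same negative number of order ten, the chemical potential in units of $kT$; the negative sign reflects the strongly non-degenerate (classical) regime.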

For diatomic molecules in the temperature range mentioned above, $c_{P,j} = 7k/2$. The $\zeta_j$ are called chemical constants; they enter into the law of mass action (see Chap. 3.9.3). For the entropy, we find

$$ S = -\sum_j \left( \frac{\partial \mu_j}{\partial T} \right)_{P,\{N_i\}} N_j = \sum_j \left( c_{P,j} \log kT + c_{P,j} + k\zeta_j - k \log c_j P \right) N_j , \tag{5.2.6} $$

from which one can see that the coefficient $c_{P,j}$ is the specific heat at constant pressure of the component $j$.

Remarks to Sections 5.1 and 5.2: In the preceding sections, we have described the essential effects of the internal degrees of freedom of molecular gases. We now add some supplementary remarks about additional effects which depend upon the particular atomic structure.

(i) We first consider monatomic gases. The only internal degrees of freedom are electronic. In the noble gases, the electronic ground state has $L = S = 0$ and is thus not degenerate. The excited levels lie about 20 eV above the ground state, corresponding to a temperature of 200,000 K higher; in practice, they are therefore not thermally populated, and all the atoms remain in their ground state. One can also say that the electronic degrees of freedom are "frozen out". The nuclear spin $S_N$ leads to a degeneracy factor $(2S_N + 1)$. Relative to pointlike classical particles, the partition function contains an additional factor $(2S_N + 1)\, e^{-\varepsilon_0/kT}$, which gives rise to a contribution to the free energy of $\varepsilon_0 - kT \log (2S_N + 1)$. This leads to an additional term of $k \log (2S_N + 1)$ in the entropy, but not to a change in the specific heat.

(ii) The excitation energies of other atoms are not as high as in the case of the noble gases, e.g. 2.1 eV for Na, or 24,000 K, but still, the excited states are not thermally populated. When the electronic shell of the atom has a nonzero $S$, but still $L = 0$, this leads together with the nuclear spin to a degeneracy factor of $(2S + 1)(2S_N + 1)$. The free energy then contains the additional term $\varepsilon_0 - kT \log ((2S_N + 1)(2S + 1))$ with the consequences discussed above. Here, to be sure, one must consider the magnetic interaction between the nuclear and the electronic moments, which leads to the hyperfine interaction. This is e.g. in hydrogen of the order of $6 \times 10^{-6}$ eV, leading to the well-known 21 cm line. The corresponding characteristic temperature is 0.07 K. The hyperfine splitting can therefore be completely neglected in the gas phase.

(iii) In the case that both the spin $S$ and the orbital angular momentum $L$ are nonzero, the ground state is $(2S + 1)(2L + 1)$-fold degenerate; this degeneracy is partially lifted by the spin-orbit coupling. The energy eigenvalues depend on the total angular momentum $J$, which takes on values between $S + L$ and $|S - L|$. For example, monatomic halogens in their ground state have $S = \frac{1}{2}$ and $L = 1$, according to Hund's first two rules. Because of the spin-orbit coupling, in the ground


state $J = 3/2$, and the levels with $J = 1/2$ have a higher energy. For chlorine, for example, the doubly degenerate $^2P_{1/2}$ level lies $\delta\varepsilon = 0.11$ eV above the fourfold degenerate $^2P_{3/2}$ ground-state level. This corresponds to a temperature of $\delta\varepsilon/k = 1270$ K. The partition function now contains a factor $Z_{\rm el} = 4\,e^{-\varepsilon_0/kT} + 2\,e^{-(\varepsilon_0+\delta\varepsilon)/kT}$ due to the internal fine-structure degrees of freedom, which leads to an additional term in the free energy of $-kT\log Z_{\rm el} = \varepsilon_0 - kT\log\left(4 + 2\,e^{-\delta\varepsilon/kT}\right)$. This yields the following electronic contribution to the specific heat:

$$C_V^{\rm el} = N k\,\frac{2\left(\frac{\delta\varepsilon}{kT}\right)^2 e^{\delta\varepsilon/kT}}{\left(2\,e^{\delta\varepsilon/kT} + 1\right)^2} .$$

For $T \ll \delta\varepsilon/k$, apart from the factor $e^{-\varepsilon_0/kT}$, $Z_{\rm el} = 4$; only the four lowest levels are populated, and $C_V^{\rm el} = 0$. For $T \gg \delta\varepsilon/k$, correspondingly $Z_{\rm el} = 6$, and all six levels are equally occupied, so that again $C_V^{\rm el} = 0$. For temperatures between these extremes, $C_V^{\rm el}$ passes through a maximum at a temperature of the order of $\delta\varepsilon/k$. Both at low and at high temperatures, the fine-structure levels express themselves only in the degeneracy factors, but do not contribute to the specific heat. One should note however that monatomic Cl is present only at very high temperatures, and otherwise bonds to give Cl$_2$.

(iv) In diatomic molecules, in many cases the lowest electronic state is not degenerate and the excited electronic levels are far from $\varepsilon_0$. The internal partition function contains only the factor $e^{-\varepsilon_0/kT}$ due to the electrons. There are, however, molecules which have a finite orbital angular momentum Λ or spin. This is the case in NO, for example. Since the orbital angular momentum has two possible orientations relative to the molecular axis, a factor of 2 in the partition function results. A finite electronic spin leads to a factor $(2S+1)$. For $S \neq 0$ and $\Lambda \neq 0$, there are again fine-structure effects which can be of the right order of magnitude to influence the thermodynamic properties. The resulting expressions take the same form as those in Remark (iii). A special case is that of the oxygen molecule, O$_2$. Its ground state $^3\Sigma$ has zero orbital angular momentum and spin $S = 1$; it is thus a triplet without fine structure. The first excited level $^1\Delta$ is doubly degenerate and lies relatively near at $\delta\varepsilon = 0.97$ eV, corresponding to 11300 K, so that it can be populated at high temperatures. These electronic configurations lead to a factor of $e^{-\varepsilon_0/kT}\left(3 + 2\,e^{-\delta\varepsilon/kT}\right)$ in the partition function, with the consequences discussed in Remark (iii).
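The temperature dependence of the fine-structure contribution in Remark (iii) can be made concrete with a short numerical sketch. It evaluates the specific-heat formula per atom in units of k for the chlorine value δε/k = 1270 K quoted above; the function names and the coarse temperature grid are illustrative choices, not part of the text.

```python
import math

DELTA_EPS_OVER_K = 1270.0  # K, fine-structure splitting of Cl quoted in Remark (iii)

def c_el(x):
    """Electronic specific heat per atom in units of k, with x = delta_eps / (k T).

    Levels: 4-fold degenerate ground level, 2-fold degenerate excited level.
    """
    ex = math.exp(x)
    return 2.0 * x * x * ex / (2.0 * ex + 1.0) ** 2

def c_el_at_temperature(T):
    """Same quantity as a function of the temperature T in kelvin."""
    return c_el(DELTA_EPS_OVER_K / T)

# Coarse scan from 50 K to 10000 K to locate the maximum (illustration only).
temps = [50.0 * i for i in range(1, 201)]
T_max = max(temps, key=c_el_at_temperature)

if __name__ == "__main__":
    print("maximum near T =", T_max, "K, c_el/k =", c_el_at_temperature(T_max))
    print("low-T limit :", c_el_at_temperature(50.0))
    print("high-T limit:", c_el_at_temperature(10000.0))
```

On this grid the maximum lies somewhat below δε/k, and the contribution vanishes in both limits, in line with the discussion above.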

5.3 The Virial Expansion

5.3.1 Derivation

We now investigate a real gas, in which the particles interact with each other. In this case, the partition function can no longer be exactly calculated. For its evaluation, as a first step we will describe the virial expansion, an expansion in terms of the density. The grand partition function $Z_G$ can be decomposed into the contributions for 0, 1, 2, etc. particles

$$Z_G = {\rm Tr}\, e^{-(H-\mu N)/kT} = 1 + Z(T,V,1)\,e^{\mu/kT} + Z(T,V,2)\,e^{2\mu/kT} + \ldots , \tag{5.3.1}$$

where $Z_N \equiv Z(T,V,N)$ represents the partition function for N particles.


From it, we obtain the grand potential, making use of the Taylor series expansion of the logarithm

$$\Phi = -kT\log Z_G = -kT\left[Z_1\,e^{\mu/kT} + \left(Z_2 - \frac{1}{2}Z_1^2\right)e^{2\mu/kT} + \ldots\right] , \tag{5.3.2}$$

where the logarithm has been expanded in powers of the fugacity $z = e^{\mu/kT}$. Taking the derivative of (5.3.2) with respect to the chemical potential, we obtain the mean particle number

$$\bar N = -\left(\frac{\partial\Phi}{\partial\mu}\right)_{T,V} = Z_1\,e^{\mu/kT} + 2\left(Z_2 - \frac{1}{2}Z_1^2\right)e^{2\mu/kT} + \ldots . \tag{5.3.3}$$

Eq. (5.3.3) can be solved iteratively for $e^{\mu/kT}$, with the result

$$e^{\mu/kT} = \frac{\bar N}{Z_1} - \frac{2\left(Z_2 - \frac{1}{2}Z_1^2\right)}{Z_1}\left(\frac{\bar N}{Z_1}\right)^2 + \ldots . \tag{5.3.4}$$

Eq. (5.3.4) represents a series expansion of $e^{\mu/kT}$ in terms of the density, since $Z_1 \sim V$. Inserting (5.3.4) into Φ has the effect that Φ is given in terms of $T, V, \bar N$ instead of its natural variables $T, V, \mu$, which is favorable for constructing the equation of state:

$$\Phi = -kT\left[\bar N - \left(Z_2 - \frac{1}{2}Z_1^2\right)\frac{\bar N^2}{Z_1^2} + \ldots\right] . \tag{5.3.5}$$

These are the first terms of the so-called virial expansion. By application of the Gibbs–Duhem relation $\Phi = -PV$, one can go from it directly to the expansion of the equation of state in terms of the particle number density $\rho = \bar N/V$:

$$P = kT\rho\left[1 + B(T)\rho + C(T)\rho^2 + \ldots\right] . \tag{5.3.6}$$

The coefficient of $\rho^n$ in square brackets is called the (n+1)th virial coefficient. The leading correction to the equation of state of an ideal gas is determined by the second virial coefficient

$$B = -\left(Z_2 - \frac{1}{2}Z_1^2\right)V/Z_1^2 . \tag{5.3.7}$$

This expression holds both in classical and in quantum mechanics.

Note: in the classical limit the integrations over momentum can be carried out, and (5.3.1) is simplified as follows:

$$Z_G(T,V,\mu) = \sum_{N=0}^{\infty}\frac{e^{\beta\mu N}}{N!\,\lambda^{3N}}\,Q(T,V,N) . \tag{5.3.8}$$


Here, $Q(T,V,N)$ is the configurational part of the partition function $Z(T,V,N)$:

$$Q(T,V,N) = \int_V d^{3N}x\; e^{-\beta\sum_{i<j}v_{ij}} = \int_V d^{3N}x\,\prod_{i<j}(1 + f_{ij}) = \int_V d^{3N}x\,\left[1 + (f_{12} + f_{13} + \ldots) + (f_{12}f_{13} + \ldots) + \ldots\right] \tag{5.3.9}$$

with $f_{ij} = e^{-\beta v_{ij}} - 1$. In this expression, $\sum_{i<j} \equiv \frac{1}{2}\sum_i\sum_{j\neq i}$ refers to the sum over all pairs of particles. One can see from this that the virial expansion represents an expansion in terms of $r_0^3/v$, where $r_0$ is the range of the potential. The classical expansion is valid for $\lambda \ll r_0 \ll v^{1/3}$; see Eqs. (B.39a) and (B.39b) in Appendix B. Equation (5.3.9) can be used as the basis of a systematic graph-theoretical expansion (Ursell and Mayer 1939).
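The iterative inversion that leads from (5.3.3) to (5.3.4) can be checked numerically. The following sketch truncates both series at second order and verifies that inserting the mean particle number back into (5.3.4) reproduces the fugacity up to terms of third order. The values chosen for $Z_1$ and $Z_2$ are purely hypothetical numbers for illustration.

```python
def mean_number(z, Z1, Z2):
    """Series (5.3.3) truncated after the z^2 term."""
    return Z1 * z + 2.0 * (Z2 - 0.5 * Z1 ** 2) * z ** 2

def fugacity_from_number(N, Z1, Z2):
    """Iterated solution (5.3.4), correct to second order in N/Z1."""
    return N / Z1 - 2.0 * (Z2 - 0.5 * Z1 ** 2) / Z1 * (N / Z1) ** 2

# Hypothetical partition-function values (assumption): Z2 - Z1^2/2 = -10.
Z1, Z2 = 100.0, 4990.0

errors = []
for z in (1e-2, 1e-3, 1e-4):
    N = mean_number(z, Z1, Z2)
    errors.append(abs(fugacity_from_number(N, Z1, Z2) - z))

if __name__ == "__main__":
    # Reducing z by a factor of 10 should reduce the error by about 10^3.
    print(errors)
```

The cubic scaling of the residual is what one expects when the next term of the series is the first one neglected.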

5.3.2 The Classical Approximation for the Second Virial Coefficient

In the case of a classical gas, one finds for the partition function for N particles

$$Z_N = \frac{1}{N!\,h^{3N}}\int d^3p_1\ldots d^3p_N\int d^3x_1\ldots d^3x_N\; e^{-\left(\sum_i p_i^2/2m + v(x_1,\ldots,x_N)\right)/kT} ; \tag{5.3.10a}$$

after integrating over the 3N momenta, this becomes

$$Z_N = \frac{1}{\lambda^{3N}N!}\int d^3x_1\ldots d^3x_N\; e^{-v(x_1,\ldots,x_N)/kT} , \tag{5.3.10b}$$

where $v(x_1,\ldots,x_N)$ is the total potential of the N particles. The integrals over $x_i$ are restricted to the volume V. If no external potential is present, and the system is translationally invariant, so that the two-particle interaction depends only upon $x_1 - x_2$, we find from (5.3.10b)

$$Z_1 = \frac{1}{\lambda^3}\int d^3x_1\, e^0 = \frac{V}{\lambda^3} \tag{5.3.11a}$$

and

$$Z_2 = \frac{1}{2\lambda^6}\int d^3x_1\, d^3x_2\; e^{-v(x_1-x_2)/kT} = \frac{V}{2\lambda^6}\int d^3y\; e^{-v(y)/kT} . \tag{5.3.11b}$$

This gives for the second virial coefficient (5.3.7):

$$B = -\frac{1}{2}\int d^3y\, f(y) = -\frac{1}{2}\int d^3y\,\left(e^{-v(y)/kT} - 1\right) \tag{5.3.12}$$

with $f(y) = e^{-v(y)/kT} - 1$. To proceed, we now require the two-particle potential $v(y)$, also known as the pair potential. In Fig. 5.6, as an example, the Lennard–Jones potential is shown; it finds applications in theoretical models for the description of gases and liquids and it is defined in Eq. (5.3.16).


Fig. 5.6. The Lennard–Jones potential as an example of a pair potential v(y), Eq. (5.3.16)

5.3.2.1 A Qualitative Estimate of B(T)

A typical characteristic of realistic potentials is the strong increase for overlapping atomic shells and the attractive interaction at larger distances. A typical shape is shown in Fig. 5.7. Up to the so-called 'hard-core' radius σ, the potential is infinite, and outside this radius it is weakly negative. Thus the shape of $f(r)$ as shown in Fig. 5.7 results. If we can now assume that in the region of the negative potential, $|v(x)|/kT \ll 1$, then we find for the function in (5.3.12)

$$f(x) = \begin{cases} -1 & |x| < \sigma \\[4pt] -\dfrac{v(x)}{kT} & |x| \ge \sigma \end{cases} . \tag{5.3.13}$$

From this, we obtain the second virial coefficient:

$$B(T) \approx -\frac{1}{2}\left[-\frac{4\pi}{3}\sigma^3 + 4\pi\int_\sigma^\infty dr\, r^2\,\frac{(-v(r))}{kT}\right] = b - \frac{a}{kT} , \tag{5.3.14}$$

where

$$b = \frac{2\pi}{3}\sigma^3 = 4\,\frac{4\pi}{3}\,r_0^3 \tag{5.3.15a}$$

denotes the fourfold molecular volume (for hard spheres of radius $r_0$, $\sigma = 2r_0$), and

$$a = -2\pi\int_\sigma^\infty dr\, r^2\, v(r) = -\frac{1}{2}\int d^3x\; v(x)\,\Theta(r - \sigma) . \tag{5.3.15b}$$

The result (5.3.14) for B(T ) is drawn in Fig. 5.8. In fact, B(T ) decreases again at higher temperatures, since the potential in Nature, unlike the artiﬁcial case of inﬁnitely hard spheres, is not inﬁnitely high (see Fig. 5.9). Remark: From the experimental determination of the temperature dependence of the virial coeﬃcients, we can gain information about the potential.
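To see how well the estimate $B \approx b - a/kT$ works, one can compare it with a direct numerical evaluation of (5.3.12) for a simple model. The sketch below assumes, purely for illustration, a hard core of radius σ with an attractive tail $v(r) = -\varepsilon(\sigma/r)^6$ for $r > \sigma$; for this choice (5.3.15b) gives $a = \varepsilon b$. Lengths are measured in units of σ, $t^* = kT/\varepsilon$, and the integral is cut off at a finite radius.

```python
import math

def b_exact_reduced(t_star, n=20000, r_max=50.0):
    """B/b from (5.3.12) for a hard core plus the tail v = -eps (sigma/r)^6.

    Uses B/b = 1 - 3 * integral_1^r_max x^2 (exp(x^-6 / t_star) - 1) dx,
    evaluated with the midpoint rule (x = r / sigma).
    """
    h = (r_max - 1.0) / n
    integral = 0.0
    for i in range(n):
        x = 1.0 + (i + 0.5) * h
        integral += x * x * (math.exp(x ** -6 / t_star) - 1.0) * h
    return 1.0 - 3.0 * integral

def b_approx_reduced(t_star):
    """Estimate (5.3.14) in the same units: B/b = 1 - a/(b kT) = 1 - 1/t_star."""
    return 1.0 - 1.0 / t_star

if __name__ == "__main__":
    for t in (1.0, 2.0, 10.0):
        print(t, b_exact_reduced(t), b_approx_reduced(t))
```

For this model the two expressions agree closely when $kT \gg \varepsilon$ and drift apart as $kT$ approaches ε, which is the regime where the linearization in (5.3.13) is no longer justified.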


Fig. 5.7. A typical pair potential v(r) (solid curve) and the associated f (r) (dashed).

Fig. 5.8. The second virial coeﬃcient from the approximate relation (5.3.14)

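Examples:

Lennard–Jones potential ((12-6)-potential):

$$v(r) = 4\varepsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right] . \tag{5.3.16}$$

exp-6-potential:

$$v(r) = \varepsilon\left[\exp\left(\frac{a - r}{\sigma_1}\right) - \left(\frac{\sigma_2}{r}\right)^{6}\right] . \tag{5.3.17}$$

The exp-6-potential is a special case of the so-called Buckingham potential, which also contains a term $\propto -r^{-8}$.

5.3.2.2 The Lennard–Jones Potential

We will now discuss the second virial coefficient in the case of a Lennard–Jones potential

$$v(r) = 4\varepsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right] .$$

It proves expedient to introduce the dimensionless variables $r^* = r/\sigma$ and $T^* = kT/\varepsilon$. Integrating (5.3.12) by parts yields

$$B(T) = \frac{2\pi}{3}\sigma^3\,\frac{4}{T^*}\int_0^\infty dr^*\, r^{*2}\left[\frac{12}{r^{*12}} - \frac{6}{r^{*6}}\right] e^{-\frac{4}{T^*}\left[\frac{1}{r^{*12}} - \frac{1}{r^{*6}}\right]} . \tag{5.3.18}$$

Expansion of the factor $\exp\left(\frac{4}{T^* r^{*6}}\right)$ in terms of $\frac{4}{T^* r^{*6}}$ leads to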

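$$B(T) = -\frac{2\pi}{3}\sigma^3\sum_{j=0}^{\infty}\frac{2^{\,j-3/2}}{j!}\,\Gamma\!\left(\frac{2j-1}{4}\right)T^{*\,-(2j+1)/4} = \frac{2\pi}{3}\sigma^3\left[\frac{1.73}{T^{*1/4}} - \frac{2.56}{T^{*3/4}} - \frac{0.87}{T^{*5/4}} - \ldots\right] \tag{5.3.19}$$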


Fig. 5.9. The reduced second virial coefficient $B^* = 3B/2\pi L\sigma^3$ for the Lennard–Jones potential. L denotes the Loschmidt number (Avogadro's number, $L = 6.0221367\cdot10^{23}$ mol$^{-1}$); after Hirschfelder et al.⁶ and R. J. Lunbeck, Dissertation, Amsterdam 1950

(see Hirschfelder et al.⁶ Eq. (3.63)); the series converges quickly at large $T^*$. In Fig. 5.9, the reduced second virial coefficient is shown as a function of $T^*$.

Remarks:
(i) The agreement for the noble gases Ne, Ar, Kr, Xe after adjustment of σ and ε is good.
(ii) At $T^* > 100$, the decrease in B(T) is experimentally somewhat greater than predicted by the Lennard–Jones interaction (i.e. the repulsion is weaker).
(iii) An improved fit to the experimental values is obtained with the exp-6-potential (5.3.17).
(iv) The possibility of representing the second virial coefficients for classical gases in a unified form by introducing dimensionless quantities is an expression of the so-called law of corresponding states (see Sect. 5.4.3).
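The series (5.3.19) can be cross-checked against a direct numerical integration of (5.3.12) for the Lennard–Jones potential. The sketch below works with the reduced coefficient $B^* = B/(2\pi\sigma^3/3)$ as a function of $T^*$; the number of series terms, the integration grid and the cutoff radius are ad hoc choices, so the numbers should be read as approximate.

```python
import math

def b_star_series(t_star, terms=60):
    """Reduced second virial coefficient from the series (5.3.19)."""
    total = 0.0
    for j in range(terms):
        coeff = 2.0 ** (j - 1.5) / math.factorial(j)
        total += coeff * math.gamma((2 * j - 1) / 4.0) * t_star ** (-(2 * j + 1) / 4.0)
    return -total

def b_star_numeric(t_star, n=200000, r_max=40.0):
    """Reduced B from (5.3.12): B* = -3 * integral_0^r_max x^2 f(x) dx (midpoint rule)."""
    h = r_max / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        inv6 = x ** -6
        u = 4.0 * (inv6 * inv6 - inv6) / t_star   # v(r)/kT in reduced units
        total += x * x * math.expm1(-u) * h       # f = exp(-v/kT) - 1
    return -3.0 * total

if __name__ == "__main__":
    for t in (1.0, 3.0, 4.0, 10.0):
        print(t, b_star_series(t), b_star_numeric(t))
```

Both routes give a negative $B^*$ at low $T^*$ and a sign change between $T^* = 3$ and $T^* = 4$, consistent with the shape of the curve in Fig. 5.9.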

⁶ J. O. Hirschfelder, C. F. Curtiss and R. B. Bird, Molecular Theory of Gases and Liquids, John Wiley and Sons, Inc., New York 1954

5.3.3 Quantum Corrections to the Virial Coefficients

The quantum-mechanical expression for the second virial coefficient B(T) is given by (5.3.7), where the partition functions occurring there are to be computed quantum mechanically. The quantum corrections to B(T) and the other virial coefficients are of two kinds: There are corrections which result from statistics (Bose or Fermi statistics). In addition, there are corrections which arise from the non-commutativity of quantum mechanical observables. The corrections due to statistics are of the order of

$$B = \mp\frac{\lambda^3}{2^{5/2}} \propto \hbar^3 \quad\text{for}\quad \begin{cases}\text{bosons}\\ \text{fermions}\end{cases} \tag{5.3.20}$$

as one can see from Sect. 4.2 or Eq. (B.43). The interaction quantum corrections, according to Eq. (B.46), take the form

$$B_{\rm qm} = \frac{\hbar^2}{24\,m\,(kT)^3}\int d^3y\; e^{-v(y)/kT}\left(\nabla v(y)\right)^2 , \tag{5.3.21}$$

and are thus of the order of $\hbar^2$. The lowest-order correction given in (5.3.21) results from the non-commutativity of $p^2$ and $v(x)$. We show in Appendix B.33 that the second virial coefficient can be related to the time which the colliding particles spend within their mutual potential. The shorter this time, the more closely the gas obeys the classical equation of state for an ideal gas.

5.4 The Van der Waals Equation of State

5.4.1 Derivation

We now turn to the derivation of the equation of state of a classical, real (i.e. interacting) gas. We assume that the interactions of the gas atoms (molecules) consist only of a two-particle potential, which can be decomposed into a hard-core (H.C.) part, $v_{\rm H.C.}(y)$ for $|y| \le \sigma$, and an attractive part, $w(y)$ (see Fig. 5.7):

$$v(y) = v_{\rm H.C.}(y) + w(y) . \tag{5.4.1}$$

The expression "hard core" means that the gas molecules repel each other at short distances like impenetrable hard spheres, which is in fact approximately the case in Nature. Our task is now to determine the partition function, for which after carrying out the integrations over momenta we obtain

$$Z(T,V,N) = \frac{1}{\lambda^{3N}N!}\int d^3x_1\ldots d^3x_N\; e^{-\sum_{i<j}v(x_i - x_j)/kT} . \tag{5.4.2}$$

We still have to compute the configurational part. This can of course not be carried out exactly, but instead contains some intuitive approximations. Let us first ignore the attractive interaction and consider only the hard-core potential. This yields in the partition function for many particles:

$$\int d^3x_1\ldots d^3x_N\; e^{-\sum_{i<j}v_{\rm H.C.}(x_{ij})/kT} \approx (V - V_0)^N . \tag{5.4.3}$$

This result can be made plausible as follows: if the hard-core radius were zero, σ = 0, then the integration in (5.4.3) would give simply $V^N$; for a finite σ, each particle has only $V - V_0$ available, where $V_0$ is the volume occupied by the other N − 1 particles. This is not exact, since the size of the free volume $(V - V_0)$ depends on the configuration, as can be seen from Fig. 5.10. In (5.4.3), $V_0$ is to be understood as the occupied volume for typical configurations which have a large statistical weight. Then, one can imagine carrying out the integrations in (5.4.3) successively, obtaining a factor $V - V_0$ for each particle.

Referring to Fig. 5.10, we can find the following bounds for $V_0$ with a particle number N: the smallest $V_0$ is obtained for spherical closest packing, $V_0^{\rm min} = 4\sqrt{2}\,r_0^3 N = 5.65\,r_0^3 N$. The largest $V_0$ is found when the spheres of radius $2r_0$ do not overlap, i.e. $V_0^{\rm max} = 8\,\frac{4\pi}{3}\,r_0^3 N = 33.51\,r_0^3 N$. The actual $V_0$ will lie between these extremes and can be determined as below from the comparison with the virial expansion, namely $V_0 = bN = 4\,\frac{4\pi}{3}\,r_0^3 N = 16.75\,r_0^3 N$.

Using (5.4.3), we can cast the partition function (5.4.2) in the form

$$Z(T,V,N) = \frac{(V - V_0)^N}{\lambda^{3N}N!}\;\frac{\int d^3x_1\ldots d^3x_N\; e^{-{\rm H.C.}}\; e^{-\sum_{i<j}w(x_i - x_j)/kT}}{\int d^3x_1\ldots d^3x_N\; e^{-{\rm H.C.}}} . \tag{5.4.4}$$

Here, H.C. stands for the sum of all contributions from the hard-core potential divided by kT. The second fraction can be interpreted as the average of $\exp\left\{-\sum_{i<j}w(x_i - x_j)/kT\right\}$ in a gas which experiences only hard-core interactions. Before we treat this in more detail, we want to consider the second exponent more closely. For potentials whose range is much greater than σ and the distance between particles, it follows approximately that the potential acting on j due to the other particles,

Fig. 5.10. Two conﬁgurations of three atoms within the volume V . In the ﬁrst conﬁguration, V0 is larger than in the second. The center of gravity of an additional atom must be located outside the dashed circles. In the second conﬁguration (closer packing), there will be more space for an additional atom (spheres of radius r0 are represented by solid circles, spheres of radius σ = 2r0 by dashed circles)


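$$\sum_{i\,(\neq j)} w(x_i - x_j) \approx (N - 1)\int\frac{d^3x}{V}\,w(x) ,$$

i.e. the sum over all pairs

$$\sum_{i<j} w(x_i - x_j) \equiv \frac{1}{2}\sum_i\sum_{j\neq i} w(x_i - x_j) \approx \frac{1}{2}N(N - 1)\,\bar w \approx \frac{1}{2}N^2\,\bar w \tag{5.4.5a}$$

with

$$\bar w = \frac{1}{V}\int d^3x\; w(x) \equiv -\frac{2a}{V} . \tag{5.4.5b}$$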

Thus we find for the partition function

$$Z(T,V,N) = \frac{(V - V_0)^N}{\lambda^{3N}N!}\; e^{-\frac{N(N-1)}{2}\frac{\bar w}{kT}} = \frac{(V - V_0)^N}{\lambda^{3N}N!}\; e^{\frac{N^2 a}{V kT}} . \tag{5.4.6}$$

In this calculation, the attractive part of the potential was replaced by its average value. Here, as in the molecular field theory for ferromagnetism which will be treated in the next chapter, we are using an "average-potential approximation".

Before we discuss the thermodynamic consequences of (5.4.6), we return once more to (5.4.4) and the note which followed it. The last factor can be written using a cumulant expansion, Eq. (1.2.16), in the form

$$\left\langle e^{-\sum_{i<j}w(x_i - x_j)/kT}\right\rangle_{\rm H.C.} = \exp\left\{-\left\langle\sum_{i<j}w(x_i - x_j)/kT\right\rangle_{\rm H.C.} + \frac{1}{2}\left[\left\langle\Big(\sum_{i<j}w(x_i - x_j)/kT\Big)^2\right\rangle_{\rm H.C.} - \left\langle\sum_{i<j}w(x_i - x_j)/kT\right\rangle_{\rm H.C.}^2\right] + \ldots\right\} . \tag{5.4.7}$$

The average values $\langle\;\rangle_{\rm H.C.}$ are taken with respect to the canonical distribution function of the total hard-core potential. Therefore, $\left\langle\sum_{i<j}w(x_i - x_j)\right\rangle_{\rm H.C.}$ refers to the average of the attractive potential in the "free" volume allowed by the interaction of hard spheres. Under the assumption made earlier that the range is much greater than the hard-core radius σ and the particle distance, we again find (5.4.5a,b) and (5.4.6). The second term in the cumulant series (5.4.7) represents the mean square deviation of the attractive interactions. The higher the temperature, the more dominant the term $\bar w/kT$ becomes.

From (5.4.6), using $N! \simeq N^N e^{-N}\sqrt{2\pi N}$, we obtain the free energy,

$$F = -kTN\log\frac{e\,(V - V_0)}{\lambda^3 N} - \frac{N^2 a}{V} , \tag{5.4.8}$$

the pressure (the thermal equation of state),

$$P = -\left(\frac{\partial F}{\partial V}\right)_{T,N} = \frac{kTN}{V - V_0} - \frac{N^2 a}{V^2} , \tag{5.4.9}$$

and, with $E = -T^2\left(\frac{\partial}{\partial T}\frac{F}{T}\right)_{V,N}$, the internal energy (caloric equation of state),

$$E = \frac{3}{2}NkT - \frac{N^2 a}{V} . \tag{5.4.10}$$

Finally, we can relate $V_0$ to the second virial coefficient. To do this, we expand (5.4.9) in terms of 1/V and identify the result with the virial expansion (5.3.6) and (5.3.14):

$$P = \frac{kTN}{V}\left[1 + \frac{V_0}{V} - \frac{aN}{kTV} + \ldots\right] \equiv \frac{kTN}{V}\left[1 + \left(b - \frac{a}{kT}\right)\frac{N}{V} + \ldots\right] .$$

From this, we obtain

$$V_0 = Nb , \tag{5.4.11}$$

where b is the contribution to the second virial coefficient which results from the repulsive part of the potential. Inserting in (5.4.9), we find

$$P = \frac{kT}{v - b} - \frac{a}{v^2} , \tag{5.4.12}$$

where on the right-hand side, the specific volume $v = V/N$ was introduced. Equation (5.4.12) or equivalently (5.4.9) is the van der Waals equation of state for real gases,⁷ and (5.4.10) is the associated caloric equation of state.

Remarks:
(i) The van der Waals equation (5.4.12) has, in comparison to the ideal gas equation $P = kT/v$, the following properties: the volume v is replaced by $v - b$, the free volume. For $v = b$, the pressure would become infinite. This modification with respect to the ideal gas is caused by the repulsive part of the potential.
(ii) The attractive interaction causes a reduction in the pressure via the term $-a/v^2$. This reduction becomes relatively more important as the temperature is lowered.
(iii) We make another comparison of the van der Waals equation to the ideal gas equation by writing (5.4.12) in the form

$$\left(P + \frac{a}{v^2}\right)(v - b) = kT .$$

Compared to $Pv = kT$, the specific volume v has been decreased by b, because the molecules are not pointlike, but instead occupy their own finite volumes. The mutual attraction of the molecules leads at a given pressure to a reduction of the volume; it thus acts like an additional pressure term. One can also readily understand the proportionality of this term to $1/v^2$. If one considers the surface layer of a liquid, it experiences a kind of attractive force from the deeper-lying layers, which must be proportional to the square of the density, since if the density were increased, the number of molecules in each layer would increase in proportion to the density, and the attractive force per unit area would thus increase proportionally to $1/v^2$.

⁷ Johannes Diderik van der Waals, 1837-1923: equation of state formulated 1873, Nobel prize 1910


The combined action of the two terms in the van der Waals equation results in qualitatively diﬀerent shapes for the isotherms at low (T1 , T2 ) and at high (T3 , T4 ) temperatures. The family of van der Waals isotherms is shown in Fig. 5.11. For T > Tc , the isotherms are monotonic, while for T < Tc , they are S-shaped; the signiﬁcance of this will be discussed below.

Fig. 5.11. The van der Waals isotherms in dimensionless units P/Pc and v/vc

We see immediately that on the so-called critical isotherm, there is a critical point, at which the first and second derivatives vanish, i.e. a horizontal point of inflection. The critical point $T_c, P_c, V_c$ thus follows from $\frac{\partial P}{\partial V} = \frac{\partial^2 P}{\partial V^2} = 0$. This leads to the two conditions

$$-\frac{kT}{(v - b)^2} + \frac{2a}{v^3} = 0 , \qquad \frac{kT}{(v - b)^3} - \frac{3a}{v^4} = 0 ,$$

from which the values

$$v_c = 3b , \qquad kT_c = \frac{8}{27}\,\frac{a}{b} , \qquad P_c = \frac{a}{27b^2} \tag{5.4.13}$$

are obtained. The dimensionless ratio

$$\frac{kT_c}{P_c v_c} = \frac{8}{3} = 2.\dot{6} \tag{5.4.14}$$

follows from this. The experimental value is found to be somewhat larger. Note: It is apparent even from the derivation that the van der Waals equation can have only approximate validity. This is true of both the reduction of the repulsion eﬀects to an eﬀective molecular volume b, and of the replacement of the attractive (negative) part of the potential by its average value. The latter approximation improves as the range of the interactions increases. In the derivation, correlation eﬀects were neglected, which is questionable especially in the neighborhood of the critical point, where strong density ﬂuctuations will occur (see below). However, the van der Waals equation, in part with empirically modiﬁed van der Waals constants a and b, is able to give a qualitative description of condensation and of the behavior in the neighborhood of the critical point. There are numerous variations on the van der Waals equation; e.g. Clausius suggested the equation


Fig. 5.12. Isotherms for carbonic acid obtained from Clausius’ equation of state. From M. Planck, Thermodynamik, Veit & Comp, Leipzig, 1897, page 14

$$P = \frac{kT}{v - a} - \frac{c}{T\,(v + b)^2} .$$

The plot of its isotherms shown in Fig. 5.12 is similar to that obtained from the van der Waals theory.
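The critical values (5.4.13) and the ratio (5.4.14) of the van der Waals equation can be verified with a few lines of code. The sketch below sets a = b = 1 in arbitrary units (an assumption made only for the check) and confirms by finite differences that both derivatives of P with respect to v vanish at the critical point.

```python
def pressure(v, kT, a=1.0, b=1.0):
    """Van der Waals equation (5.4.12): P = kT/(v - b) - a/v^2."""
    return kT / (v - b) - a / v ** 2

a, b = 1.0, 1.0  # arbitrary units (assumption for this check)

# Critical values according to (5.4.13)
v_c = 3.0 * b
kT_c = 8.0 * a / (27.0 * b)
P_c = a / (27.0 * b ** 2)

# Central finite differences for the first and second derivative at v_c
h = 1e-3
dP = (pressure(v_c + h, kT_c) - pressure(v_c - h, kT_c)) / (2.0 * h)
d2P = (pressure(v_c + h, kT_c) - 2.0 * pressure(v_c, kT_c)
       + pressure(v_c - h, kT_c)) / h ** 2

ratio = kT_c / (P_c * v_c)  # should equal 8/3, Eq. (5.4.14)

if __name__ == "__main__":
    print("dP/dv =", dP, " d2P/dv2 =", d2P, " kTc/(Pc vc) =", ratio)
```

Both finite-difference derivatives come out at the level of the discretization error, and the ratio reproduces 8/3.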

5.4.2 The Maxwell Construction

At temperatures below $T_c$, the van der Waals isotherms have a typical S-shape (Fig. 5.12). The regions in which $(\partial P/\partial V)_T > 0$, i.e. the free energy is not convex and therefore the stability criterion (3.6.48b) is not obeyed, are particularly disturbing. The equation of state definitely requires modification in these regions. We now wish to consider the free energy within the van der Waals theory. As we shall see, an inhomogeneous state containing liquid and gaseous phases has a lower free energy. In Fig. 5.13, a van der Waals isotherm and below it the associated free energy $f(T,v) = F(T,V)/N$ are plotted. Although the lower figure can be directly read off from Eq. (5.4.8), it is instructive and useful for further discussion to determine the typical shape of the specific free energy from the isotherms $P(T,v)$ by integration of $P = -\left(\frac{\partial f}{\partial v}\right)_T$ over volume:

$$f(T,v) = f(T,v_a) - \int_{v_a}^{v} dv'\, P(T,v') . \tag{5.4.15}$$


Fig. 5.13. A van der Waals isotherm and the corresponding free energy in the dimensionless units P/Pc , v/vc and f /kTc . The free energy of the heterogeneous state (dashed) is lower than the van der Waals free energy (solid curve)

The integration is carried out from an arbitrary initial value $v_a$ of the specific volume up to v. We now draw in a horizontal line intersecting the van der Waals isotherm in such a way that the two shaded areas are equal. The pressure which corresponds to this line is denoted by $P_0$. This construction yields the two volume values $v_1$ and $v_2$. The values of the free energy at the volumes $v_{1,2}$ will be denoted by $f_{1,2} = f(T,v_{1,2})$. At the volumes $v_1$ and $v_2$, the pressure assumes the value $P_0$ and therefore the slope of $f(T,v)$ at these points has the value $-P_0$. As a reference for the graphical determination of the free energy, we draw a straight line through $(v_1, f_1)$ with its slope equal to $-P_0$ (shown as a dashed line). If the pressure had the value $P_0$ throughout the whole interval between $v_1$ and $v_2$, then the free energy would be $f_1 - P_0(v - v_1)$. We can now readily see that the free energy which is shown in Fig. 5.13 follows from $P(T,v)$, since the van der Waals isotherm to the right of $v_1$ initially falls below the horizontal line $P = P_0$. Thus the negative integral, i.e. the free energy which corresponds to the van der Waals isotherm, lies above the dashed line. Only when the volume $v_2$ has been reached is $f_2 \equiv f(T,v_2) = f_1 - P_0(v_2 - v_1)$, owing to the equal areas which were presupposed in drawing the horizontal line, and the two curves meet again. Due to $P_0 = -\left.\frac{\partial f}{\partial v}\right|_{v_1} = -\left.\frac{\partial f}{\partial v}\right|_{v_2}$, the (dashed) line with slope $-P_0$ is precisely the double tangent to the curve $f(T,v)$. Since $P > P_0$ for $v < v_1$


and $P < P_0$ for $v > v_2$, f in these regions also lies above the double tangent. In Fig. 5.13 we can see that the free energy calculated in the van der Waals theory is not convex everywhere, $0 > \frac{\partial^2 f}{\partial v^2} = -\frac{\partial P}{\partial v} = \frac{1}{v\,\kappa_T}$; this violates the thermodynamic inequality (3.3.5).

For comparison, we next consider a two-phase, heterogeneous system, whose entire material content is divided into a fraction $c_1 = \frac{v_2 - v}{v_2 - v_1}$ in the state $(v_1, T)$ and a fraction $c_2 = \frac{v - v_1}{v_2 - v_1}$ in the state $(v_2, T)$. These states have the same pressure and temperature and can exist in mutual equilibrium. Since the free energy of this inhomogeneous state is given by the linear combination $c_1 f_1 + c_2 f_2$ of $f_1$ and $f_2$, it lies on the dashed line.⁸ Thus, the free energy of this inhomogeneous state is lower than that from the van der Waals theory. In the interval $[v_1, v_2]$ (two-phase region), the substance divides into two phases, the liquid phase with temperature and volume $(T, v_1)$, and the gas phase with $(T, v_2)$. The pressure in this interval is $P_0$. The real isotherm is obtained from the van der Waals isotherm by replacing the S-shaped portion by the horizontal line at $P = P_0$, which divides the area equally. Outside the interval $[v_1, v_2]$, the van der Waals isotherm is unchanged. This construction of the equation of state from the van der Waals theory is called the Maxwell construction. The values of $v_1$ and $v_2$ depend on the temperature of the isotherm considered, i.e. $v_1 = v_1(T)$ and $v_2 = v_2(T)$. As T approaches $T_c$, the interval $[v_1(T), v_2(T)]$ becomes smaller and smaller; as the temperature decreases below $T_c$, the interval becomes larger. Correspondingly, the pressure $P_0(T)$ increases or decreases. In Fig. 5.14, the Maxwell construction for a family of van der Waals isotherms is shown. The points $(P_0(T), v_1(T))$ and

Fig. 5.14. Van der Waals isotherms, showing the Maxwell construction and the resulting coexistence curve (heavy curve) in the dimensionless units $P/P_c$ and $v/v_c$, as well as the free energy f

⁸ $c_1 + c_2 = 1$, $v_1 c_1 + v_2 c_2 = v$, $c_1 f_1 + c_2 f_2 = c_1 f_1 + c_2\left(f_1 - P_0(v_2 - v_1)\right) = f_1 - P_0(v - v_1)$.


(P0 (T ), v2 (T )) form the liquid branch and the gas branch of the coexistence curve (heavy curves in Fig. 5.14). The region within the coexistence curve is called the coexistence region or two-phase region. In this region the isotherms are horizontal, the state is heterogeneous, and it consists of both the liquid and gaseous phases from the two limiting points of the coexistence region. Remarks: (i) In Fig. 5.15, the P V T -surface which follows from the Maxwell construction is shown schematically. The van der Waals equation of state and the conclusions which can be drawn from it are in accord with the general considerations concerning the liquid-gas phase transition in the framework of thermodynamics which we gave in Sect. 3.8.1.

Fig. 5.15. The surface of the equation of state from the van der Waals theory with the Maxwell equal-area construction (schematic). Along with three isotherms at temperatures T1 < Tc < T2 , the coexistence curve (surface) and its projection on the T -V plane are shown

(ii) The chemical potentials $\mu = f + Pv$ of the two coexisting liquid and gaseous phases are equal.
(iii) Kac, Uhlenbeck and Hemmer⁹ calculated the partition function exactly for a one-dimensional model with an infinite-range potential

$$v(x) = \begin{cases}\infty & |x| < x_0 \\ -\kappa\,e^{-\kappa|x|} & |x| > x_0\end{cases} \qquad\text{and}\qquad \kappa \to 0 .$$

The result is an equation of state which is qualitatively the same as in the van der Waals theory. In the coexistence region, instead of the S-shaped curve, horizontal isotherms are found immediately.
(iv) A derivation of the van der Waals equation for long-range potentials akin to L. S. Ornstein's, in which the volume is divided up into cells and the most probable occupation number in each cell is calculated, was given by van Kampen¹⁰. The homogeneous and heterogeneous stable states were found. Within the coexistence region, the heterogeneous states, which are described by the horizontal line in the Maxwell construction, are absolutely stable. The two homogeneous states, represented by the S-shaped van der Waals isotherms, are metastable, as long as $\frac{\partial P}{\partial v} < 0$, and describe the superheated liquid and the supercooled vapor.

⁹ M. Kac, G. E. Uhlenbeck and P. C. Hemmer, J. Math. Phys. 4, 216 (1963)
¹⁰ N. G. van Kampen, Phys. Rev. 135, A362 (1964)
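The equal-area rule of the Maxwell construction lends itself to a direct numerical treatment. The following sketch uses the van der Waals isotherm in reduced units, $P^* = 8T^*/(3V^* - 1) - 3/V^{*2}$ (the form given in Sect. 5.4.3), and determines $P_0$, $v_1$ and $v_2$ by nested bisection. The helper names and the bracketing intervals are ad hoc choices; the brackets are intended only for reduced temperatures between roughly 0.85 and 1, where the local pressure minimum of the isotherm is still positive.

```python
import math

def p_red(v, t):
    """Reduced van der Waals isotherm P*(V*, T*)."""
    return 8.0 * t / (3.0 * v - 1.0) - 3.0 / (v * v)

def dp_dv(v, t):
    """Derivative of the reduced pressure with respect to the reduced volume."""
    return -24.0 * t / (3.0 * v - 1.0) ** 2 + 6.0 / v ** 3

def bisect_root(f, lo, hi, iters=100):
    """Plain bisection; assumes f changes sign between lo and hi."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def area_mismatch(p0, t, v_a, v_b):
    """Integral of (P - p0) between the liquid and gas volumes belonging to p0."""
    v1 = bisect_root(lambda v: p_red(v, t) - p0, 1.0 / 3.0 + 1e-9, v_a)
    v2 = bisect_root(lambda v: p_red(v, t) - p0, v_b, 1e4)
    integral = ((8.0 * t / 3.0) * math.log((3.0 * v2 - 1.0) / (3.0 * v1 - 1.0))
                + 3.0 / v2 - 3.0 / v1)
    return integral - p0 * (v2 - v1), v1, v2

def maxwell(t):
    """Return (P0, v1, v2) in reduced units for a reduced temperature t < 1."""
    v_a = bisect_root(lambda v: dp_dv(v, t), 1.0 / 3.0 + 1e-6, 1.0)  # local minimum
    v_b = bisect_root(lambda v: dp_dv(v, t), 1.0, 1e3)               # local maximum
    p_lo = max(p_red(v_a, t), 1e-9)
    p_hi = p_red(v_b, t)
    p0 = bisect_root(lambda p: area_mismatch(p, t, v_a, v_b)[0], p_lo, p_hi)
    _, v1, v2 = area_mismatch(p0, t, v_a, v_b)
    return p0, v1, v2

if __name__ == "__main__":
    for t in (0.90, 0.95, 0.99):
        print(t, maxwell(t))
```

As the reduced temperature approaches 1, the interval between the two volumes shrinks and the pressure $P_0$ rises towards the critical pressure, which is the behavior described above for the family of isotherms in Fig. 5.14.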


5.4.3 The Law of Corresponding States

If one divides the van der Waals equation by $P_c = \frac{a}{27b^2}$ and uses the reduced variables $P^* = \frac{P}{P_c}$, $V^* = \frac{v}{v_c}$, $T^* = \frac{T}{T_c}$, then a dimensionless form of the equation is obtained:

$$P^* = \frac{8T^*}{3V^* - 1} - \frac{3}{V^{*2}} . \tag{5.4.16}$$

In these units, the equation of state is the same for all substances. Substances with the same $P^*$, $V^*$ and thus $T^*$ are in corresponding states. Eq. (5.4.16) is called the "law of corresponding states"; it can also be cast in the form

$$\frac{P^*V^*}{T^*} = \frac{8}{3 - \dfrac{P^*}{T^*}\cdot\dfrac{T^*}{P^*V^*}} - \frac{3P^*}{T^{*2}}\,\frac{T^*}{P^*V^*} .$$

This means that P ∗ V ∗ /T ∗ as a function of P ∗ yields a family of curves with the parameter T ∗ . All the data from a variety of liquids at ﬁxed T ∗ lie on a single curve (Fig. 5.16). This holds even beyond the validity range of the van der Waals equation. Experiments show that liquids behave similarly when P, V and T are measured in units of Pc , Vc and Tc . This is illustrated for a series of diﬀerent substances in Fig. 5.16.

Fig. 5.16. The law of corresponding states.¹¹

¹¹ G. J. Su, Ind. Engng. Chem. analyt. Edn. 38, 803 (1946)

5.4.4 The Vicinity of the Critical Point

We now want to discuss the van der Waals equation in the vicinity of its critical point. To do this, we write the results in a form which makes the


analogy to other phase transitions transparent. The usefulness of this form will become completely clear in connection with the treatment of ferromagnets in the next chapter. The equation of state in the neighborhood of the critical point can be obtained by introducing the variables

$$\Delta P = P - P_c , \qquad \Delta v = v - v_c , \qquad \Delta T = T - T_c \tag{5.4.17}$$

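and expanding the van der Waals equation (5.4.12) in terms of $\Delta v$ and $\Delta T$:

$$P = \frac{k(T_c + \Delta T)}{2b + \Delta v} - \frac{a}{(3b + \Delta v)^2}$$
$$\;\; = \frac{k(T_c + \Delta T)}{2b}\left(1 - \frac{\Delta v}{2b} + \Big(\frac{\Delta v}{2b}\Big)^2 - \Big(\frac{\Delta v}{2b}\Big)^3 + \Big(\frac{\Delta v}{2b}\Big)^4 \mp \ldots\right) - \frac{a}{9b^2}\left(1 - 2\,\frac{\Delta v}{3b} + 3\Big(\frac{\Delta v}{3b}\Big)^2 - 4\Big(\frac{\Delta v}{3b}\Big)^3 + 5\Big(\frac{\Delta v}{3b}\Big)^4 \mp \ldots\right) .$$

From this expansion, we find the equation of state in the immediate neighborhood of its critical point¹²

$$\Delta P^* = 4\,\Delta T^* - 6\,\Delta T^*\Delta v^* - \frac{3}{2}(\Delta v^*)^3 + \ldots ; \tag{5.4.18}$$

it is in this approximation antisymmetric with respect to $\Delta v^*$, see Fig. 5.17.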
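How quickly the truncated expansion (5.4.18) approaches the full equation of state can be checked numerically. The sketch below compares it with the reduced van der Waals equation $P^* = 8T^*/(3V^* - 1) - 3/V^{*2}$, using $\Delta T^* = T^* - 1$ and $\Delta v^* = V^* - 1$; the sample points are arbitrary illustrative choices.

```python
def p_red(v, t):
    """Reduced van der Waals equation P*(V*, T*)."""
    return 8.0 * t / (3.0 * v - 1.0) - 3.0 / (v * v)

def dp_exact(dt, dv):
    """Exact deviation of the reduced pressure from its critical value 1."""
    return p_red(1.0 + dv, 1.0 + dt) - 1.0

def dp_expansion(dt, dv):
    """Leading terms of the expansion, Eq. (5.4.18)."""
    return 4.0 * dt - 6.0 * dt * dv - 1.5 * dv ** 3

if __name__ == "__main__":
    for dt, dv in ((-1e-2, 1e-1), (-1e-4, 1e-2), (-1e-4, -1e-2)):
        exact = dp_exact(dt, dv)
        approx = dp_expansion(dt, dv)
        print(dt, dv, exact, approx, exact - approx)
```

The residual shrinks much faster than the retained terms as the critical point is approached, as expected when the first neglected contributions are of the type $\Delta T^*(\Delta v^*)^2$ and $(\Delta v^*)^4$.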

Fig. 5.17. The coexistence curve in the vicinity of the critical point. Due to the term $4\,\Delta T^*$ in the equation of state (5.4.18), the coexistence region is inclined with respect to the V-T plane. The isotherm shown is already so far from the critical point that it is no longer strictly antisymmetric

¹² The term $\Delta T(\Delta v)^2$ and especially higher-order terms can be neglected in the leading calculation of the coexistence curve, since it is effectively of order $(\Delta T)^2$ in comparison to $\sim(\Delta T)^{3/2}$ for the terms which were taken into account. The corrections to the leading critical behavior will be summarized at the end of this section. In Eq. (5.4.18), for clarity we use the reduced variables defined just before Eq. (5.4.16): $\Delta P^* = \Delta P/P_c$ etc.

5.4 The Van der Waals Equation of State

253

The Vapor-Pressure Curve: We obtain the vapor-pressure curve by projecting the coexistence region onto the $P$-$T$ plane. Owing to the antisymmetry of the van der Waals isotherms with respect to $\Delta v^*$ in the neighborhood of $T_c$ (Eq. 5.4.18), we can easily determine the location of the two-phase region by setting $\Delta v^* = 0$ (cf. Fig. 5.17):

\[ \Delta P^* = 4\,\Delta T^* . \tag{5.4.19} \]

The Coexistence Curve: The coexistence curve is the projection of the coexistence region onto the $V$-$T$ plane. Inserting (5.4.19) into (5.4.18), we obtain the equation $0 = 6\,\Delta T^* \Delta v^* + \tfrac{3}{2}(\Delta v^*)^3$ with the solutions

\[ \Delta v_G^* = -\Delta v_L^* = 2\sqrt{-\Delta T^*} + O(\Delta T^*) \tag{5.4.20} \]

for $T < T_c$. For $T < T_c$, the substance can no longer occur with a single density, but instead splits up into a less dense gaseous phase and a denser liquid phase (cf. Sect. 3.8). $\Delta v_G^*$ and $\Delta v_L^*$ represent the two values of the order parameter for this phase transition (see Chap. 7).

The Specific Heat: $T > T_c$: From Eq. (5.4.10), the internal energy is found to be $E = \tfrac{3}{2} N k T - a N^2/V$. Therefore, the specific heat at constant volume outside the coexistence region is

\[ C_V = \frac{3}{2}\, N k , \tag{5.4.21a} \]

as for an ideal gas. We now imagine that we can cool a system with precisely the critical density. Above $T_c$ it has the homogeneous density $1/v_c$, while below $T_c$ it divides into a gaseous phase and a liquid phase with the two fractions (as in (5.4.20))

\[ c_G = \frac{v_c - v_L}{v_G - v_L} \qquad \text{and} \qquad c_L = \frac{v_G - v_c}{v_G - v_L} . \]

$T < T_c$: below $T_c$, the internal energy is given by

\[ \frac{E}{N} = \frac{3}{2} kT - a\left(\frac{c_G}{v_G} + \frac{c_L}{v_L}\right) = \frac{3}{2} kT - a\,\frac{v_c + \Delta v_G + \Delta v_L}{(v_c + \Delta v_G)(v_c + \Delta v_L)} . \tag{5.4.21b} \]

If we insert (5.4.20), or, anticipating later results, (5.4.29),¹³ we obtain

\[ E = N\left[\frac{3}{2} kT - \frac{a}{v_c} + \frac{9}{2}\,k(T - T_c) + \frac{56}{25}\,\frac{a}{v_c}\left(\frac{T - T_c}{T_c}\right)^2 + O\big((\Delta T)^{5/2}\big)\right] . \]

¹³ With (5.4.20), one finds only the jump in the specific heat; in order to determine the linear term in (5.4.21b) as well, one must continue the expansion of $\Delta v_G$ and $\Delta v_L$, Eq. (5.4.27). Including these higher terms, the coexistence curve is not symmetric.

Fig. 5.18. The speciﬁc heat in the neighborhood of the critical point of the van der Waals liquid

The specific heat

\[ C_V = \frac{3}{2} N k + \frac{9}{2} N k \left[1 + \frac{28}{25}\,\frac{T - T_c}{T_c} + \dots\right] \qquad \text{for } T < T_c \tag{5.4.21c} \]

exhibits a discontinuity (see Fig. 5.18).

The Critical Isotherm: In order to determine the critical isotherm, we set $\Delta T^* = 0$ in (5.4.18). The critical isotherm

\[ \Delta P^* = -\frac{3}{2}\,(\Delta v^*)^3 \tag{5.4.22} \]

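is a parabola of third order; it passes through the critical point horizontally, which implies divergence of the isothermal compressibility.

The Compressibility: To calculate the isothermal compressibility $\kappa_T = -\frac{1}{V}\left(\frac{\partial V}{\partial P}\right)_T$, we determine

\[ N\left(\frac{\partial P^*}{\partial V^*}\right)_T = -6\,\Delta T^* - \frac{9}{2}\,(\Delta v^*)^2 \tag{5.4.23} \]

from the van der Waals equation (5.4.18). For $T > T_c$, we find along the critical isochore ($\Delta v^* = 0$)

\[ \kappa_T = \frac{1}{6 P_c}\,\frac{1}{\Delta T^*} = \frac{1}{6 P_c}\,\frac{T_c}{\Delta T} . \tag{5.4.24a} \]

For $T < T_c$, along the coexistence curve (i.e. $\Delta v^* = \Delta v_G^* = -\Delta v_L^*$), Eq. (5.4.20) gives $(\Delta v_G^*)^2 = -4\,\Delta T^*$, so that we obtain the result $N\left(\frac{\partial P^*}{\partial V^*}\right)_T = -6\,\Delta T^* - \frac{9}{2}(\Delta v_G^*)^2 = 12\,\Delta T^*$, that is

\[ \kappa_T = \frac{1}{12 P_c}\,\frac{T_c}{(-\Delta T)} . \tag{5.4.24b} \]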
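The amplitude ratio of 2 between (5.4.24a) and (5.4.24b) can be checked against the exact slope of the isotherm. This is a sketch under the assumption that (5.4.16) is the standard reduced van der Waals equation; it uses only the leading-order coexistence volumes from (5.4.20), so a deviation of order $\sqrt{|\Delta T^*|}$ from the ideal ratio is expected.

```python
from fractions import Fraction


def dp_dv_reduced(t_star, v_star):
    """Exact slope (dP*/dv*)_T of the reduced van der Waals isotherm (assumed form)."""
    return -24 * t_star / (3 * v_star - 1) ** 2 + 6 / v_star**3


sqrt_tau = Fraction(1, 100)   # chosen so that tau = 1e-4 has an exact rational root
tau = sqrt_tau**2             # tau = |dT*|

# Above T_c on the critical isochore (dv* = 0): the slope is exactly -6 dT*.
slope_above = dp_dv_reduced(1 + tau, Fraction(1))

# Below T_c on the leading-order coexistence curve dv* = +/- 2 sqrt(tau).
slope_gas = dp_dv_reduced(1 - tau, 1 + 2 * sqrt_tau)
slope_liquid = dp_dv_reduced(1 - tau, 1 - 2 * sqrt_tau)
slope_below = (slope_gas + slope_liquid) / 2

# kappa_T is proportional to -1/slope, so kappa_above / kappa_below = slope_below / slope_above.
ratio = float(slope_below / slope_above)
print(float(slope_above), float(slope_below), ratio)
```

Averaging the gas and liquid branches removes the odd corrections in $\sqrt{\tau}$, which is why the mean slope lies close to $12\,\Delta T^*$ even though each branch separately deviates by roughly ten percent at this temperature.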

The isothermal compressibility diverges in the van der Waals theory above and below the critical temperature as $(T - T_c)^{-1}$. The accompanying long-range density fluctuations lead to an increase in light scattering in the forward direction (critical opalescence; see (9.4.51)).

Summary: Comparison with experiments shows that liquids in the neighborhood of their critical points exhibit singular behavior, similar to the results described above. The coexistence line obeys a power law; however, the exponent is not 1/2, but instead $\beta \approx 0.326$; the specific heat is in fact divergent, and is characterized by a critical exponent $\alpha$. The critical isotherm obeys $\Delta P \sim \Delta v^{\delta}$ and the isothermal compressibility is $\kappa_T \sim |T - T_c|^{-\gamma}$. Table 5.2 contains a summary of the results of the van der Waals theory and the power laws which are in general observed in Nature. The exponents $\beta$, $\alpha$, $\delta$, and $\gamma$ are called critical exponents. The specific heat shows a discontinuity according to the van der Waals theory, as shown in Fig. 5.18. It is thus of the order of $(T - T_c)^0$ just to the left and to the right of the transition. The index d of the exponent 0 in Table 5.2 refers to this discontinuity. Compare Eq. (7.1.1).

Table 5.2. Critical Behavior according to the van der Waals Theory

Physical quantity            van der Waals                Critical behavior          Temperature range
$\Delta v_G = -\Delta v_L$   $\sim (T_c - T)^{1/2}$       $(T_c - T)^{\beta}$        $T < T_c$
$c_V$                        $\sim (T - T_c)^{0_d}$       $|T_c - T|^{-\alpha}$      $T \gtrless T_c$
$\Delta P$                   $\sim (\Delta v)^3$          $(\Delta v)^{\delta}$      $T = T_c$
$\kappa_T$                   $\sim |T - T_c|^{-1}$        $|T - T_c|^{-\gamma}$      $T \gtrless T_c$

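The Latent Heat: Finally, we will determine the latent heat just below the critical temperature. The latent heat can be written using the Clausius–Clapeyron equation (3.8.8) in the form:

\[ q = T (s_G - s_L) = T\,\frac{\partial P_0}{\partial T}\,(v_G - v_L) = T\,\frac{\partial P_0}{\partial T}\,(\Delta v_G - \Delta v_L) . \]

Here, $s_G$ and $s_L$ refer to the entropies per particle of the gas and liquid phases and $\frac{\partial P_0}{\partial T}$ is the slope of the vaporization curve at the corresponding point. In the vicinity of the critical point, to leading order we can set $T\,\frac{\partial P_0}{\partial T} \approx T_c \left(\frac{\partial P_0}{\partial T}\right)_{\mathrm{c.p.}}$, where $(\partial P_0/\partial T)_{\mathrm{c.p.}}$ is the slope of the evaporation curve at the critical point. With $\Delta v_L = -\Delta v_G$ from (5.4.20), this gives

\[ q = 2 T_c \left(\frac{\partial P_0}{\partial T}\right)_{\mathrm{c.p.}} \Delta v_G . \tag{5.4.25} \]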

The slope of the vapor-pressure curve at $T_c$ is finite (cf. Fig. 5.17 and Eq. (5.4.19)). Thus the latent heat decreases on approaching $T_c$ according to the same power law as the order parameter, i.e. $q \propto (T_c - T)^{\beta}$; in the van der Waals theory, $\beta = \frac{1}{2}$.

By means of the thermodynamic relation (3.2.24), $C_P - C_V = -T \left(\frac{\partial P}{\partial T}\right)_V^2 \Big/ \left(\frac{\partial P}{\partial V}\right)_T$, we can also determine the critical behavior of the specific heat at constant pressure. Since $\left(\frac{\partial P}{\partial T}\right)_V$ is finite, the right-hand side behaves like the isothermal compressibility $\kappa_T$, and because $C_V$ is only discontinuous or at most weakly singular, it follows in general that

\[ C_P \sim \kappa_T \propto (T - T_c)^{-\gamma} ; \tag{5.4.26} \]

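for a van der Waals liquid, $\gamma = 1$.

∗ Higher-Order Corrections to Eq. (5.4.18)

For clarity, we use the reduced quantities defined in (5.4.16). Then the van der Waals equation becomes

\[
\begin{aligned}
\Delta P^* ={}& 4\,\Delta T^* - 6\,\Delta T^* \Delta v^* + 9\,\Delta T^* (\Delta v^*)^2 - \Big(\frac{3}{2} + \frac{27}{2}\,\Delta T^*\Big)(\Delta v^*)^3 \\
&+ \Big(\frac{21}{4} + \frac{81}{4}\,\Delta T^*\Big)(\Delta v^*)^4 - \Big(\frac{99}{8} + \frac{243}{8}\,\Delta T^*\Big)(\Delta v^*)^5 + O\big((\Delta v^*)^6\big) .
\end{aligned}
\tag{5.4.27}
\]

The coexistence curve $\Delta v^*_{G/L}$ and the vapor-pressure curve, which we denote here by $\Delta P_0^*(\Delta T^*)$, are found from the van der Waals equation:

\[ \Delta P^*\big(\Delta T^*, \Delta v_G^*\big) = \Delta P^*\big(\Delta T^*, \Delta v_L^*\big) = \Delta P_0^*(\Delta T^*) \]

with the Maxwell construction

\[ \int_{\Delta v_L^*}^{\Delta v_G^*} d(\Delta v^*)\,\big(\Delta P^* - \Delta P_0^*(\Delta T^*)\big) = 0 . \]

For the vapor-pressure curve in the van der Waals theory, we obtain

\[ \Delta P_0^* = 4\,\Delta T^* + \frac{24}{5}\,(-\Delta T^*)^2 + O\big((-\Delta T^*)^{5/2}\big) , \tag{5.4.28} \]

and for the coexistence curve:

\[
\begin{aligned}
\Delta v_G^* &= 2\sqrt{-\Delta T^*} + \frac{18}{5}\,(-\Delta T^*) + X\,(-\Delta T^*)^{3/2} + O\big((\Delta T^*)^2\big) , \\
\Delta v_L^* &= -2\sqrt{-\Delta T^*} + \frac{18}{5}\,(-\Delta T^*) + Y\,(-\Delta T^*)^{3/2} + O\big((\Delta T^*)^2\big)
\end{aligned}
\tag{5.4.29}
\]

(with $X - Y = \frac{294}{25}$, see problem 5.6). In contrast to the ferromagnetic phase transition, the order parameter is not exactly symmetric; instead, it is symmetric only near $T_c$, compare Eq. (5.4.20).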
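The effect of the correction terms in (5.4.28) and (5.4.29) can be made concrete numerically. The sketch below assumes the standard reduced van der Waals equation for (5.4.16) and uses only the terms whose coefficients are given explicitly in the text (the $\frac{18}{5}$ shift of the volumes and the $\frac{24}{5}$ term of the vapor pressure); the helper names are illustrative.

```python
import math


def vdw_reduced_pressure(t_star, v_star):
    """Reduced van der Waals pressure (assumed standard form of Eq. (5.4.16))."""
    return 8.0 * t_star / (3.0 * v_star - 1.0) - 3.0 / v_star**2


def coexistence_estimates(tau, with_corrections):
    """Return (v_G*, v_L*, P_0*) at T* = 1 - tau from the truncated series."""
    root = 2.0 * math.sqrt(tau)
    shift = 18.0 / 5.0 * tau if with_corrections else 0.0
    p0 = 1.0 - 4.0 * tau + (24.0 / 5.0 * tau**2 if with_corrections else 0.0)
    return 1.0 + root + shift, 1.0 - root + shift, p0


def pressure_mismatch(tau, with_corrections):
    """Largest deviation of the exact pressure at v_G*, v_L* from the estimated P_0*."""
    v_gas, v_liquid, p0 = coexistence_estimates(tau, with_corrections)
    t_star = 1.0 - tau
    return max(abs(vdw_reduced_pressure(t_star, v_gas) - p0),
               abs(vdw_reduced_pressure(t_star, v_liquid) - p0))


tau = 0.01
leading = pressure_mismatch(tau, with_corrections=False)
corrected = pressure_mismatch(tau, with_corrections=True)
print(leading, corrected)
```

At one percent below $T_c$ the corrected estimates satisfy the equal-pressure condition markedly better than the symmetric leading-order result, which illustrates why the coexistence curve is no longer symmetric about the critical volume.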

The internal energy is:

\[ \frac{E}{N} = \frac{3}{2} kT - \frac{a}{v_c}\left(1 - 4\,\Delta T^* - \frac{56}{25}\,(\Delta T^*)^2 + O\big(|\Delta T^*|^{5/2}\big)\right) \tag{5.4.30} \]

and the heat capacity is:

\[ C_V = \frac{3}{2} N k + \frac{9}{2} N k \left(1 - \frac{28}{25}\,|\Delta T^*| + O\big(|\Delta T^*|^{3/2}\big)\right) . \tag{5.4.31} \]

For the calculation of the specific heat, only the difference $X - Y = 294/25$ enters. The vapor-pressure curve is no longer linear in $\Delta T^*$, and the coexistence curve is no longer symmetric with respect to the critical volume.

5.5 Dilute Solutions

5.5.1 The Partition Function and the Chemical Potentials

We consider a solution where the solvent consists of $N$ particles and the solute of $N'$ atoms (molecules), so that the concentration is given by

\[ c = \frac{N'}{N} \ll 1 . \]

We shall calculate the properties of such a solution by employing the grand partition function¹⁴

\[ Z_G(T, V, \mu, \mu') = \sum_{n'=0}^{\infty} Z_{n'}(T, V, \mu)\, z'^{\,n'} = Z_0(T, V, \mu) + z' Z_1(T, V, \mu) + O\big(z'^2\big) . \tag{5.5.1} \]

It depends upon the chemical potentials of the solvent, $\mu$, and of the solute, $\mu'$. Since the solute is present only at a very low concentration, we have $\mu' \ll 0$ and therefore the fugacity $z' = e^{\mu'/kT} \ll 1$. In (5.5.1), $Z_0(T, V, \mu)$ means the grand partition function of the pure solvent and $Z_1(T, V, \mu)$ that of the solvent and a dissolved molecule. From these expressions we find for the total pressure

\[ -P = \frac{\Phi}{V} = -\frac{kT}{V} \log Z_G = \varphi_0(T, \mu) + z' \varphi_1(T, \mu) + O\big(z'^2\big) , \tag{5.5.2} \]

where $\varphi_0 = -\frac{kT}{V} \log Z_0$ and $\varphi_1 = -\frac{kT}{V} \frac{Z_1}{Z_0}$. In (5.5.2), $\varphi_0(T, \mu)$ is the contribution of the pure solvent and the second term is the correction due to the dissolved solute. Here, $Z_1$ and therefore $\varphi_1$ depend on the interactions of the dissolved molecules with the solvent, but not, however, on the mutual interactions of the dissolved molecules. We shall now express the chemical potential $\mu$ in terms of the pressure. To this end, we use the inverse function $\varphi_0^{-1}$ at fixed $T$, i.e. $\varphi_0^{-1}(T, \varphi_0(T, \mu)) = \mu$, obtaining

\[ \mu = \varphi_0^{-1}\big(T, -P - z' \varphi_1(T, \mu)\big) = \varphi_0^{-1}(T, -P) - z'\,\frac{\varphi_1\big(T, \varphi_0^{-1}(T, -P)\big)}{\left.\dfrac{\partial \varphi_0}{\partial \mu}\right|_{\mu = \varphi_0^{-1}(T, -P)}} + O\big(z'^2\big) . \tag{5.5.3} \]

¹⁴ Here, $Z_{n'}(T, V, \mu) = \sum_{n=0}^{\infty} \operatorname{Tr}_n \operatorname{Tr}_{n'}\, e^{-\beta(H_n + H'_{n'} + W_{n'n} - \mu n)}$, where $\operatorname{Tr}_n$ and $\operatorname{Tr}_{n'}$ refer to the traces over $n$- and $n'$-particle states of the solvent and the solute, respectively. The Hamiltonians of these subsystems and their interactions are denoted by $H_n$, $H'_{n'}$ and $W_{n'n}$.

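The (mean) particle numbers are

\[ N = -\frac{\partial \Phi}{\partial \mu} = -V\,\frac{\partial \varphi_0(T, \mu)}{\partial \mu} + O(z') , \tag{5.5.4a} \]

\[ N' = -\frac{\partial \Phi}{\partial \mu'} = -\frac{z' V}{kT}\,\varphi_1(T, \mu) + O\big(z'^2\big) . \tag{5.5.4b} \]

Inserting this into (5.5.3), where to this order $z' \varphi_1 / (\partial \varphi_0/\partial \mu) = kT N'/N = kT c$, we finally obtain

\[ \mu(T, P, c) = \mu_0(T, P) - kT c + O(c^2) , \tag{5.5.5} \]

where $\mu_0(T, P) \equiv \varphi_0^{-1}(T, -P)$ is the chemical potential of the pure solvent as a function of $T$ and $P$. From (5.5.4b) and (5.5.4a), we find for the chemical potential of the solute:

\[
\begin{aligned}
\mu' = kT \log z' &= kT \log\left(\frac{-N' kT}{V \varphi_1(T, \mu)} + O\big(z'^2\big)\right) \\
&= kT \log\left(\frac{N' kT}{N}\,\frac{\partial \varphi_0(T, \mu)/\partial \mu}{\varphi_1(T, \mu)}\right) + O(z') ;
\end{aligned}
\tag{5.5.6}
\]

and finally, using (5.5.5),

\[ \mu'(T, P, c) = kT \log c + g(T, P) + O(c) . \tag{5.5.7} \]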

In the function $g(T, P) = kT \log\big(-kT / [v_0(T, P)\,\varphi_1(T, \mu_0(T, P))]\big)$, which depends only on the thermodynamic variables $T$ and $P$, the interactions of the dissolved molecules with the solvent also enter; the argument of the logarithm is positive, since $\varphi_1 < 0$ and, by (5.5.4a), $\partial \varphi_0/\partial \mu = -1/v_0$. The simple dependences of the chemical potentials on the concentration are valid so long as one chooses $T$ and $P$ as independent variables. From (5.5.5), we can calculate the pressure as a function of $T$ and $\mu$. To do this, we use $P_0(T, \mu)$, the inverse function of $\mu_0(T, P)$, and rewrite (5.5.5) as follows:

\[ \mu = \mu_0\big(T, P_0(T, \mu) + (P - P_0(T, \mu))\big) - kT c ; \]

we then expand in terms of $P - P_0(T, \mu)$ and use the fact that $\mu_0(T, P_0(T, \mu)) = \mu$ holds for the pure solvent:

\[ \mu = \mu + \left(\frac{\partial \mu_0}{\partial P}\right)_T \big(P - P_0(T, \mu)\big) - kT c . \]

From the Gibbs–Duhem relation, we know that $\left(\frac{\partial \mu_0}{\partial P}\right)_T = v_0(P, T) = v + O(c)$, from which it follows that

\[ P = P_0(T, \mu) + \frac{c}{v}\,kT + O(c^2) , \tag{5.5.8} \]

where $v$ is the specific volume of the solvent. The interactions of the dissolved atoms with the solvent do not enter into $P(T, \mu, c)$ and $\mu(T, P, c)$ to the order we are considering, although we have not made any constraining assumptions about the nature of the interactions.

∗ An Alternate Derivation of (5.5.6) and (5.5.7) in the Canonical Ensemble

We again consider a system with two types of particles which are present in the amounts (particle numbers) $N$ and $N'$, where the concentration of the latter type, $c = \frac{N'}{N} \ll 1$, is very small. The mutual interactions of the dissolved atoms can be neglected in dilute solutions. The interaction of the solvent with the solute is denoted by $W_{N'N}$. Furthermore, the solute is treated classically. We initially make no assumptions regarding the solvent; in particular, it can be in any phase (solid, liquid, gaseous). The partition function of the overall system then takes the form

\[
\begin{aligned}
Z &= \operatorname{Tr}\, e^{-H_N/kT} \int \frac{d\Gamma_{N'}}{N'!\, h^{3N'}}\; e^{-(H'_{N'} + W_{N'N})/kT} \\
  &= \Big(\operatorname{Tr}\, e^{-H_N/kT}\Big)\, \frac{1}{N'!\, \lambda'^{\,3N'}} \left\langle \int d^3x'_1 \dots d^3x'_{N'}\; e^{-(V_{N'} + W_{N'N})/kT} \right\rangle ,
\end{aligned}
\tag{5.5.9a}
\]

where $\lambda'$ is the thermal wavelength of the dissolved substance. $H_N$ and $H'_{N'}$ are the Hamiltonians for the solvent and the solute molecules, $V_{N'}$ denotes the interactions of the solute molecules, and $W_{N'N}$ is the interaction of the solvent with the solute. A configurational contribution also enters into (5.5.9a):

\[
Z_{\mathrm{conf}} = \left\langle \int d^3x'_1 \dots d^3x'_{N'}\; e^{-(V_{N'} + W_{N'N})/kT} \right\rangle
\equiv \frac{\int d^3x'_1 \dots d^3x'_{N'}\, \operatorname{Tr}\, e^{-H_N/kT}\, e^{-(V_{N'} + W_{N'N})/kT}}{\operatorname{Tr}\, e^{-H_N/kT}} .
\tag{5.5.9b}
\]

The trace runs over all the degrees of freedom of the solvent. When the latter must be treated quantum-mechanically, $W_{N'N}$ also contains an additional contribution due to the nonvanishing commutator of $H_N$ and the interactions. $V_{N'}$ depends on the $\{x'\}$ and $W_{N'N}$ on the $\{x'\}$ and $\{x\}$ (coordinates of the solute molecules and the solvent). We assume that the interactions are short-ranged; then $V_{N'}$ can be neglected for all the typical configurations of the dissolved solute molecules:

\[
\begin{aligned}
\Big\langle e^{-(V_{N'} + W_{N'N})/kT} \Big\rangle &\approx \Big\langle e^{-W_{N'N}/kT} \Big\rangle \\
&= \exp\left(-\frac{\langle W_{N'N} \rangle}{kT} + \frac{1}{2}\,\frac{\big\langle (W_{N'N} - \langle W_{N'N} \rangle)^2 \big\rangle}{(kT)^2} \pm \dots\right) \\
&= \exp\left(-\sum_{n'=1}^{N'} \left[\frac{\langle W_{n'N} \rangle}{kT} - \frac{1}{2(kT)^2}\,\big\langle (W_{n'N} - \langle W_{n'N} \rangle)^2 \big\rangle \pm \dots\right]\right) \\
&= e^{-N' \psi(T, V/N)} .
\end{aligned}
\tag{5.5.9c}
\]

Here, $W_{n'N}$ denotes the interaction of molecule $n'$ with the $N$ molecules of the solvent. In Eq. (5.5.9c), a cumulant expansion was carried out and we have taken into account that the overlap of the interactions of different molecules vanishes for all of the typical configurations. Owing to translational invariance, the expectation values $\langle W_{n'N} \rangle$ etc. are furthermore independent of $x'$ and are the same for all $n'$. We thus find for each of the dissolved molecules a factor $e^{-\psi(T, V/N)}$, where $\psi$ depends on the temperature and the specific volume of the solvent. It follows from (5.5.9c) that the partition function (5.5.9a) is

\[ Z = \Big(\operatorname{Tr}\, e^{-H_N/kT}\Big)\, \frac{1}{N'!} \left(\frac{V}{\lambda'^3}\, \psi(T, V/N)\right)^{N'} . \tag{5.5.10} \]

This result has the following physical meaning: the dissolved molecules behave like an ideal gas. They are subject at every point to the same potential from the surrounding solvent atoms, i.e. they are moving in a position-independent effective potential $kT \psi(T, V/N)$, whose value depends on the interactions, the temperature, and the density. The free energy therefore assumes the form

\[ F(T, V, N, N') = F_0(T, V, N) - kT N' \log \frac{eV}{N' \lambda'^3} - N' \gamma(T, V/N) , \tag{5.5.11} \]

where $F_0(T, V, N) = -kT \log \operatorname{Tr}\, e^{-H_N/kT}$ is the free energy of the pure solvent and $\gamma(T, V/N) = kT \log \psi(T, V/N)$ is due to the interactions of the dissolved atoms with the solvent. From (5.5.11), we find for the pressure

\[
\begin{aligned}
P = -\left(\frac{\partial F}{\partial V}\right)_{T,N,N'} &= P_0(T, V/N) + \frac{kT N'}{V} + N' \left(\frac{\partial \gamma}{\partial V}\right)_{T,N} \\
&= P_0(T, v) + \frac{kT c}{v} + c \left(\frac{\partial}{\partial v}\,\gamma(T, v)\right)_T ,
\end{aligned}
\tag{5.5.12}
\]

where $c = \frac{N'}{N}$ and $v = \frac{V}{N}$ were employed. We could calculate the chemical potentials from (5.5.11) as functions of $T$ and $v$. In practice, however, one is usually dealing with physical conditions which fix the pressure instead of the specific volume. In order to obtain the chemical potentials as functions of the pressure, it is expedient to use the free enthalpy (Gibbs free energy). It is found from (5.5.11) and (5.5.12) to be

\[ G = F + PV = G_0(T, P, N) - kT N' \left(\log \frac{eV}{N' \lambda'^3} - 1\right) - N' \left(\gamma - V\,\frac{\partial \gamma}{\partial V}\right) , \tag{5.5.13} \]

where $P_0(T, v)$ and $G_0(T, P, N)$ are the corresponding quantities for the pure solvent. From Equation (5.5.12), one can compute $v$ as a function of $P$, $T$ and $c$, $v = v_0(T, P) + O(N'/N)$. If we insert this in (5.5.13), we find an expression for the free enthalpy of the form

\[ G(T, P, N, N') = G_0(T, P, N) - kT N' \left(\log \frac{N}{N'} - 1\right) + N' g(T, P) + O\left(\frac{N'^2}{N}\right) , \tag{5.5.14} \]

where $g(T, P) = \left.\left[-kT \log \frac{v}{\lambda'^3} - \left(\gamma - V\,\frac{\partial \gamma}{\partial V}\right)\right]\right|_{v = v_0(T, P)}$. Now we can compute the two chemical potentials as functions of $T$, $P$ and $c$. For the chemical potential of the solvent, $\mu(T, P, c) = \left(\frac{\partial G}{\partial N}\right)_{T,P,N'}$, the result to leading order in the concentration is

\[ \mu(T, P, c) = \mu_0(T, P) - kT c + O\big(c^2\big) . \tag{5.5.15} \]

For the chemical potential of the solute, we find from (5.5.14)

\[ \mu'(T, P, c) = \left(\frac{\partial G}{\partial N'}\right)_{N,P,T} = -kT \log \frac{1}{c} + g(T, P) + O(c) . \tag{5.5.16} \]

The results (5.5.15) and (5.5.16) agree with those found in the framework of the grand canonical ensemble (5.5.5) and (5.5.7).

5.5.2 Osmotic Pressure

We let two solutions of the same substances (e.g. salt in water) be separated by a semipermeable membrane (Fig. 5.19). An example of a semipermeable membrane is a cell membrane.

Fig. 5.19. A membrane which allows only the solvent to pass through (= semipermeable) separates the two solutions. · = solvent, • = solute; concentrations c1 and c2

The semipermeable membrane allows only the solvent to pass through. Therefore, in chambers 1 and 2, there will be different concentrations $c_1$ and $c_2$. In equilibrium, the chemical potentials of the solvent on both sides of the membrane are equal, but not those of the solute. The osmotic pressure is defined by the pressure difference $\Delta P = P_1 - P_2$. From (5.5.8), we can calculate the pressure on both sides of the membrane, and since in equilibrium the chemical potentials of the solvent are equal, $\mu_1 = \mu_2$, it follows that the pressure difference is

\[ \Delta P = \frac{c_1 - c_2}{v}\, kT . \tag{5.5.17} \]

The van't Hoff formula is obtained as a special case for $c_2 = 0$, $c_1 = c$, when only the pure solvent is present on one side of the membrane:

\[ \Delta P = \frac{c}{v}\, kT = \frac{N'}{V}\, kT . \tag{5.5.17$'$} \]

Here, $N'$ refers to the number of dissolved molecules in chamber 1 and $V$ to its volume.
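To get a feeling for the magnitude of the osmotic pressure, the van't Hoff formula can be evaluated for a dilute aqueous solution. The following sketch assumes water at 20 °C with a density of 1 g/cm³ and a molar mass of 18.015 g/mol; these inputs and the function name are illustrative assumptions, not values prescribed by the text.

```python
BOLTZMANN = 1.380649e-23      # J/K (exact SI value)
AVOGADRO = 6.02214076e23      # 1/mol (exact SI value)


def osmotic_pressure(c1, c2, temperature, volume_per_solvent_molecule):
    """Pressure difference (c1 - c2) k T / v in pascal, following Eq. (5.5.17)."""
    return (c1 - c2) * BOLTZMANN * temperature / volume_per_solvent_molecule


# Assumed illustrative inputs: water at 20 degrees Celsius.
molar_mass_water = 18.015e-3  # kg/mol
density_water = 1.0e3         # kg/m^3
v_water = molar_mass_water / (density_water * AVOGADRO)  # m^3 per solvent molecule

delta_p_bar = osmotic_pressure(0.01, 0.0, 293.15, v_water) / 1.0e5
print(f"{delta_p_bar:.1f} bar")
```

With these inputs a concentration of only one percent already produces a pressure difference between 13 and 14 bar; the exact figure depends slightly on the temperature and density chosen.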

Notes:
(i) Equation (5.5.17′) holds for small concentrations independently of the nature of the solvent and the solute. We point out the formal similarity between the van't Hoff formula (5.5.17′) and the ideal gas equation. The osmotic pressure of a dilute solution of $n$ moles of the dissolved substance is equal to the pressure that $n$ moles of an ideal gas would exert on the walls of the overall volume $V$ of solution and solvent.
(ii) One can gain a physical understanding of the origin of the osmotic pressure as follows: the concentrated part of the solution has a tendency to expand into the less concentrated region, and thus to equalize the concentrations.
(iii) For an aqueous solution of concentration $c = 0.01$, the osmotic pressure at room temperature amounts to $\Delta P = 13.3$ bar.

∗5.5.3 Solutions of Hydrogen in Metals (Nb, Pd, ...)

We now apply the results of Sect. 5.5.1 to an important practical example, the solution of hydrogen in metals such as Nb, Pd,. . . (Fig. 5.20). In the gas phase, hydrogen occurs in molecular form as H2 , while in metals, it dissociates. We thus have a case of chemical equilibrium, see Sect. 3.9.3.

Fig. 5.20. Solution of hydrogen in metals: atomic hydrogen in a metal is represented by a dot, while molecular hydrogen in the surrounding gas phase is represented by a pair of dots.

The chemical potential of molecular hydrogen gas is

\[ \mu_{H_2} = -kT \left[\log \frac{V}{N \lambda_{H_2}^3} + \log Z_i\right] = -kT \left[\log \frac{kT}{P \lambda_{H_2}^3} + \log Z_i\right] , \tag{5.5.18} \]

where $Z_i$ also contains the electronic contribution to the partition function (Eq. (5.1.5c)). The chemical potential of atomic hydrogen dissolved in a metal is, according to Eq. (5.5.7), given by

\[ \mu_H = kT \log c + g(T, P) . \tag{5.5.19} \]

The metals mentioned can be used for hydrogen storage. The condition for chemical equilibrium (3.9.26) is in this case $2\mu_H = \mu_{H_2}$; this yields the equilibrium concentration:

\[ c = e^{(\mu_{H_2}/2 - g(T, P))/kT} = \left(\frac{P \lambda_{H_2}^3}{kT}\right)^{1/2} Z_i^{-1/2}\, \exp\left(\frac{-2 g(T, P) + \varepsilon_{el}}{2kT}\right) . \tag{5.5.20} \]

Since $g(T, P)$ depends only weakly on $P$, the concentration of dissolved hydrogen is $c \sim P^{1/2}$. This dependence is known as Sieverts' law.

5.5.4 Freezing-Point Depression, Boiling-Point Elevation, and Vapor-Pressure Reduction

Before we turn to a quantitative treatment of freezing-point depression, boiling-point elevation, and vapor-pressure reduction, we begin with a qualitative discussion of these phenomena. The free enthalpy of the liquid phase of a solution is lowered, according to Eq. (5.5.5), relative to its value in the pure solvent, an effect which can be interpreted in terms of an increase in entropy. The free enthalpies of the solid and gaseous phases remain unchanged. In Fig. 5.21, $G(T, P)$ is shown qualitatively as a function of the temperature and the pressure, keeping in mind its convexity, and assuming that the dissolved substance is soluble only in the liquid phase. The solid curve describes the pure solvent, while the change due to the dissolved substance is described by the chain curve. As a rule, the concentration of the solute in the liquid phase is largest and the associated entropy increase leads to a reduction of the free enthalpy. From these two diagrams, the depression of the freezing point, the elevation of the boiling point, and the reduction in the vapor pressure can be read off.

Fig. 5.21. The change in the free enthalpy on solution of a substance which dissolves to a notable extent only in the liquid phase. The solid curve is for the pure solvent, the chain curve for the solution. We can recognize the freezing-point depression, the boiling-point elevation, and the vapor-pressure reduction

Next we turn to the analytic treatment of these phenomena. We first consider the melting process. The concentrations of the dissolved substance in the liquid and solid phases are $c_L$ and $c_S$.¹⁵ The chemical potentials of the solvent in the liquid and the solid phase are denoted by $\mu^L$ and $\mu^S$, and correspondingly in the pure system by $\mu^L_0$ and $\mu^S_0$. From Eq. (5.5.5), we find that

\[ \mu^L = \mu^L_0(P, T) - kT c_L \qquad \text{and} \qquad \mu^S = \mu^S_0(P, T) - kT c_S . \]

In equilibrium, the chemical potentials of the solvent must be equal, $\mu^L = \mu^S$, from which it follows that¹⁶

\[ \mu^L_0(P, T) - kT c_L = \mu^S_0(P, T) - kT c_S . \tag{5.5.21} \]

For the pure solvent, we obtain the melting curve, i.e. the relation between the melting pressure $P_0$ and the melting temperature $T_0$, from

\[ \mu^L_0(P_0, T_0) = \mu^S_0(P_0, T_0) . \tag{5.5.22} \]

Let $(P_0, T_0)$ be a point on the melting curve of the pure solvent. Then consider a point $(P, T)$ on the melting curve which obeys (5.5.21), and which is shifted relative to $(P_0, T_0)$ by $\Delta P$ and $\Delta T$, that is

\[ P = P_0 + \Delta P , \qquad T = T_0 + \Delta T . \]

If we expand Eq. (5.5.21) in terms of $\Delta P$ and $\Delta T$, and use (5.5.22), we find the following relation

\[ \left(\frac{\partial \mu^L_0}{\partial P}\right)_0 \Delta P + \left(\frac{\partial \mu^L_0}{\partial T}\right)_0 \Delta T - kT c_L = \left(\frac{\partial \mu^S_0}{\partial P}\right)_0 \Delta P + \left(\frac{\partial \mu^S_0}{\partial T}\right)_0 \Delta T - kT c_S . \tag{5.5.23} \]

We now recall that $G = \mu N = E - TS + PV$, and using it we obtain

\[ dG = -S\,dT + V\,dP + \mu\,dN = d(\mu N) = \mu\,dN + N\,d\mu , \]

\[ \left(\frac{\partial \mu}{\partial P}\right)_{T,N} = \frac{V}{N} = v , \qquad \left(\frac{\partial \mu}{\partial T}\right)_{P,N} = -\frac{S}{N} = -s . \]

The derivatives in (5.5.23) can therefore be expressed in terms of the volumes per molecule $v_L$ and $v_S$, and the entropies per molecule $s_L$ and $s_S$ in the liquid and solid phases of the pure solvent,

\[ -(s_S - s_L)\,\Delta T + (v_S - v_L)\,\Delta P = (c_S - c_L)\,kT . \tag{5.5.24} \]

Finally, we introduce the heat of melting $q = T(s_L - s_S)$, thus obtaining

\[ \frac{q}{T}\,\Delta T + (v_S - v_L)\,\Delta P = (c_S - c_L)\,kT . \tag{5.5.25} \]

¹⁵ Since two phases and two components are present, the number of degrees of freedom is two (Gibbs' phase rule). One can for example fix the temperature and one concentration; then the other concentration and the pressure are determined.

¹⁶ The chemical potentials of the solute must of course also be equal. From this fact, we can for example express the concentration in the solid phase, $c_S$, in terms of $T$ and $c_L$. We shall, however, not need the exact value of $c_S$, since $c_S \ll c_L$ is negligible.

The change in the transition temperature $\Delta T$ at a given pressure is obtained from (5.5.25), by setting $P = P_0$ or $\Delta P = 0$:

\[ \Delta T = \frac{kT^2}{q}\,(c_S - c_L) . \tag{5.5.26} \]

As a rule, the concentration in the solid phase is much lower than that in the liquid phase, i.e. $c_S \ll c_L$; then (5.5.26) simplifies to

\[ \Delta T = -\frac{kT^2}{q}\,c_L < 0 . \tag{5.5.26$'$} \]

Since the entropy of the liquid is larger, or, equivalently, since heat is absorbed on melting, it follows that $q > 0$. As a result, the dissolution of a substance gives rise to a freezing-point depression.

Note: On solidification of a liquid, at first (5.5.26′) holds, with the initial concentration $c_L$. Since, however, pure solvent precipitates out in solid form, the concentration $c_L$ increases, so that further cooling is required to allow the freezing process to continue. Freezing of a solution thus occurs over a finite temperature interval.

The above results can be transferred directly to the evaporation process. To do this, we make the replacements L→G, S→L and obtain from (5.5.25) for the liquid phase (L) and the gas phase (G) the relation

\[ \frac{q}{T}\,\Delta T + (v_L - v_G)\,\Delta P = (c_L - c_G)\,kT . \tag{5.5.27} \]

Setting $\Delta P = 0$ in (5.5.27), we find

\[ \Delta T = \frac{kT^2}{q}\,(c_L - c_G) \approx \frac{kT^2}{q}\,c_L > 0 , \tag{5.5.28} \]

a boiling-point elevation. In the last equation, $c_L \gg c_G$ was assumed (this no longer holds near the critical point). Setting $\Delta T = 0$ in (5.5.27), we find

\[ \Delta P = \frac{c_L - c_G}{v_L - v_G}\,kT \approx -\frac{c_L - c_G}{v_G}\,kT , \tag{5.5.29} \]

a vapor-pressure reduction. When the gas phase contains only the vapor of the pure solvent, (5.5.29) simplifies to

\[ \Delta P = -\frac{c_L}{v_G}\,kT . \tag{5.5.30} \]

Inserting the ideal gas equation, $P v_G = kT$, we have $\Delta P = -c_L P = -c_L (P_0 + \Delta P)$. Rearrangement of the last equation yields the relative pressure change:

\[ \frac{\Delta P}{P_0} = -\frac{c_L}{1 + c_L} \approx -c_L , \tag{5.5.31} \]

known as Raoult's law. The relative vapor-pressure reduction increases linearly with the concentration of the dissolved substance. The results derived here are in agreement with the qualitative considerations given at the beginning of this subsection.
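The three results (5.5.26), (5.5.28) and (5.5.31) can be evaluated for a concrete solvent. The sketch below uses molar quantities, so that $kT^2/q$ becomes $RT^2/L$ with the molar latent heat $L$; the numerical inputs for water (heat of melting about 6.01 kJ/mol, heat of vaporization about 40.66 kJ/mol) are approximate handbook values introduced here as assumptions, and the function names are illustrative.

```python
GAS_CONSTANT = 8.314462618  # J/(mol K); k T^2 / q per molecule equals R T^2 / L per mole


def freezing_point_shift(t_melt, molar_heat_of_melting, c_liquid, c_solid=0.0):
    """Eq. (5.5.26): dT = (k T^2 / q) (c_S - c_L), written per mole."""
    return GAS_CONSTANT * t_melt**2 / molar_heat_of_melting * (c_solid - c_liquid)


def boiling_point_shift(t_boil, molar_heat_of_vaporization, c_liquid, c_gas=0.0):
    """Eq. (5.5.28): dT = (k T^2 / q) (c_L - c_G), written per mole."""
    return GAS_CONSTANT * t_boil**2 / molar_heat_of_vaporization * (c_liquid - c_gas)


def relative_vapor_pressure_change(c_liquid):
    """Eq. (5.5.31), Raoult's law: dP / P_0 = -c_L / (1 + c_L)."""
    return -c_liquid / (1.0 + c_liquid)


# Assumed illustrative inputs for water at atmospheric pressure.
c = 0.01
dT_freeze = freezing_point_shift(273.15, 6.01e3, c)   # heat of melting ~6.01 kJ/mol
dT_boil = boiling_point_shift(373.15, 40.66e3, c)     # heat of vaporization ~40.66 kJ/mol
dP_rel = relative_vapor_pressure_change(c)
print(dT_freeze, dT_boil, dP_rel)
```

For the same concentration the freezing point moves by about one kelvin while the boiling point moves by only a few tenths of a kelvin, because the heat of vaporization is much larger than the heat of melting.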

Problems for Chapter 5

5.1 The rotational motion of a diatomic molecule is described by the angular variables $\vartheta$ and $\varphi$ and the canonically conjugate momenta $p_\vartheta$ and $p_\varphi$ with the Hamilton function $H = \frac{p_\vartheta^2}{2I} + \frac{1}{2I \sin^2 \vartheta}\,p_\varphi^2$. Calculate the classical partition function for the canonical ensemble (see footnote 4 to Eq. (5.1.15)). Result: $Z_{rot} = \frac{2T}{\Theta_r}$.

5.2 Confirm the formulas (5.4.13) for the critical pressure, the critical volume, and the critical temperature of a van der Waals gas and the expansion (5.4.18) of $P(T, V)$ around the critical point up to the third order in $\Delta v$.

5.3 The expansion of the van der Waals equation in the vicinity of the critical point:
(a) Why is it permissible in the determination of the leading order to leave off the term $\Delta T (\Delta V)^2$ in comparison to $(\Delta V)^3$?
(b) Calculate the correction $O(\Delta T)$ to the coexistence curve.
(c) Calculate the correction $O\big((T - T_c)^2\big)$ to the internal energy.

5.4 The equation of state for a van der Waals gas is given in terms of reduced variables in Eq. (5.4.16). Calculate the position of the inversion points (Chap. 3) in the $p^*$, $T^*$ diagram. Where is the maximum of the curve?

5.5 Calculate the jump in the specific heat $c_v$ for a van der Waals gas at a specific volume of $v = v_c$.

5.6 Show in general and for the van der Waals equation that $\kappa_s$ and $c_v$ exhibit the same behavior for $T \to T_c$.

5.7 Consider two metals 1 and 2 (with melting points $T_1$, $T_2$ and temperature-independent heats of melting $q_1$, $q_2$), which form ideal mixtures in the liquid phase (i.e. as for small concentrations over the whole concentration range). In the solid phase these metals are not miscible. Calculate the eutectic point $T_E$ (see also Sect. 3.9.2).
Hint: Set up the equilibrium conditions between pure solid phase 1 or 2 and the liquid phase. From these, the concentrations are determined:
\[ c_i = e^{\lambda_i} , \quad \text{where} \quad \lambda_i = \frac{q_i}{kT_i}\left(1 - \frac{T_i}{T}\right) ; \quad i = 1, 2 \]
using
\[ \frac{\partial (G/T)}{\partial T} = -H/T^2 , \qquad q_i = \Delta H_i , \qquad G = \mu N . \]

5.8 Apply the van't Hoff formula (5.5.17′) to the following simple example: the concentration of the dissolved substance is taken to be $c = 0.01$, the solvent is water (at 20 °C); use $\rho_{H_2O} = 1$ g/cm³ (20 °C). Find the osmotic pressure $\Delta P$.

6. Magnetism

In this chapter, we will deal with the fundamental phenomenon of magnetism. We begin the ﬁrst section by setting up the density matrix, starting from the Hamiltonian, and using it to derive the thermodynamic relations for magnetic systems. Then we continue with the treatment of diamagnetic and paramagnetic substances (Curie and Pauli paramagnetism). Finally, in Sect. 6.5.1, we investigate ferromagnetism. The basic properties of magnetic phase transitions will be studied in the molecular-ﬁeld approximation (Curie–Weiss law, Ornstein-Zernike correlation function, etc.). The results obtained will form the starting point for the renormalization group theory of critical phenomena which is dealt with in the following chapter.

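6.1 The Density Matrix and Thermodynamics

6.1.1 The Hamiltonian and the Canonical Density Matrix

We first summarize some facts about magnetic properties as known from electrodynamics and quantum mechanics. The Hamiltonian for $N$ electrons in a magnetic field $\mathbf{H} = \operatorname{curl} \mathbf{A}$ is:

\[ \mathcal{H} = \sum_{i=1}^{N} \left[\frac{1}{2m}\left(\mathbf{p}_i - \frac{e}{c}\,\mathbf{A}(\mathbf{x}_i)\right)^2 - \boldsymbol{\mu}_i^{\mathrm{spin}} \cdot \mathbf{H}(\mathbf{x}_i)\right] + W_{\mathrm{Coul}} . \tag{6.1.1} \]

The index $i$ enumerates the electrons. The canonical momentum of the $i$th electron is $\mathbf{p}_i$ and the kinetic momentum is $m\mathbf{v}_i = \mathbf{p}_i - \frac{e}{c}\,\mathbf{A}(\mathbf{x}_i)$. The charge and the magnetic moment are given by¹

\[ e = -e_0 , \qquad \boldsymbol{\mu}_i^{\mathrm{spin}} = -\frac{g_e \mu_B}{\hbar}\,\mathbf{S}_i , \tag{6.1.2a} \]

where along with the elementary charge $e_0$, the Bohr magneton

\[ \mu_B = \frac{e_0 \hbar}{2mc} = 0.927 \cdot 10^{-20}\,\frac{\mathrm{erg}}{\mathrm{Gauss}} = 0.927 \cdot 10^{-23}\,\frac{\mathrm{J}}{\mathrm{T}} \tag{6.1.2b} \]

as well as the Landé g-factor or the spectroscopic splitting factor of the electron

\[ g_e = 2.0023 \tag{6.1.2c} \]

were introduced. The quantity $\gamma = \frac{e g_e}{2mc} = -\frac{g_e \mu_B}{\hbar}$ is called the magnetomechanical ratio or gyromagnetic ratio. The last term in (6.1.1) stands for the Coulomb interaction of the electrons with each other and with the nuclei. The dipole-dipole interaction of the spins is neglected here. Its consequences, such as the demagnetizing field, will be considered in Sect. 6.6; see also remark (ii) at the end of Sect. 6.6.3. We assume that the magnetic field $\mathbf{H}$ is produced by some external sources. In vacuum, $\mathbf{B} = \mathbf{H}$ holds. We use here the magnetic field $\mathbf{H}$, corresponding to the more customary practice in the literature on magnetism. The current-density operator is thus given by²

\[ \mathbf{j}(\mathbf{x}) \equiv -c\,\frac{\delta \mathcal{H}}{\delta \mathbf{A}(\mathbf{x})} = \sum_{i=1}^{N} \left\{\frac{e}{2m}\left[\left(\mathbf{p}_i - \frac{e}{c}\,\mathbf{A}(\mathbf{x}_i)\right), \delta(\mathbf{x} - \mathbf{x}_i)\right]_+ + c\,\operatorname{curl}\left(\boldsymbol{\mu}_i^{\mathrm{spin}}\,\delta(\mathbf{x} - \mathbf{x}_i)\right)\right\} \tag{6.1.3} \]

with $[A, B]_+ = AB + BA$. The current density contains a contribution from the electronic orbital motion and a spin contribution. For the total magnetic moment, one obtains³,⁴:

\[ \boldsymbol{\mu} \equiv \frac{1}{2c} \int d^3x\; \mathbf{x} \times \mathbf{j}(\mathbf{x}) = \sum_{i=1}^{N} \left\{\frac{e}{2mc}\,\mathbf{x}_i \times \left(\mathbf{p}_i - \frac{e}{c}\,\mathbf{A}(\mathbf{x}_i)\right) + \boldsymbol{\mu}_i^{\mathrm{spin}}\right\} . \tag{6.1.4} \]

¹ QM I, p. 186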

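When $\mathbf{H}$ is uniform, Eq. (6.1.4) can also be written in the form

\[ \boldsymbol{\mu} = -\frac{\partial \mathcal{H}}{\partial \mathbf{H}} . \tag{6.1.5} \]

The magnetic moment of the $i$th electron for a uniform magnetic field (see Remark (iv) in Sect. 6.1.3) is, according to Eq. (6.1.4), given by

\[
\begin{aligned}
\boldsymbol{\mu}_i &= \boldsymbol{\mu}_i^{\mathrm{spin}} + \frac{e}{2mc}\,\mathbf{L}_i - \frac{e^2}{2mc^2}\,\mathbf{x}_i \times \mathbf{A}(\mathbf{x}_i) \\
&= \frac{e}{2mc}\,(\mathbf{L}_i + g_e \mathbf{S}_i) - \frac{e^2}{4mc^2}\left(\mathbf{H}\,\mathbf{x}_i^2 - \mathbf{x}_i\,(\mathbf{x}_i \cdot \mathbf{H})\right) .
\end{aligned}
\tag{6.1.6}
\]

² The intermediate steps which lead to (6.1.3)–(6.1.5) will be given at the end of this section.

³ J. D. Jackson, Classical Electrodynamics, 2nd edition, John Wiley and Sons, New York, 1975, p. 18.

⁴ Magnetic moments are denoted throughout by $\boldsymbol{\mu}$, except for the spin magnetic moments of elementary particles, which are termed $\boldsymbol{\mu}^{\mathrm{spin}}$.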
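The numerical value of the Bohr magneton quoted in (6.1.2b) can be cross-checked from the fundamental constants. The sketch below works in SI units, where the Bohr magneton reads $e_0\hbar/(2m)$ and describes the same magnetic moment as the Gaussian expression $e_0\hbar/(2mc)$; the constants are CODATA values inserted here as assumptions.

```python
ELEMENTARY_CHARGE = 1.602176634e-19   # C (exact SI value)
HBAR = 1.054571817e-34                # J s
ELECTRON_MASS = 9.1093837015e-31      # kg (CODATA 2018)
G_ELECTRON = 2.0023                   # value quoted in Eq. (6.1.2c)

# SI form of the Bohr magneton.
mu_bohr = ELEMENTARY_CHARGE * HBAR / (2.0 * ELECTRON_MASS)   # J/T
mu_bohr_cgs = mu_bohr * 1.0e3                                # erg/G, since 1 J/T = 1e3 erg/G

# Magnitude of the spin moment for a spin projection of hbar/2: g_e * mu_B / 2.
spin_moment = 0.5 * G_ELECTRON * mu_bohr
print(mu_bohr, mu_bohr_cgs, spin_moment)
```

Because $g_e$ is close to 2, the spin moment of a single electron is almost exactly one Bohr magneton.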

6.1 The Density Matrix and Thermodynamics

271

If H = Hez , then (for a single particle) it follows that µi z =

∂H e e2 H 2 (Li + ge Si )z − , xi + yi2 = − 2 2mc 4mc ∂H

and the Hamiltonian is5 H=

N " 2 p

# e e2 H 2 2 2 − (Li + 2Si )z H + x + yi + WCoul . (6.1.7) 2m 2mc 8mc2 i i

i=1

Here, we have used $g_e = 2$. We now wish to set up the density matrices for magnetic systems; we can follow the steps in Chap. 2 to do this. An isolated magnetic system is described by a microcanonical ensemble,

$$\rho_{MC} = \delta(\mathcal{H} - E)/\Omega(E, \mathbf{H}) \quad\text{with}\quad \Omega(E, \mathbf{H}) = {\rm Tr}\,\delta(\mathcal{H} - E),$$

where, for the Hamiltonian, (6.1.1) is to be inserted. If the magnetic system is in contact with a heat bath, with which it can exchange energy, then one finds for the magnetic subsystem, just as in Chap. 2, the canonical density matrix

$$\rho = \frac{1}{Z}\,e^{-\mathcal{H}/kT}. \tag{6.1.8}$$

The normalization factor is given by the partition function

$$Z = {\rm Tr}\,e^{-\mathcal{H}/kT}. \tag{6.1.9a}$$

The canonical parameters (natural variables) are here the temperature, whose reciprocal is defined as in Chap. 2 in the microcanonical ensemble as the derivative of the entropy of the heat bath with respect to its energy, and the external magnetic field $\mathbf{H}$.⁶ Correspondingly, the canonical free energy,

$$F(T, \mathbf{H}) = -kT\log Z, \tag{6.1.9b}$$

is a function of $T$ and $\mathbf{H}$. The entropy $S$ and the internal energy $E$ are, by definition, calculated from

$$S = -k\langle\log\rho\rangle = \frac{1}{T}\,(E - F) \tag{6.1.10}$$

and

$$E = \langle\mathcal{H}\rangle. \tag{6.1.11}$$

⁵ See e.g. QM I, Sect. 7.2.
⁶ In this chapter we limit our considerations to magnetic effects. Therefore, the particle number and the volume are treated as fixed. For phenomena such as magnetostriction, it is necessary to consider also the dependence of the free energy on the volume and more generally on the deformation tensor of the solid (see also the remark in 6.1.2.4).

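The magnetic moment of the entire body is defined as the thermal average of the total quantum-mechanical magnetic moment,

$$\mathbf{M} \equiv \langle\boldsymbol{\mu}\rangle = -\left\langle\frac{\partial\mathcal{H}}{\partial\mathbf{H}}\right\rangle. \tag{6.1.12}$$

The magnetization $\boldsymbol{M}$ (set in italic here to distinguish it from the total moment $\mathbf{M}$) is defined as the magnetic moment per unit volume, i.e. for a uniformly magnetized body

$$\boldsymbol{M} = \frac{1}{V}\,\mathbf{M} \tag{6.1.13a}$$

and, in general,

$$\mathbf{M} = \int d^3x\;\boldsymbol{M}(\mathbf{x}). \tag{6.1.13b}$$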

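For the differential of $F$, we find from (6.1.9)–(6.1.10)

$$dF = (F - E)\,\frac{dT}{T} - \mathbf{M}\cdot d\mathbf{H} \equiv -S\,dT - \mathbf{M}\cdot d\mathbf{H}, \tag{6.1.14a}$$

that is

$$\left(\frac{\partial F}{\partial T}\right)_{\mathbf{H}} = -S \quad\text{and}\quad \left(\frac{\partial F}{\partial\mathbf{H}}\right)_T = -\mathbf{M}. \tag{6.1.14b}$$

Using equation (6.1.10), one can express the internal energy $E$ in terms of $F$ and $S$ and obtain from (6.1.14a) the First Law for magnetic systems:

$$dE = T\,dS - \mathbf{M}\cdot d\mathbf{H}. \tag{6.1.15}$$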

The internal energy $E$ contains the interaction of the magnetic moments with the magnetic field (see (6.1.7)). Compared to a gas, we have to make the following formal replacements in the First Law: $V \to \mathbf{H}$, $P \to \mathbf{M}$. Along with the (canonical) free energy $F(T, \mathbf{H})$, we introduce also the Helmholtz free energy⁷

$$A(T, \mathbf{M}) = F(T, \mathbf{H}) + \mathbf{M}\cdot\mathbf{H}. \tag{6.1.16}$$

Its differential is

$$dA = -S\,dT + \mathbf{H}\cdot d\mathbf{M}, \tag{6.1.17a}$$

i.e.

$$\left(\frac{\partial A}{\partial T}\right)_{\mathbf{M}} = -S \quad\text{and}\quad \left(\frac{\partial A}{\partial\mathbf{M}}\right)_T = \mathbf{H}. \tag{6.1.17b}$$

⁷ The notation of the magnetic potentials is not uniform in the literature. This is true not only of the choice of symbols; even the potential $F(T, \mathbf{H})$, which depends on $\mathbf{H}$, is sometimes referred to as the Helmholtz free energy.
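The Legendre transformation (6.1.16) and the relation $(\partial A/\partial M)_T = H$ from (6.1.17b) can be checked numerically. The following is a minimal sketch under an assumption that is not part of the derivation above: it uses a toy system of $N$ independent two-level moments of magnitude `MU0`, whose free energy is $F = -NkT\log[2\cosh(\mu_0H/kT)]$. All names and parameter values are illustrative.

```python
import math

# Assumed toy model (illustrative only): N independent two-level moments of
# magnitude MU0, with F(T, H) = -N k T log(2 cosh(MU0 H / kT)).
N, MU0, K = 1.0, 1.0, 1.0

def free_energy_F(T, H):
    """Canonical free energy F(T, H) of the toy model."""
    return -N * K * T * math.log(2.0 * math.cosh(MU0 * H / (K * T)))

def total_moment(T, H):
    """Total moment M = -(dF/dH)_T, cf. (6.1.14b)."""
    return N * MU0 * math.tanh(MU0 * H / (K * T))

def field_of_moment(T, M):
    """Inverse of total_moment: the field H that produces the moment M."""
    return (K * T / MU0) * math.atanh(M / (N * MU0))

def helmholtz_A(T, M):
    """A(T, M) = F(T, H) + M H with H = H(T, M), cf. (6.1.16)."""
    H = field_of_moment(T, M)
    return free_energy_F(T, H) + M * H

def dA_dM(T, M, step=1e-6):
    """Central finite difference for (dA/dM)_T."""
    return (helmholtz_A(T, M + step) - helmholtz_A(T, M - step)) / (2.0 * step)

if __name__ == "__main__":
    T, H = 1.5, 0.7
    M = total_moment(T, H)
    print("applied field H      :", H)
    print("numerical (dA/dM)_T  :", dA_dM(T, M))
```

The finite-difference derivative of $A$ with respect to $M$ should reproduce the field that was used to generate $M$, which is the content of (6.1.17b).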


6.1.2 Thermodynamic Relations

∗6.1.2.1 Thermodynamic Potentials

At this point, we summarize the definitions of the two potentials introduced in the preceding subsection. The following compilation, which indicates the systematic structure of the material, can be skipped over in a first reading:

$$F = F(T, \mathbf{H}) = E - TS, \qquad dF = -S\,dT - \mathbf{M}\cdot d\mathbf{H}, \tag{6.1.18a}$$

$$A = A(T, \mathbf{M}) = E - TS + \mathbf{M}\cdot\mathbf{H}, \qquad dA = -S\,dT + \mathbf{H}\cdot d\mathbf{M}. \tag{6.1.18b}$$

In comparison to liquids, the thermodynamic variables here are $T$, $\mathbf{H}$ and $\mathbf{M}$ instead of $T$, $P$ and $V$. The thermodynamic relations listed can be read off from the corresponding relations for liquids by making the substitutions $V \to -\mathbf{M}$ and $P \to \mathbf{H}$. There is also another analogy between magnetic systems and liquids: the grand canonical density matrix contains the term $-\mu N$, which in a magnetic system corresponds to $-\mathbf{H}\cdot\mathbf{M}$. Particularly in the low-temperature region, where the properties of a magnetic system can be described in terms of spin waves (magnons), this analogy is useful. There, the value of the magnetization is determined by the number of thermally-excited spin waves. Therefore, we find the correspondence $M \leftrightarrow N$ and $H \leftrightarrow \mu$. The Maxwell relations follow from (6.1.15) and (6.1.18a,b):

$$\left(\frac{\partial T}{\partial H}\right)_S = -\left(\frac{\partial M}{\partial S}\right)_H, \qquad \left(\frac{\partial S}{\partial H}\right)_T = \left(\frac{\partial M}{\partial T}\right)_H. \tag{6.1.19}$$

∗6.1.2.2 Magnetic Response Functions, Specific Heats, and Susceptibilities

Analogously to the specific heats of liquids, we define here the specific heats $C_M$ and $C_H$ (at constant $M$ and $H$) as⁸

$$C_M \equiv T\left(\frac{\partial S}{\partial T}\right)_M = -T\left(\frac{\partial^2 A}{\partial T^2}\right)_M, \tag{6.1.20a}$$

$$C_H \equiv T\left(\frac{\partial S}{\partial T}\right)_H = \left(\frac{\partial E}{\partial T}\right)_H = -T\left(\frac{\partial^2 F}{\partial T^2}\right)_H. \tag{6.1.20b}$$

Instead of the compressibilities as for liquids, in the magnetic case one has the isothermal susceptibility

$$\chi_T \equiv \left(\frac{\partial M}{\partial H}\right)_T = -\frac{1}{V}\left(\frac{\partial^2 F}{\partial H^2}\right)_T \tag{6.1.21a}$$

⁸ To keep the notation simple, we will often write $\mathbf{H}$ and $\mathbf{M}$ as $H$ and $M$, making the assumption that $\mathbf{M}$ is parallel to $\mathbf{H}$ and that $H$ and $M$ are the components in the direction of $\mathbf{H}$.


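and the adiabatic susceptibility

$$\chi_S \equiv \left(\frac{\partial M}{\partial H}\right)_S = -\frac{1}{V}\left(\frac{\partial^2 E}{\partial H^2}\right)_S. \tag{6.1.21b}$$

In analogy to Chap. 3, one finds that

$$C_H - C_M = TV\alpha_H^2/\chi_T, \tag{6.1.22a}$$

$$\chi_T - \chi_S = TV\alpha_H^2/C_H \tag{6.1.22b}$$

and

$$\frac{C_H}{C_M} = \frac{\chi_T}{\chi_S}. \tag{6.1.22c}$$

Here, we have defined

$$\alpha_H \equiv \left(\frac{\partial M}{\partial T}\right)_H. \tag{6.1.23}$$

Eq. (6.1.22a) can also be rewritten as

$$C_H - C_M = TV\alpha_M^2\,\chi_T, \tag{6.1.22d}$$

where

$$\alpha_M = \left(\frac{\partial H}{\partial T}\right)_M = -\frac{\alpha_H}{\chi_T} \tag{6.1.22e}$$

was used.

∗6.1.2.3 Stability Criteria and the Convexity of the Free Energy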

One can also derive inequalities of the type (3.3.5) and (3.3.6) for the magnetic susceptibilities and the specific heats:

$$\chi_T \ge 0, \qquad C_H \ge 0 \qquad\text{and}\qquad C_M \ge 0. \tag{6.1.24a,b,c}$$

To derive these inequalities on a statistical-mechanical basis, we assume that the Hamiltonian has the form

$$\mathcal{H} = \mathcal{H}_0 - \boldsymbol{\mu}\cdot\mathbf{H}, \tag{6.1.25}$$

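where $\mathbf{H}$ thus enters only linearly and $\boldsymbol{\mu}$ commutes with $\mathcal{H}$. It then follows that

$$\chi_T = \frac{1}{V}\left(\frac{\partial\langle\mu\rangle}{\partial H}\right)_T = \frac{1}{V}\left(\frac{\partial}{\partial H}\,\frac{{\rm Tr}\,e^{-\beta\mathcal{H}}\mu}{{\rm Tr}\,e^{-\beta\mathcal{H}}}\right)_T = \frac{\beta}{V}\left\langle(\mu - \langle\mu\rangle)^2\right\rangle \ge 0 \tag{6.1.26a}$$

and

$$C_H = \left(\frac{\partial\langle\mathcal{H}\rangle}{\partial T}\right)_H = \left(\frac{\partial}{\partial T}\,\frac{{\rm Tr}\,e^{-\beta\mathcal{H}}\mathcal{H}}{{\rm Tr}\,e^{-\beta\mathcal{H}}}\right)_H = \frac{1}{kT^2}\left\langle(\mathcal{H} - \langle\mathcal{H}\rangle)^2\right\rangle \ge 0, \tag{6.1.26b}$$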

with which we have demonstrated (6.1.24a) and (6.1.24b). Eq. (6.1.24c) can be shown by taking the second derivative of $A(T, M) = F(T, H) + HM$ with respect to the temperature at constant $M$ (problem 6.1). As a result, $F(T, H)$ is concave⁹ in $T$ and in $H$, while $A(T, M)$ is concave in $T$ and convex in $M$:

$$\left(\frac{\partial^2 A}{\partial T^2}\right)_M = -\frac{C_M}{T} \le 0, \qquad \left(\frac{\partial^2 A}{\partial M^2}\right)_T = \left(\frac{\partial H}{\partial M}\right)_T = 1/\chi_T \ge 0.$$

In this derivation, we have used the fact that the Hamiltonian $\mathcal{H}$ has the general form (6.1.25), and therefore, diamagnetic effects (proportional to $H^2$) are negligible.

Remark: In analogy to the extremal properties treated in Sect. 3.6.4, the canonical free energy $F$ for fixed $T$ and $H$ in magnetic systems strives towards a minimal value, as does the Helmholtz free energy $A$ for fixed $T$ and $M$. At these minima, the stationarity conditions $\delta F = 0$ and $\delta A = 0$ hold, i.e.: $dF < 0$ when $T$ and $H$ are fixed, and $dA < 0$ when $T$ and $M$ are fixed.
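The fluctuation formulas (6.1.26a) and (6.1.26b) can be verified numerically for any system whose Hamiltonian has the form (6.1.25). The sketch below uses a hypothetical three-level system (the level energies and moments in `LEVELS` are arbitrary illustrative numbers, with $k = 1$ and $V = 1$) and compares finite-difference derivatives of $\langle\mu\rangle$ and $\langle\mathcal{H}\rangle$ with the corresponding variances.

```python
import math

# Hypothetical three-level system: (eigenvalue of H0, eigenvalue of mu_z).
# The operators are diagonal in the same basis, so mu commutes with H.
LEVELS = [(0.0, 1.0), (0.3, 0.0), (0.1, -1.0)]
K = 1.0  # Boltzmann constant in arbitrary units; the volume is set to V = 1

def canonical_averages(T, H):
    """Return <mu>, Var(mu), <H>, Var(H) in the canonical ensemble."""
    beta = 1.0 / (K * T)
    energies = [e0 - mu * H for e0, mu in LEVELS]      # form (6.1.25)
    weights = [math.exp(-beta * e) for e in energies]
    Z = sum(weights)
    probs = [w / Z for w in weights]
    mu_avg = sum(p * mu for p, (_, mu) in zip(probs, LEVELS))
    mu_var = sum(p * (mu - mu_avg) ** 2 for p, (_, mu) in zip(probs, LEVELS))
    e_avg = sum(p * e for p, e in zip(probs, energies))
    e_var = sum(p * (e - e_avg) ** 2 for p, e in zip(probs, energies))
    return mu_avg, mu_var, e_avg, e_var

def chi_T_both_ways(T, H, dH=1e-5):
    """(d<mu>/dH)_T by finite differences, and beta * Var(mu), cf. (6.1.26a)."""
    derivative = (canonical_averages(T, H + dH)[0]
                  - canonical_averages(T, H - dH)[0]) / (2.0 * dH)
    fluctuation = canonical_averages(T, H)[1] / (K * T)
    return derivative, fluctuation

def C_H_both_ways(T, H, dT=1e-5):
    """(d<H>/dT)_H by finite differences, and Var(H)/(k T^2), cf. (6.1.26b)."""
    derivative = (canonical_averages(T + dT, H)[2]
                  - canonical_averages(T - dT, H)[2]) / (2.0 * dT)
    fluctuation = canonical_averages(T, H)[3] / (K * T * T)
    return derivative, fluctuation

if __name__ == "__main__":
    print("chi_T (derivative, fluctuation):", chi_T_both_ways(0.8, 0.4))
    print("C_H   (derivative, fluctuation):", C_H_both_ways(0.8, 0.4))
```

Since both quantities are written as variances, their non-negativity is manifest, which is the point of (6.1.24a,b).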

6.1.2.4 Internal Energy

$E \equiv \langle\mathcal{H}\rangle$ is the internal energy, which is found in a natural manner from statistical mechanics. It contains the energy of the material including the effects of the electromagnetic field, but not the field energy itself. It is customary to introduce also a second internal energy, which we denote by $U$ and which is defined as

$$U = E + \mathbf{M}\cdot\mathbf{H}; \tag{6.1.27a}$$

it thus has the complete differential

$$dU = T\,dS + \mathbf{H}\cdot d\mathbf{M}. \tag{6.1.27b}$$

From this, we derive

$$T = \left(\frac{\partial U}{\partial S}\right)_M, \qquad H = \left(\frac{\partial U}{\partial M}\right)_S \tag{6.1.27c}$$

and the Maxwell relation

$$\left(\frac{\partial T}{\partial M}\right)_S = \left(\frac{\partial H}{\partial S}\right)_M. \tag{6.1.28}$$

⁹ See also R. B. Griffiths, J. Math. Phys. 5, 1215 (1964). In fact, it is sufficient for the proof of (6.1.24a) to show that $\boldsymbol{\mu}$ enters $\mathcal{H}$ linearly. Cf. M. E. Fisher, Rep. Progr. Phys. XXX, 615 (1967), p. 644.


Remarks:

(i) As was emphasized in footnote 6, throughout this chapter the particle number and the volume are treated as fixed. In the case of variable volume and variable particle number, the generalization of the First Law takes on the form

$$dU = T\,dS - P\,dV + \mu\,dN + \mathbf{H}\cdot d\mathbf{M} \tag{6.1.29}$$

and, correspondingly,

$$dE = T\,dS - P\,dV + \mu\,dN - \mathbf{M}\cdot d\mathbf{H}. \tag{6.1.30}$$

The grand potential

$$\Phi(T, V, \mu, \mathbf{H}) = -kT\log{\rm Tr}\,e^{-\beta(\mathcal{H} - \mu N)} \tag{6.1.31a}$$

then has the differential

$$d\Phi = -S\,dT - P\,dV - N\,d\mu - \mathbf{M}\cdot d\mathbf{H}, \tag{6.1.31b}$$

where the chemical potential $\mu$ is not to be confused with the microscopic magnetic moment $\boldsymbol{\mu}$.

(ii) We note that the free energies of the crystalline solid are not rotationally invariant, but instead are invariant only with respect to rotations of the corresponding point group. Therefore, the susceptibility $\chi_{ij} = \partial M_i/\partial H_j$ is a second-rank tensor. In this textbook, we present the essential statistical methods, but we forgo a discussion of the details of solid-state physics or element-specific aspects. The methods presented here should permit the reader to master the complications which arise in treating real, individual problems.

6.1.3 Supplementary Remarks

(i) The Bohr–van Leeuwen Theorem. The content of the Bohr–van Leeuwen theorem is the nonexistence of magnetism in classical statistics. The classical partition function for a charged particle in the electromagnetic field is given by

$$Z_{\rm cl} = \int\frac{d^{3N}p\;d^{3N}x}{(2\pi\hbar)^{3N}N!}\;e^{-\mathcal{H}\left(\{\mathbf{p}_i - \frac{e}{c}\mathbf{A}(\mathbf{x}_i)\},\{\mathbf{x}_i\}\right)/kT}. \tag{6.1.32}$$

Making the substitution $\mathbf{p}'_i = \mathbf{p}_i - \frac{e}{c}\mathbf{A}(\mathbf{x}_i)$, we can see that $Z_{\rm cl}$ becomes independent of $\mathbf{A}$ and thus also of $\mathbf{H}$. Then we have $\mathbf{M} = -\frac{\partial F}{\partial\mathbf{H}} = 0$, and $\chi = -\frac{1}{V}\frac{\partial^2 F}{\partial H^2} = 0$. Since the spin is also a quantum-mechanical phenomenon, dia-, para-, and ferromagnetism are likewise quantum phenomena. One might

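ask how this statement can be reconciled with the ‘classical’ Langevin paramagnetism which will be discussed below. In the latter, a large but fixed value of the angular momentum is assumed, so that a non-classical feature is introduced into the theory. In classical physics, angular momenta, atomic radii, etc. vary continuously and without limits.¹⁰

(ii) Here, we append the simple intermediate computations leading to (6.1.3)–(6.1.5). In (6.1.3), we need to evaluate $-c\,\delta\mathcal{H}/\delta\mathbf{A}(\mathbf{x})$. The first term in (6.1.1) evidently leads to the first term in (6.1.3). In the component $j_\alpha$ of the current, taking the derivative of the second term leads to

$$c\,\frac{\delta}{\delta A_\alpha(\mathbf{x})}\sum_{i=1}^{N}\boldsymbol{\mu}_i^{\rm spin}\cdot{\rm curl}\,\mathbf{A}(\mathbf{x}_i) = c\sum_{i=1}^{N}\mu^{\rm spin}_{i\beta}\,\epsilon_{\beta\gamma\delta}\,\frac{\delta}{\delta A_\alpha(\mathbf{x})}\,\frac{\partial}{\partial x_{i\gamma}}A_\delta(\mathbf{x}_i)$$

$$= c\sum_{i=1}^{N}\mu^{\rm spin}_{i\beta}\,\epsilon_{\beta\gamma\delta}\,\frac{\partial}{\partial x_{i\gamma}}\,\delta(\mathbf{x} - \mathbf{x}_i)\,\delta_{\alpha\delta} = c\left({\rm curl}\left[\sum_{i=1}^{N}\boldsymbol{\mu}_i^{\rm spin}\,\delta(\mathbf{x} - \mathbf{x}_i)\right]\right)_\alpha.$$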

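Pairs of Greek indices imply a summation. Since the derivative of the third term in (6.1.1) yields zero, we have demonstrated (6.1.3).

(iii) In (6.1.4), the first term is obtained in a readily-apparent manner from the first term in (6.1.3). For the second term, we carry out an integration by parts and use $\partial_\delta x_\beta = \delta_{\delta\beta}$, obtaining

$$\frac{1}{2}\sum_{i=1}^{N}\left(\int d^3x\;\mathbf{x}\times{\rm curl}\left[\boldsymbol{\mu}_i^{\rm spin}\,\delta(\mathbf{x} - \mathbf{x}_i)\right]\right)_\alpha = \frac{1}{2}\sum_{i=1}^{N}\int d^3x\;\epsilon_{\alpha\beta\gamma}\,x_\beta\,\epsilon_{\gamma\delta\rho}\,\partial_\delta\left[\mu^{\rm spin}_{i\rho}\,\delta(\mathbf{x} - \mathbf{x}_i)\right]$$

$$= -\frac{1}{2}\sum_{i=1}^{N}\int d^3x\;\epsilon_{\alpha\beta\gamma}\,\epsilon_{\gamma\delta\rho}\,\delta_{\delta\beta}\,\mu^{\rm spin}_{i\rho}\,\delta(\mathbf{x} - \mathbf{x}_i) = -\frac{1}{2}\sum_{i=1}^{N}\int d^3x\;(-2\delta_{\alpha\rho})\,\mu^{\rm spin}_{i\rho}\,\delta(\mathbf{x} - \mathbf{x}_i) = \sum_{i=1}^{N}\mu^{\rm spin}_{i\alpha},$$

with which we have demonstrated (6.1.4).

(iv) Finally, we show the validity of (6.1.5). We can write the vector potential of a uniform magnetic field in the form $\mathbf{A} = \frac{1}{2}\mathbf{H}\times\mathbf{x}$, since ${\rm curl}\,\mathbf{A} = \frac{1}{2}\left(\mathbf{H}\,(\nabla\cdot\mathbf{x}) - (\mathbf{H}\cdot\nabla)\,\mathbf{x}\right)$ yields $\mathbf{H}$. To obtain the derivative, we use $\frac{1}{2}\epsilon_{\sigma\alpha\tau}x_{i\tau}$ for the derivative of $A_\sigma(\mathbf{x}_i)$ with respect to $H_\alpha$ after the second equals sign below, finding

$$-\frac{\partial\mathcal{H}}{\partial H_\alpha} = -\sum_{i=1}^{N}\frac{2}{2m}\left(\mathbf{p}_i - \frac{e}{c}\mathbf{A}\right)_\sigma\left(-\frac{e}{c}\right)\frac{\partial}{\partial H_\alpha}\,\frac{1}{2}\,\epsilon_{\sigma\rho\tau}H_\rho\,x_{i\tau} + \sum_{i=1}^{N}\mu^{\rm spin}_{i\alpha} = \sum_{i=1}^{N}\left\{\frac{e}{2mc}\left(\mathbf{x}_i\times\left(\mathbf{p}_i - \frac{e}{c}\mathbf{A}\right)\right)_\alpha + \mu^{\rm spin}_{i\alpha}\right\}, \tag{6.1.33}$$

which is in fact the right-hand side of (6.1.4).

¹⁰ A detailed discussion of this theorem and the original literature citations are to be found in J. H. van Vleck, The Theory of Electric and Magnetic Susceptibilities, Oxford University Press, 1932.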

In the Hamiltonian (6.1.1), $W_{\rm Coul}$ contains the mutual Coulomb interaction of the electrons and their interactions with the nuclei. The thermodynamic relations derived in Sect. 6.1.2 are thus generally valid; in particular, they apply to ferromagnets, since there the decisive exchange interaction is merely a consequence of the Coulomb interactions together with Fermi–Dirac statistics.

In addition to the interactions included in (6.1.1), there are also the magnetic dipole interaction between magnetic moments and the spin-orbit interaction,¹¹ which lead among other things to anisotropy effects. The derived thermodynamic relations also hold for these more general cases; the susceptibilities and specific heats then become shape-dependent owing to the long-range dipole interactions. In Sect. 6.6, we will take up the effects of the dipole interactions in more detail. For ellipsoidal samples, the internal magnetic field is uniform, $\mathbf{H}_i = \mathbf{H} - D\mathbf{M}$, where $D$ is the demagnetizing tensor (or simply the appropriate demagnetizing factor, if the field is applied along one of the principal axes). We will see that instead of the susceptibility with respect to the external field $\mathbf{H}$, one can employ the susceptibility with respect to the macroscopic internal field, and that this susceptibility is shape-independent.¹²

In the following four sections, which deal with basic statistical-mechanical aspects, we leave the dipole interactions out of consideration; this is indeed quantitatively justified in many situations. In the next two sections, 6.2 and 6.3, we deal with the magnetic properties of noninteracting atoms and ions; these can be situated within solids. The angular momentum quantum numbers of individual atoms in their ground states are determined by Hund’s rules.¹³

6.2 The Diamagnetism of Atoms

We consider atoms or ions with closed electronic shells, such as for example helium and the other noble gases or the alkali halides. In this case, the quantum numbers of the orbital angular momentum and the total spin in the ground state are zero, $S = 0$ and $L = 0$, and as a result the total angular momentum $\mathbf{J} = \mathbf{L} + \mathbf{S}$ is also $J = 0$.¹⁴ Therefore, we have $\mathbf{L}\,|0\rangle = \mathbf{S}\,|0\rangle = \mathbf{J}\,|0\rangle = 0$, where $|0\rangle$ designates the ground state. The paramagnetic contribution to the Hamiltonian (6.1.7) thus vanishes in every order of perturbation theory. It suffices to treat the remaining diamagnetic term in (6.1.7) in first-order perturbation theory, since all the excited states lie at much higher energies. Owing to the rotational symmetry of the wavefunctions of closed shells, we find $\langle 0|\sum_i(x_i^2 + y_i^2)|0\rangle = \frac{2}{3}\langle 0|\sum_i r_i^2|0\rangle$ and, for the energy shift of the ground state,

$$E_1 = \frac{e^2H^2}{12mc^2}\,\langle 0|\sum_i r_i^2\,|0\rangle. \tag{6.2.1}$$

From this it follows for the magnetic moment and the susceptibility of a single atom:

$$\mu_z = -\frac{\partial E_1}{\partial H} = -\frac{e^2\,\langle 0|\sum_i r_i^2|0\rangle}{6mc^2}\,H, \qquad \chi = \frac{\partial\mu_z}{\partial H} = -\frac{e^2\,\langle 0|\sum_i r_i^2|0\rangle}{6mc^2}, \tag{6.2.2}$$

where the sums run over all the electrons in the atom. The magnetic moment is directed oppositely to the applied field and the susceptibility is negative. We can estimate the magnitude of this so-called Langevin diamagnetism using the Bohr radius:

$$\chi = -\frac{25\times 10^{-20}\times 10^{-16}}{6\times 10^{-27}\times 10^{21}}\;{\rm cm}^3 \approx -5\times 10^{-30}\;{\rm cm}^3,$$

$$\chi\ \text{per mole} = -5\times 10^{-30}\times 6\times 10^{23}\;\frac{{\rm cm}^3}{\rm mole} \approx -3\times 10^{-6}\;\frac{{\rm cm}^3}{\rm mole}.$$

¹¹ The spin-orbit interaction $\propto\mathbf{L}\cdot\mathbf{S}$ leads in effective spin models to anisotropic interactions. The orbital angular momentum is influenced by the crystal field of the lattice, transferring the anisotropy of the lattice to the spin.
¹² For non-ellipsoidal samples, the magnetization is not uniform. In this case, $\partial M/\partial H$ depends on position within the sample and has only a local significance. It is then expedient to introduce a total susceptibility $\chi^{\rm tot}_{T,S} = \left(\frac{\partial M}{\partial H}\right)_{T,S}$ (with $M$ here the total magnetic moment), which differs from (6.1.33) in the homogeneous case only by a factor of $V$.
¹³ See e.g. QM I, Chap. 13 and Table I.12.
¹⁴ The diamagnetic contribution is also present in other atoms, but in the magnetic fields which are available in the laboratory, it is negligible compared to the paramagnetic contribution.

The experimental values of the molar susceptibility of the noble gases are collected in Table 6.1.

Table 6.1. Molar susceptibilities of the noble gases

                          He      Ne      Ar      Kr      Xe
χ in 10⁻⁶ cm³/mole      −1.9    −7.2    −15.4   −28.0   −43.0
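The order-of-magnitude estimate following Eq. (6.2.2) can be repeated with less drastically rounded constants. The sketch below evaluates (6.2.2) in Gaussian units for an assumed value of $\langle 0|\sum_i r_i^2|0\rangle$ and, conversely, reads off the value of this matrix element that each entry of Table 6.1 would imply if (6.2.2) held exactly. The constants are standard values rounded to four digits; the function names are illustrative.

```python
# Gaussian (CGS) constants, rounded to four significant figures.
E_CHARGE = 4.803e-10     # elementary charge in esu
M_ELECTRON = 9.109e-28   # electron mass in g
C_LIGHT = 2.998e10       # speed of light in cm/s
N_AVOGADRO = 6.022e23    # particles per mole
A_BOHR = 0.5292e-8       # Bohr radius in cm

# Experimental molar susceptibilities from Table 6.1, in cm^3/mole.
TABLE_6_1 = {"He": -1.9e-6, "Ne": -7.2e-6, "Ar": -15.4e-6,
             "Kr": -28.0e-6, "Xe": -43.0e-6}

def chi_atom(r2_sum):
    """Susceptibility of one atom, Eq. (6.2.2); r2_sum = <0|sum_i r_i^2|0> in cm^2."""
    return -E_CHARGE ** 2 * r2_sum / (6.0 * M_ELECTRON * C_LIGHT ** 2)

def chi_molar(r2_sum):
    """Molar susceptibility in cm^3/mole."""
    return N_AVOGADRO * chi_atom(r2_sum)

def r2_sum_from_chi_molar(chi_mole):
    """Invert (6.2.2): matrix element implied by a given molar susceptibility."""
    return -chi_mole * 6.0 * M_ELECTRON * C_LIGHT ** 2 / (N_AVOGADRO * E_CHARGE ** 2)

if __name__ == "__main__":
    # The text's estimate corresponds to r^2 of order 1e-16 cm^2.
    print("chi per mole for r2_sum = 1e-16 cm^2:", chi_molar(1e-16))
    for gas, chi in TABLE_6_1.items():
        r2 = r2_sum_from_chi_molar(chi)
        print(gas, "implied sum <r^2> in units of a_B^2:", r2 / A_BOHR ** 2)
```

For helium the implied $\sum_i\langle r_i^2\rangle$ comes out at a few squared Bohr radii, so the simple estimate with atomic length scales captures the observed order of magnitude.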

An intuitively apparent interpretation of this diamagnetic susceptibility runs as follows: the field $H$ induces an additional current $\Delta j = -er\Delta\omega$, whereby the orbital frequency of the electronic motion increases by the Larmor frequency $\Delta\omega = \frac{eH}{2mc}$. The sign of this change corresponds to Lenz’s law, so that both the magnetic moment $\mu_z$ and the induced magnetic field are opposite to the applied field $H$:

$$\mu_z \sim \frac{r\,\Delta j}{2c} \sim -\frac{r^2\Delta\omega\,e}{2c} \sim -\frac{e^2r^2H}{4mc^2}.$$

We also note that the result (6.2.2) is proportional to the square of the Bohr radius and therefore to the fourth power of $\hbar$, confirming the quantum nature of magnetic phenomena.


6.3 The Paramagnetism of Non-coupled Magnetic Moments

Atoms and ions with an odd number of electrons, e.g. Na, as well as atoms and ions with partially filled inner shells, e.g. Mn²⁺, Gd³⁺, or U⁴⁺ (transition elements, ions which are isoelectronic with transition elements, rare-earth and actinide elements) have nonvanishing magnetic moments even when $H = 0$,

$$\boldsymbol{\mu} = \frac{e}{2mc}\left(\mathbf{L} + g_e\mathbf{S}\right) = \frac{e}{2mc}\left(\mathbf{J} + \mathbf{S}\right) \qquad (g_e = 2). \tag{6.3.1}$$

Here, $\mathbf{J} = \mathbf{L} + \mathbf{S}$ is the total angular momentum operator. For relatively low external magnetic fields (i.e. $e\hbar H/mc \ll$ spin-orbit coupling) with $\mathbf{H}$ applied along the $z$-axis, the theory of the Zeeman effect¹⁵ gives the energy-level shifts

$$\Delta E_{M_J} = g\mu_BM_JH, \tag{6.3.2}$$

where $M_J$ runs over the values $M_J = -J, \ldots, J$¹⁶ and the Landé factor

$$g = 1 + \frac{J(J+1) + S(S+1) - L(L+1)}{2J(J+1)} \tag{6.3.3}$$

was used. Familiar special cases are $L = 0$: $g = 2$, $M_J \equiv M_S = \pm\frac{1}{2}$ and $S = 0$: $g = 1$, $M_J \equiv M_L = -L, \ldots, L$. The Landé factor can be made plausible in the classical picture where $\mathbf{L}$ and $\mathbf{S}$ precess independently around the spatially fixed direction of the constant of the motion $\mathbf{J}$. Then we find:

$$(\mathbf{L} + 2\mathbf{S})_z = \frac{\mathbf{J}\cdot(\mathbf{L} + 2\mathbf{S})}{|\mathbf{J}|}\,\frac{J_z}{|\mathbf{J}|} = J_z\,\frac{\mathbf{J}^2 + \mathbf{J}\cdot\mathbf{S}}{\mathbf{J}^2} = J_z\left(1 + \frac{\mathbf{S}^2 + \frac{1}{2}\left(\mathbf{J}^2 - \mathbf{L}^2 - \mathbf{S}^2\right)}{\mathbf{J}^2}\right).$$
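The Landé factor (6.3.3) is easy to tabulate. The following minimal sketch evaluates it with exact rational arithmetic and reproduces the two special cases quoted above; the function name `lande_g` is illustrative.

```python
from fractions import Fraction

def lande_g(J, L, S):
    """Lande factor (6.3.3) for quantum numbers J, L, S (integers or Fractions)."""
    J, L, S = Fraction(J), Fraction(L), Fraction(S)
    if J == 0:
        raise ValueError("the Lande factor is undefined for J = 0")
    return 1 + (J * (J + 1) + S * (S + 1) - L * (L + 1)) / (2 * J * (J + 1))

if __name__ == "__main__":
    half = Fraction(1, 2)
    print("L = 0, S = J = 1/2      :", lande_g(half, 0, half))        # pure spin
    print("S = 0, L = J = 2        :", lande_g(2, 2, 0))              # pure orbital
    print("L = 1, S = 1/2, J = 3/2 :", lande_g(Fraction(3, 2), 1, half))
```

Using `Fraction` avoids rounding, so the pure-spin and pure-orbital cases come out exactly as 2 and 1.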

The partition function then becomes

$$Z = \left(\sum_{m=-J}^{J}e^{-\eta m}\right)^{N} = \left(\frac{\sinh\eta\,(2J+1)/2}{\sinh\eta/2}\right)^{N}, \tag{6.3.4}$$

with the abbreviation

$$\eta = \frac{g\mu_BH}{kT}. \tag{6.3.5}$$

Here, we have used the fact that

$$\sum_{m=-J}^{J}e^{-\eta m} = e^{-\eta J}\sum_{r=0}^{2J}e^{\eta r} = e^{-\eta J}\,\frac{e^{\eta(2J+1)} - 1}{e^{\eta} - 1} = \frac{\sinh\eta\,(2J+1)/2}{\sinh\eta/2}.$$

For the free energy, we find from (6.3.4)

$$F(T, H) = -kTN\log\left[\frac{\sinh\eta\,(2J+1)/2}{\sinh\eta/2}\right], \tag{6.3.6}$$

from which we obtain the magnetization

$$M = -\frac{1}{V}\frac{\partial F}{\partial H} = n\,g\mu_B\,J\,B_J(\eta) \tag{6.3.7}$$

($n = N/V$). The magnetization is oriented parallel to the magnetic field $\mathbf{H}$. In Eq. (6.3.7) we have introduced the Brillouin function $B_J$, which is defined as

$$B_J(\eta) = \frac{1}{J}\left[\left(J + \frac{1}{2}\right)\coth\eta\left(J + \frac{1}{2}\right) - \frac{1}{2}\coth\frac{\eta}{2}\right] \tag{6.3.8}$$

(Fig. 6.1). We now consider the asymptotic limiting cases:

$$\eta \to 0:\qquad \coth\eta = \frac{1}{\eta} + \frac{\eta}{3} + O(\eta^3), \qquad B_J(\eta) = \frac{J+1}{3}\,\eta + O(\eta^3) \tag{6.3.9a}$$

and

$$\eta \to \infty:\qquad B_J(\infty) = 1. \tag{6.3.9b}$$

Fig. 6.1. The Brillouin function for $J = 1/2, 1, 3/2, 2, \infty$ as a function of $x = \frac{g\mu_BJH}{kT} = \eta J$. For classical moments, $B_\infty$ is identical to the Langevin function.

¹⁵ Cf. e.g. QM I, Sect. 14.2.
¹⁶ $\mathbf{J} = \mathbf{L} + \mathbf{S}$, $J_z\,|m_j\rangle = \hbar\,m_j\,|m_j\rangle$.
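The Brillouin function (6.3.8) and its limiting cases (6.3.9a,b) can be checked with a few lines of code. The sketch below also confirms numerically that $J\,B_J(\eta)$ is the $\eta$-derivative of the logarithm of the single-moment partition sum in (6.3.4), which is the step leading from (6.3.6) to (6.3.7). Function names and the chosen test values are illustrative.

```python
import math

def brillouin(J, eta):
    """Brillouin function B_J(eta), Eq. (6.3.8); requires eta != 0."""
    a = J + 0.5
    return (a / math.tanh(a * eta) - 0.5 / math.tanh(0.5 * eta)) / J

def log_single_moment_Z(J, eta):
    """log of sum_{m=-J}^{J} exp(-eta m), summed term by term."""
    n_levels = int(round(2 * J)) + 1
    return math.log(sum(math.exp(-eta * (-J + r)) for r in range(n_levels)))

def dlogZ_deta(J, eta, step=1e-6):
    """Central finite difference of log Z_1 with respect to eta."""
    return (log_single_moment_Z(J, eta + step)
            - log_single_moment_Z(J, eta - step)) / (2.0 * step)

if __name__ == "__main__":
    J, eta = 1.5, 0.8
    print("J * B_J(eta)        :", J * brillouin(J, eta))
    print("d(log Z_1)/d(eta)   :", dlogZ_deta(J, eta))
    print("B_J at small eta    :", brillouin(2.0, 1e-3), "vs", (2.0 + 1.0) / 3.0 * 1e-3)
    print("B_J at large eta    :", brillouin(2.0, 50.0))
```

For very small `eta` the two `coth` terms nearly cancel, so in production code one would switch to the series (6.3.9a) there; for the values used here double precision is sufficient.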


Inserting (6.3.9a) into (6.3.7), we obtain for low applied fields ($H \ll kT/Jg\mu_B$)

$$M = n\,(g\mu_B)^2\,\frac{J(J+1)H}{3kT}, \tag{6.3.10a}$$

while from (6.3.9b), for high fields ($H \gg kT/Jg\mu_B$), we find

$$M = n\,g\mu_B\,J. \tag{6.3.10b}$$

This signifies complete alignment (saturation) of the magnetic moments. An important special case is represented by spin-$\frac{1}{2}$ systems. Setting $J = \frac{1}{2}$ in (6.3.8), we find

$$B_{1/2}(\eta) = 2\coth\eta - \coth\frac{\eta}{2} = \frac{\cosh^2\frac{\eta}{2} + \sinh^2\frac{\eta}{2}}{\sinh\frac{\eta}{2}\cosh\frac{\eta}{2}} - \frac{\cosh\frac{\eta}{2}}{\sinh\frac{\eta}{2}} = \tanh\frac{\eta}{2}. \tag{6.3.11}$$

This result can be more directly obtained by using the fact that for spin $S = 1/2$, the partition function of a spin is given by $Z = 2\cosh\eta/2$ and the average value of the magnetization by $M = n\,g\mu_B\,Z^{-1}\sinh\eta/2$. Letting $J = \infty$, while at the same time $g\mu_B \to 0$, so that $\mu = g\mu_BJ$ remains finite, we find

$$B_\infty(\eta) = \coth\eta J - \frac{1}{\eta J} = \coth\frac{\mu H}{kT} - \frac{kT}{\mu H}. \tag{6.3.12a}$$

$B_\infty(\eta)$ is called the Langevin function for classical magnetic moments $\mu$; together with (6.3.7), it determines the magnetization

$$M = n\mu\left(\coth\frac{\mu H}{kT} - \frac{kT}{\mu H}\right) \tag{6.3.12b}$$

of “classical” magnetic moments of magnitude $\mu$. A classical magnetic moment $\boldsymbol{\mu}$ can be oriented in any direction in space; its energy is $E = -\mu H\cos\vartheta$, where $\vartheta$ is the angle between the field $\mathbf{H}$ and the magnetic moment $\boldsymbol{\mu}$. The classical partition function for one particle is $Z = \int d\Omega\;e^{-E/kT}$ and leads via (6.1.9b) once again to (6.3.12b). Finally, for the susceptibility we obtain

$$\chi = n\,(g\mu_B)^2\,\frac{J}{kT}\,B'_J(\eta). \tag{6.3.13}$$

In small magnetic fields $H \ll kT/Jg\mu_B$, this gives the Curie law

$$\chi_{\rm Curie} = n\,(g\mu_B)^2\,\frac{J(J+1)}{3kT}. \tag{6.3.14}$$
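The classical limit (6.3.12a) and the Curie law (6.3.14) can both be checked numerically. The sketch below (illustrative names, dimensionless units) compares $B_J(x/J)$ for a large $J$ with the Langevin function, evaluates $\langle\cos\vartheta\rangle$ directly from the classical partition function $Z = \int d\Omega\,e^{-E/kT}$ with a midpoint rule, and extracts the small-field slope of $J\,B_J(\eta)$, which should equal $J(J+1)/3$.

```python
import math

def brillouin(J, eta):
    """Brillouin function B_J(eta), Eq. (6.3.8); requires eta != 0."""
    a = J + 0.5
    return (a / math.tanh(a * eta) - 0.5 / math.tanh(0.5 * eta)) / J

def langevin(x):
    """Langevin function coth(x) - 1/x, Eq. (6.3.12a), with x = mu H / kT."""
    return 1.0 / math.tanh(x) - 1.0 / x

def classical_mean_cos(x, n=20000):
    """<cos(theta)> for E = -mu H cos(theta), by a midpoint rule over theta."""
    numerator = denominator = 0.0
    d_theta = math.pi / n
    for i in range(n):
        theta = (i + 0.5) * d_theta
        weight = math.exp(x * math.cos(theta)) * math.sin(theta)  # solid-angle measure
        numerator += weight * math.cos(theta)
        denominator += weight
    return numerator / denominator

def curie_slope(J, eta=1e-3):
    """Small-field slope of J * B_J(eta); the Curie law predicts J (J + 1) / 3."""
    return J * brillouin(J, eta) / eta

if __name__ == "__main__":
    x = 2.0
    print("Langevin L(x)              :", langevin(x))
    print("B_J(x/J) for J = 1000      :", brillouin(1000.0, x / 1000.0))
    print("classical <cos(theta)>     :", classical_mean_cos(x))
    print("Curie slope, J = 7/2       :", curie_slope(3.5), "vs", 3.5 * 4.5 / 3.0)
```

The deviation of $B_J(x/J)$ from the Langevin function decreases as $1/J$, which illustrates why a large but fixed angular momentum behaves almost classically.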

The magnetic behavior of non-coupled moments characterized by (6.3.7), (6.3.13), and (6.3.14) is termed paramagnetism. The Curie law is typical of preexisting elementary magnetic moments which need only be oriented by the applied field, in contrast to the polarization of harmonic oscillators, whose moments are induced by the field (cf. problem 6.4). We include a remark about the magnitudes. The diamagnetic susceptibility per mole, from the estimate which follows Eq. (6.2.2), is equal to about $\chi_{\rm mole} \approx -10^{-5}$ cm³/mole. The paramagnetic susceptibility at room temperature is roughly 500 times larger, i.e. $\chi_{\rm mole} \approx 10^{-2}$–$10^{-3}$ cm³/mole. The entropy of a paramagnet is

$$S = -\left(\frac{\partial F}{\partial T}\right)_H = Nk\left\{\log\left[\frac{\sinh\frac{\eta(2J+1)}{2}}{\sinh\frac{\eta}{2}}\right] - \eta J\,B_J(\eta)\right\}. \tag{6.3.15}$$

For spin $\frac{1}{2}$, (6.3.15) simplifies to

$$S = Nk\left[\log\left(2\cosh\frac{\mu_BH}{kT}\right) - \frac{\mu_BH}{kT}\tanh\frac{\mu_BH}{kT}\right] \tag{6.3.16}$$

with the limiting case

$$S = Nk\log 2 \quad\text{for}\quad H \to 0. \tag{6.3.16′}$$

Fig. 6.2. The entropy, the internal energy, and the specific heat of a spin-$\frac{1}{2}$ paramagnet

The entropy, the internal energy, and the speciﬁc heat of the paramagnet are reproduced in Figs. 6.2a,b,c. The bump in the speciﬁc heat is typical of 2-level systems and is called a Schottky anomaly in connection with defects.
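The curves of Fig. 6.2 are simple to generate. The sketch below evaluates the entropy (6.3.16) together with the internal energy and specific heat that follow from it for a spin-$\frac{1}{2}$ paramagnet, in terms of the dimensionless variable $x = \mu_BH/kT$. The expressions $E = -N\mu_BH\tanh x$ and $C_H = Nk\,x^2/\cosh^2x$ are not written out in the text; they are obtained here from $E = F + TS$ and $C_H = (\partial E/\partial T)_H$. The maximum of $C_H$ (the Schottky bump) is located by a simple grid scan.

```python
import math

def entropy_per_Nk(x):
    """S / (N k) from Eq. (6.3.16), with x = mu_B H / kT."""
    return math.log(2.0 * math.cosh(x)) - x * math.tanh(x)

def energy_per_N_muB_H(x):
    """E / (N mu_B H) = -tanh(x), obtained from E = F + T S."""
    return -math.tanh(x)

def specific_heat_per_Nk(x):
    """C_H / (N k) = x^2 / cosh^2(x), obtained from C_H = (dE/dT)_H."""
    return (x / math.cosh(x)) ** 2

def schottky_maximum(x_lo=0.01, x_hi=5.0, n=100000):
    """Grid search for the maximum of C_H / (N k); returns (x_max, C_max)."""
    grid = (x_lo + (x_hi - x_lo) * i / n for i in range(n + 1))
    x_max = max(grid, key=specific_heat_per_Nk)
    return x_max, specific_heat_per_Nk(x_max)

if __name__ == "__main__":
    print("S/Nk for H -> 0 :", entropy_per_Nk(1e-8), "vs log 2 =", math.log(2.0))
    x_max, c_max = schottky_maximum()
    print("Schottky maximum at mu_B H / kT =", x_max, "with C_H / Nk =", c_max)
```

The maximum lies where the thermal energy is comparable to the level splitting $2\mu_BH$, which is why such a bump is characteristic of two-level systems.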


Van Vleck paramagnetism: The quantum number of the total angular momentum also becomes zero, $J = 0$, when a shell has just one electron less than half full. In this case, according to Eq. (6.3.2) we have indeed $\langle 0|\,\mathbf{J} + \mathbf{S}\,|0\rangle = 0$, but the paramagnetic term in (6.1.7) yields a nonzero contribution in second-order perturbation theory. Together with the diamagnetic term, one obtains for the energy shift of the ground state

$$\Delta E_0 = -\sum_n\frac{|\langle 0|\,(\mathbf{L} + 2\mathbf{S})\cdot\mathbf{H}\,|n\rangle|^2}{E_n - E_0} + \frac{e^2H^2}{8mc^2}\,\langle 0|\sum_i(x_i^2 + y_i^2)\,|0\rangle. \tag{6.3.17}$$

The first, paramagnetic term, named for van Vleck¹⁷, which also plays a role in the magnetism of molecules¹⁸, competes with the diamagnetic term.

6.4 Pauli Spin Paramagnetism

We consider now a free, three-dimensional electron gas in a magnetic field and restrict ourselves initially to the coupling of the magnetic field to the electron spins. The energy eigenvalues are then given by Eq. (6.1.7):

$$\epsilon_{p\pm} = \frac{p^2}{2m} \pm \frac{1}{2}\,g_e\mu_BH. \tag{6.4.1}$$

The energy levels are split by the magnetic field. Electrons whose spins are aligned parallel to the field have higher energies, and these states are therefore less occupied (see Fig. 6.3).

Fig. 6.3. Orientation (a) of the spins, and (b) of the magnetic moments. (c) The energy as a function of $p$ (on the left for positive spins and on the right for negative spins)

¹⁷ J. H. van Vleck, The Theory of Electric and Magnetic Susceptibilities, Oxford University Press, 1932.
¹⁸ Ch. Kittel, Introduction to Solid State Physics, Third edition, John Wiley, New York, 1967.

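The number of electrons in the two states is found to be

$$N_\pm = \frac{V}{(2\pi\hbar)^3}\int d^3p\;n\!\left(\frac{p^2}{2m} \pm \frac{1}{2}g_e\mu_BH\right) = \frac{1}{2}\int_0^\infty d\epsilon\;\nu(\epsilon)\,n\!\left(\epsilon \pm \frac{1}{2}g_e\mu_BH\right), \tag{6.4.2}$$

where the density of states has been introduced:

$$\nu(\epsilon) = \frac{gV}{(2\pi\hbar)^3}\int d^3p\;\delta(\epsilon - \epsilon_p) = \frac{3}{2}\,N\,\frac{\epsilon^{1/2}}{\epsilon_F^{3/2}}; \tag{6.4.3}$$

it fulfills the normalization condition $\int_0^{\epsilon_F}d\epsilon\;\nu(\epsilon) = N$. In the case that $g_e\mu_BH \ll \mu \approx \epsilon_F$, we can expand in terms of $H$:

$$N_\pm = \frac{1}{2}\int_0^\infty d\epsilon\;\nu(\epsilon)\left[n(\epsilon) \pm n'(\epsilon)\,\frac{1}{2}g_e\mu_BH + O(H^2)\right]. \tag{6.4.4}$$

For the magnetization, using the above result we obtain:

$$M = -\mu_B\,(N_+ - N_-)/V = -\frac{\mu_B^2H}{V}\int_0^\infty d\epsilon\;\nu(\epsilon)\,n'(\epsilon) + O(H^3), \tag{6.4.5}$$

where we have set $g_e = 2$. For $T \to 0$, we find from (6.4.5) the magnetization

$$M = \mu_B^2\,\nu(\epsilon_F)\,H/V + O(H^3) = \frac{3}{2}\,\mu_B^2\,\frac{NH}{V\epsilon_F} + O(H^3) \tag{6.4.6}$$

and the magnetic susceptibility

$$\chi_P = \frac{3}{2}\,\mu_B^2\,\frac{N}{V\epsilon_F} + O(H^2). \tag{6.4.7}$$

This result describes the phenomenon of Pauli spin paramagnetism.

Supplementary remarks:

(i) For $T \ne 0$, we must take the change of the chemical potential into account, making use of the Sommerfeld expansion:

$$N = \int_0^\infty d\epsilon\;\nu(\epsilon)\,n(\epsilon) + O(H^2) = \int_0^\mu d\epsilon\;\nu(\epsilon) + \frac{\pi^2(kT)^2}{6}\,\nu'(\mu) + O(H^2, T^4)$$

$$= \int_0^{\epsilon_F}d\epsilon\;\nu(\epsilon) + (\mu - \epsilon_F)\,\nu(\epsilon_F) + \frac{\pi^2(kT)^2}{6}\,\nu'(\epsilon_F) + O(H^2, T^4). \tag{6.4.8}$$

Since the first term on the right-hand side is equal to $N$, we write

$$\mu - \epsilon_F = -\frac{\pi^2(kT)^2}{6}\,\frac{\nu'(\epsilon_F)}{\nu(\epsilon_F)} + O(H^2, T^4). \tag{6.4.9}$$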


Integrating by parts, we obtain from (6.4.5) and (6.4.9)

$$M = \frac{\mu_B^2H}{V}\int_0^\infty d\epsilon\;\nu'(\epsilon)\,n(\epsilon) + O(H^3) = \frac{\mu_B^2H}{V}\left[\nu(\mu) + \frac{\pi^2(kT)^2}{6}\,\nu''(\mu)\right] + O(H^3, T^4)$$

$$= \frac{\mu_B^2H}{V}\left[\nu(\epsilon_F) - \frac{\pi^2(kT)^2}{6}\left(\frac{\nu'(\epsilon_F)^2}{\nu(\epsilon_F)} - \nu''(\epsilon_F)\right)\right] + O(H^3, T^4). \tag{6.4.10}$$

(ii) The Pauli susceptibility (6.4.7) can be interpreted similarly to the linear specific heat of a Fermi gas (see Sect. 4.3.2):

$$\chi_P = \chi_{\rm Curie}\,\frac{\nu(\epsilon_F)\,kT}{N} = \mu_B^2\,\nu(\epsilon_F)/V. \tag{6.4.11}$$

Naively, one might expect that the susceptibility of $N$ electrons would be equal to the Curie susceptibility $\chi_{\rm Curie}$ from Eq. (6.3.14) and therefore would diverge as $1/T$. It was Pauli’s accomplishment to realize that not all of the electrons contribute, but instead only those near the Fermi energy. The number of thermally excitable electrons is $kT\,\nu(\epsilon_F)$.

(iii) The Landau quasiparticle interaction (see Sect. 4.3.3e, Eq. 4.3.29c) yields

$$\chi_P = \frac{\mu_B^2\,\nu(\epsilon_F)}{V\,(1 + F_a)}. \tag{6.4.12}$$

In this expression, $F_a$ is an antisymmetric combination of the interaction parameters.¹⁹

(iv) In addition to Pauli spin paramagnetism, the electronic orbital motions give rise to Landau diamagnetism²⁰

$$\chi_L = -\frac{e^2k_F}{12\pi^2mc^2}. \tag{6.4.13}$$

For a free electron gas, $\chi_L = -\frac{1}{3}\chi_P$. The lattice effects in a crystal have differing consequences for $\chi_L$ and $\chi_P$. Eq. (6.4.13) holds for free electrons neglecting the Zeeman term. The magnetic susceptibility for free spin-$\frac{1}{2}$ fermions is composed of three parts: it is the sum

$$\chi = \chi_P + \chi_L + \chi_{\rm Osc}.$$

¹⁹ D. Pines and Ph. Nozières, The Theory of Quantum Liquids Vol. I: Normal Fermi Liquids, W. A. Benjamin, New York 1966, p. 25.
²⁰ See e.g. D. Wagner, Introduction to the Theory of Magnetism, Pergamon Press, Oxford, 1972.

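$\chi_{\rm Osc}$ is an oscillatory part, which becomes important at high magnetic fields $H$ and is responsible for de Haas–van Alphen oscillations.

(v) Fig. 6.3c can also be read differently from the description given above. If one introduces the densities of states for spin $\pm\hbar/2$

$$\nu_\pm(\epsilon) = \frac{V}{(2\pi\hbar)^3}\int d^3p\;\delta(\epsilon - \epsilon_{p\pm}) = \frac{V}{(2\pi\hbar)^3}\int d^3p\;\delta\!\left(\epsilon - \frac{p^2}{2m} \mp \frac{1}{2}g_e\mu_BH\right)$$

$$= \frac{mV}{2\pi^2\hbar^3}\int_0^\infty dp\;p\;\Theta\!\left(\epsilon \mp \frac{1}{2}g_e\mu_BH\right)\delta\!\left(p - \sqrt{2m\left(\epsilon \mp \frac{1}{2}g_e\mu_BH\right)}\right)$$

$$= N\,\frac{3}{4\,\epsilon_F^{3/2}}\;\Theta\!\left(\epsilon \mp \frac{1}{2}g_e\mu_BH\right)\left(\epsilon \mp \frac{1}{2}g_e\mu_BH\right)^{1/2},$$

then the solid curves which are drawn on the left and the right also refer to $\nu_+(\epsilon)$ and $\nu_-(\epsilon)$.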
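Two of the results of this section can be checked numerically for the free-electron density of states (6.4.3). The sketch below works in units $\epsilon_F = 1$, $\mu_B = 1$, $N = V = 1$ and $g_e = 2$, so that $b = \mu_BH$ is the Zeeman shift. The first part evaluates (6.4.2) exactly at $T = 0$, fixes the chemical potential by particle-number conservation, and compares $M/H$ with (6.4.7). The second part determines the chemical potential at finite temperature and $H = 0$ by numerical integration and compares its shift with (6.4.9), which for $\nu \propto \epsilon^{1/2}$ reads $\mu - \epsilon_F \approx -\pi^2(kT)^2/12\epsilon_F$. All function names and the integration parameters are illustrative choices.

```python
import math

def bisect(func, lo, hi, iterations=60):
    """Root of an increasing function func on [lo, hi] by bisection."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if func(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def populations_T0(mu_chem, b):
    """N_+/N and N_-/N at T = 0 from (6.4.2) with nu(eps) = (3/2) eps^(1/2)."""
    n_plus = 0.5 * max(mu_chem - b, 0.0) ** 1.5    # levels shifted up by b
    n_minus = 0.5 * max(mu_chem + b, 0.0) ** 1.5   # levels shifted down by b
    return n_plus, n_minus

def pauli_susceptibility_T0(b):
    """chi in units of mu_B^2 N / (V eps_F); Eq. (6.4.7) predicts 3/2."""
    mu_chem = bisect(lambda mu: sum(populations_T0(mu, b)) - 1.0, 0.0, 2.0)
    n_plus, n_minus = populations_T0(mu_chem, b)
    return -(n_plus - n_minus) / b                 # M = -mu_B (N_+ - N_-) / V

def particle_number(mu_chem, kT, steps=2000):
    """N = int nu(eps) n(eps) d eps at H = 0, Simpson rule in x = sqrt(eps)."""
    x_max = math.sqrt(mu_chem + 40.0 * kT)         # Fermi function negligible beyond
    h = x_max / steps
    total = 0.0
    for i in range(steps + 1):
        x = i * h
        f = 3.0 * x * x / (math.exp((x * x - mu_chem) / kT) + 1.0)
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * f
    return total * h / 3.0

def chemical_potential_shift(kT):
    """mu - eps_F at fixed particle number (units eps_F = 1)."""
    return bisect(lambda mu: particle_number(mu, kT) - 1.0, 0.5, 1.5) - 1.0

if __name__ == "__main__":
    print("chi_P (T = 0), numerical :", pauli_susceptibility_T0(1e-3), "vs 1.5")
    kT = 0.05
    print("mu - eps_F, numerical    :", chemical_potential_shift(kT))
    print("mu - eps_F, Eq. (6.4.9)  :", -math.pi ** 2 * kT ** 2 / 12.0)
```

The remaining small difference in the chemical potential is of order $T^4$, consistent with the error term quoted in (6.4.9).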

6.5 Ferromagnetism

6.5.1 The Exchange Interaction

Ferromagnetism and antiferromagnetism are based on variations of the exchange interaction, which is a consequence of the Pauli principle and the Coulomb interaction (cf. the remark following Eq. (6.1.33)). In the simplest case of the exchange interaction of two electrons, two atoms or two molecules with the spins $\mathbf{S}_1$ and $\mathbf{S}_2$, the interaction has the form $\pm J\,\mathbf{S}_1\cdot\mathbf{S}_2$, where $J$ is a positive constant which depends on the distance between the spins. The exchange constant $\pm J$ is determined by the overlap integrals, containing the Coulomb interaction.²¹ When the exchange energy is negative,

$$E = -J\,\mathbf{S}_1\cdot\mathbf{S}_2, \tag{6.5.1a}$$

then a parallel spin orientation is favored. This leads in a solid to ferromagnetism (Fig. 6.4b); then below the Curie temperature $T_c$, a spontaneous magnetization occurs within the solid. When the exchange energy is positive,

$$E = J\,\mathbf{S}_1\cdot\mathbf{S}_2, \tag{6.5.1b}$$

then an antiparallel spin orientation is preferred. In a suitable lattice structure, this can lead to an antiferromagnetic state: below the Néel temperature $T_N$, an alternating (staggered) magnetic order occurs (Fig. 6.4c). Above the respective transition temperature ($T_c$ or $T_N$), a paramagnetic state occurs (Fig. 6.4a). The exchange interaction is, to be sure, short-ranged; but owing to its electrostatic origin it is in general considerably stronger than the dipole-dipole interaction. Examples of ferromagnetic materials are Fe, Ni, EuO; and typical antiferromagnetic materials are MnF₂ and RbMnF₃.

In the rest of this section, we turn to the situation described by equation (6.5.1a), i.e. to ferromagnetism, and return to (6.5.1b) only in the discussion of phase transitions. We now imagine that the magnetic ions are located on a simple cubic lattice with lattice constant $a$, and that a negative exchange interaction ($J > 0$) acts between them (Fig. 6.4a). The lattice sites are enumerated by the index $l$. The position of the $l$th ion is denoted by $\mathbf{x}_l$ and its spin is $\mathbf{S}_l$. All the pairwise interaction energies of the form (6.5.1a) contribute to the total Hamiltonian²²:

$$\mathcal{H} = -\frac{1}{2}\sum_{l,l'}J_{ll'}\,\mathbf{S}_l\cdot\mathbf{S}_{l'}. \tag{6.5.2}$$

Fig. 6.4. A crystal lattice of magnetic ions. The spin $\mathbf{S}_l$ is located at the position $\mathbf{x}_l$, and $l$ enumerates the lattice sites. (a) the paramagnetic state; (b) the ferromagnetic state; (c) the antiferromagnetic state.

²¹ See Chaps. 13 and 15, QM I.

Here, we have denoted the exchange interaction between the spins at the lattice sites $l$ and $l'$ by $J_{ll'}$. The sum runs over all $l$ and $l'$, whereby the factor 1/2 guarantees that each pair of spins is counted only once in (6.5.2). The exchange interaction obeys $J_{ll'} = J_{l'l}$, and we set $J_{ll} = 0$ so that we do not need to exclude the occurrence of the same $l$-values in the sum. The Hamiltonian (6.5.2) represents the Heisenberg model²³. Since only scalar products of spin vectors occur, it has the following important property: $\mathcal{H}$ is invariant with respect to a common rotation of all the spin vectors. No direction is especially distinguished and therefore the ferromagnetic order which can occur may point in any arbitrary direction. Which direction is in fact chosen by the system is determined by small anisotropy energies or by an external magnetic field. In many substances, this rotational invariance is nearly ideally realized, e.g. in EuO, EuS, Fe and in the antiferromagnet RbMnF₃. In other cases, the anisotropy of the crystal structure may have the effect that the magnetic moments orient in only two directions, e.g. along the positive and negative $z$-axis, instead of in an arbitrary spatial direction. This situation can be described by the Ising model

$$\mathcal{H} = -\frac{1}{2}\sum_{l,l'}J_{ll'}\,S_l^z\,S_{l'}^z. \tag{6.5.3}$$

This model is considerably simpler than the Heisenberg model (6.5.2), since the Hamiltonian is diagonal in the spin eigenstates of $S_l^z$. But even for (6.5.3), the evaluation of the partition function is in general not trivial. As we shall see, the one-dimensional Ising model can be solved exactly in an elementary way for an interaction restricted to the nearest neighbors. The solution of the two-dimensional model, i.e. the calculation of the partition function, requires special algebraic or graph-theoretical methods, and in three dimensions the model has yet to be solved exactly. When the lattice contains $N$ sites, then the partition function $Z = {\rm Tr}\,e^{-\beta\mathcal{H}}$ has contributions from all together $2^N$ configurations (every spin can take on the two values $\pm\hbar/2$ independently of all the others). A naive summation over all these configurations is possible even for the Ising model only in one dimension. In order to understand the essential physical effects which accompany ferromagnetism, in the next section we will apply the molecular field approximation. It can be carried out for all problems related to ordering. We will demonstrate it using the Ising model as an example.

6.5.2 The Molecular Field Approximation for the Ising Model

We consider the Hamiltonian of the Ising model in an external magnetic field

$$\mathcal{H} = -\frac{1}{2}\sum_{l,l'}J(l - l')\,\sigma_l\sigma_{l'} - h\sum_l\sigma_l. \tag{6.5.4}$$

²² In fact, there are also interactions within a solid between more than just two spins, which we however neglect here.
²³ The direct exchange described above occurs only when the moments are near enough so that their wavefunctions overlap. More frequently, one finds an indirect exchange, which couples more distant moments. The latter acts via an intermediate link, which can be a quasi-free electron in a metal or a bound electron in an insulator. The resulting interaction is called in the first case the RKKY (Ruderman, Kittel, Kasuya, Yosida) interaction and in the second, it is referred to as superexchange (see e.g. C. M. Hurd, Contemp. Phys. 23, 469 (1982)). Also in cases where direct exchange is not predominant and even for itinerant magnets (with 3d and 4s electrons which are not localized, but instead form bands), the magnetic phenomena, in particular their behavior near the phase transition, can be described using an effective Heisenberg model. A derivation of the Heisenberg model from the Hubbard model can be found in D. C. Mattis, The Theory of Magnetism, Harper and Row, New York, 1965.

In comparison to (6.5.3), equation (6.5.4) contains some changes of notation. Instead of the spin operators $S_l^z$, we have introduced the Pauli spin matrices $\sigma_l^z$ and use the eigenstates of the $\sigma_l^z$ as basis functions; their eigenvalues are $\sigma_l = \pm 1$ for every $l$. The Hamiltonian becomes simply a function of (commuting) numbers. By writing the exchange interaction in the form $J(l - l')$ ($J(l - l') = J(l' - l) = J_{ll'}\hbar^2/4$, $J(0) = 0$), we express the fact that the system is translationally invariant, i.e. $J(l - l')$ depends only on the distance between the lattice sites. The effect of an applied magnetic field is represented by the term $-h\sum_l\sigma_l$. The factor $-\frac{1}{2}g\mu_B$ has been combined with the magnetic field $H$ into $h = -\frac{1}{2}g\mu_BH$; the sign convention for $h$ is chosen so that the $\sigma_l$ are aligned parallel to it. Due to the translational invariance of the Hamiltonian, it proves to be expedient for later use to introduce the Fourier transform of the exchange coupling,

$$\tilde{J}(\mathbf{k}) = \sum_l J(l)\,e^{-i\mathbf{k}\cdot\mathbf{x}_l}. \tag{6.5.5}$$

Frequently, we will require $\tilde{J}(\mathbf{k})$ for small wavenumbers $\mathbf{k}$. Due to the finite range of $J(l - l')$, we can expand the exponential functions in (6.5.5):

$$\tilde{J}(\mathbf{k}) = \sum_l J(l) - \frac{1}{2}\sum_l(\mathbf{k}\cdot\mathbf{x}_l)^2\,J(l) + \ldots. \tag{6.5.5′}$$

For cubic and square lattices, and in general when reﬂection symmetry is present, the linear term in k makes no contribution. We can interpret the Hamiltonian (6.5.4) in the following manner: for some conﬁguration of all the spins σl , a local ﬁeld hl = h + J(l − l ) σl (6.5.6) l

acts on an arbitrarily chosen spin σl . If hl were a ﬁxed applied ﬁeld, we could immediately write down the partition function for the spin σl . Here, however, the ﬁeld hl depends on the conﬁguration of the spins and the value of σl itself enters into the local ﬁelds which act upon its neighbors. In order to avoid this diﬃculty by means of an approximation, we replace the local ﬁeld (6.5.6) by its average value, i.e. by the mean ﬁeld ˜ hl = h + J(l − l )σl = h + J(0)m . (6.5.7) l

In the second part of this equation, we have introduced the average value m = σl ,

(6.5.8)


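which is position-independent, owing to the translational invariance of the Hamiltonian; thus $m$ refers to the average magnetization per lattice site (per spin). Furthermore, we use the abbreviation
$$ \tilde J \equiv \tilde J(0) = \sum_l J(l) \tag{6.5.9} $$
for the Fourier transform at $k = 0$ (see (6.5.5')). Eq. (6.5.7) contains, in addition to the external field, the molecular field $\tilde J m$. The density matrix then has the simplified form
$$ \rho \propto \prod_l e^{\sigma_l (h + \tilde J m)/kT} . $$
Formally, we have reduced the problem to that of a paramagnet, where the molecular field must still be determined self-consistently from the magnetization (6.5.8).

We still want to derive the molecular field approximation, justified above with intuitive arguments, in a more formal manner. We start with an arbitrary interaction term in (6.5.4), $-J(l-l')\sigma_l\sigma_{l'}$, and rewrite it up to a prefactor as follows:
$$ \sigma_l\sigma_{l'} = \bigl(\langle\sigma_l\rangle + \sigma_l - \langle\sigma_l\rangle\bigr)\bigl(\langle\sigma_{l'}\rangle + \sigma_{l'} - \langle\sigma_{l'}\rangle\bigr) = \langle\sigma_l\rangle\langle\sigma_{l'}\rangle + \langle\sigma_l\rangle\bigl(\sigma_{l'} - \langle\sigma_{l'}\rangle\bigr) + \langle\sigma_{l'}\rangle\bigl(\sigma_l - \langle\sigma_l\rangle\bigr) + \bigl(\sigma_l - \langle\sigma_l\rangle\bigr)\bigl(\sigma_{l'} - \langle\sigma_{l'}\rangle\bigr) . \tag{6.5.10} $$
Here, we have ordered the terms in powers of the deviation from the mean value. We now neglect terms which are nonlinear in these fluctuations. This yields the following approximate replacements:
$$ \sigma_l\sigma_{l'} \to -\langle\sigma_l\rangle\langle\sigma_{l'}\rangle + \sigma_l\langle\sigma_{l'}\rangle + \langle\sigma_l\rangle\sigma_{l'} , \tag{6.5.10'} $$
which lead from (6.5.4) to the Hamiltonian in the molecular field approximation
$$ H_{\rm MFT} = \frac{1}{2} m^2 N \tilde J(0) - \sum_l \sigma_l \bigl(h + \tilde J(0)\, m\bigr) . \tag{6.5.11} $$
We refer to the Remarks for comments about the validity and admissibility of this approximation. With the simplified Hamiltonian (6.5.11), we obtain the density matrix
$$ \rho_{\rm MFT} = Z_{\rm MFT}^{-1}\, \exp\Bigl\{\beta\Bigl[\sum_l \sigma_l (h + \tilde J m) - \frac{1}{2} m^2 \tilde J N\Bigr]\Bigr\} \tag{6.5.12} $$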


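and the partition function
$$ Z_{\rm MFT} = \mathrm{Tr}\, \exp\Bigl\{\beta\Bigl[\sum_l \sigma_l (h + \tilde J m) - \frac{1}{2} m^2 \tilde J N\Bigr]\Bigr\} = \prod_l \Bigl(\sum_{\sigma_l = \pm 1} e^{\beta\sigma_l (h + \tilde J m)}\Bigr)\, e^{-\frac{1}{2}\beta m^2 \tilde J N} \tag{6.5.13} $$
in the molecular field approximation, where $\mathrm{Tr} \equiv \sum_{\{\sigma_l = \pm 1\}}$. We thus find for (6.5.13)
$$ Z_{\rm MFT} = \Bigl(e^{-\frac{1}{2}\beta m^2 \tilde J}\; 2\cosh\beta\bigl(h + \tilde J m\bigr)\Bigr)^{N} . \tag{6.5.13'} $$
Using $m = \frac{1}{N} kT \frac{\partial}{\partial h}\log Z_{\rm MFT}$, we obtain the equation of state in the molecular field approximation:
$$ m = \tanh\beta\bigl(\tilde J m + h\bigr) , \tag{6.5.14} $$
which is an implicit equation for $m$. Compared to the equation of state of a paramagnet, the field $h$ is amplified by the internal molecular field $\tilde J m$. As we shall see later, (6.5.14) can be solved analytically for $h$. It is however instructive to solve (6.5.14) first for limiting cases. To do this, it will prove expedient to introduce the following abbreviations:
$$ T_c = \frac{\tilde J}{k} \quad\text{and}\quad \tau = \frac{T - T_c}{T_c} . \tag{6.5.15} $$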
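The implicit equation (6.5.14) is easy to explore numerically. The following is a minimal sketch, not part of the original text, which assumes units with $k = 1$ and $T_c = 1$ by default and writes `h` for $h/k$; the function name `mean_field_m` and the simple fixed-point iteration are illustrative choices.

```python
import math

def mean_field_m(T, h=0.0, Tc=1.0, tol=1e-12, max_iter=100000):
    """Solve m = tanh((Tc*m + h)/T), i.e. Eq. (6.5.14) with k = 1.

    Plain fixed-point iteration. Starting from m = 1 selects the
    non-negative branch for h >= 0. Convergence slows down near Tc,
    where the slope of the tanh at the fixed point approaches 1.
    """
    m = 1.0
    for _ in range(max_iter):
        m_new = math.tanh((Tc * m + h) / T)
        if abs(m_new - m) < tol:
            return m_new
        m = m_new
    return m

# Spontaneous magnetization at h = 0 for a few temperatures
for T in (0.5, 0.9, 0.99, 1.2):
    print(T, mean_field_m(T))
```

Slightly below $T_c$ the result can be compared with the square-root law $m_0 = \sqrt{3}\,(-\tau)^{1/2}$ that the expansion of (6.5.14) yields; above $T_c$ the iteration decays to $m = 0$.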

We will immediately see that $T_c$ has the significance of the transition temperature, the Curie temperature. Above $T_c$, the magnetization is zero in the absence of an applied field; below this temperature, it increases continuously with decreasing temperature to a finite value. We first determine the behavior in the neighborhood of $T_c$, where we can expand in terms of $\tau$, $h$ and $m$.

a) $h = 0$: For zero applied field and in the vicinity of $T_c$, (6.5.14) can be expanded in a Taylor series,
$$ m = \tanh\beta\tilde J m = \frac{T_c}{T} m - \frac{1}{3}\Bigl(\frac{T_c}{T} m\Bigr)^3 + \dots , \tag{6.5.16} $$
which can be cut off at the third order so as to retain the leading term of the solution. The solutions of (6.5.16) are
$$ m = 0 \quad\text{for}\quad T > T_c \tag{6.5.17a} $$
and
$$ m = \pm m_0 , \quad m_0 = \sqrt{3}\,(-\tau)^{1/2} \quad\text{for}\quad T < T_c . \tag{6.5.17b} $$


The ﬁrst solution, m = 0, is found for all temperatures, the second only for T ≤ Tc , i.e. τ ≤ 0. Since the free energy of the second solution is smaller (see below and in Fig. 6.9), it is the stable solution below Tc . From these considerations we ﬁnd the temperature ranges given in (6.5.17). For T ≤ Tc , the spontaneous magnetization, denoted as m0 , is observed (6.5.17b); it follows a square-root law (Fig. 6.5). This quantity is called the order parameter of the ferromagnetic phase transition.

Fig. 6.5. The spontaneous magnetization (solid curve), and the magnetization in an applied ﬁeld (dashed). The spontaneous magnetization in the Ising model has two possible orientations, +m0 or −m0

b) $h$ and $\tau$ nonzero: for small $h$ and $\tau$ and thus small $m$, the expansion of (6.5.14),
$$ m\Bigl(1 - \frac{T_c}{T}\Bigr) = \frac{h}{kT} - \frac{1}{3}\Bigl(\frac{h}{kT} + \frac{T_c}{T} m\Bigr)^3 + \dots , $$
leads to the magnetic equation of state
$$ \frac{h}{kT_c} = \tau m + \frac{1}{3} m^3 \tag{6.5.18} $$
in the neighborhood of $T_c$. An applied magnetic field produces a finite magnetization even above $T_c$ and leads qualitatively to the dashed curve in Fig. 6.5.

c) $\tau = 0$: exactly at $T_c$, we find from (6.5.18) the critical isotherm:
$$ m = \Bigl(\frac{3h}{kT_c}\Bigr)^{1/3} , \quad h \sim m^3 . \tag{6.5.19} $$

d) Susceptibility for small $\tau$: we now compute the isothermal magnetic susceptibility $\chi = \bigl(\frac{\partial m}{\partial h}\bigr)_T$ by differentiating the equation of state (6.5.18) with respect to $h$:
$$ \frac{1}{kT_c} = \tau\chi + m^2\chi . \tag{6.5.20} $$

Fig. 6.6. The magnetic susceptibility (6.5.21): the Curie–Weiss law

In the limit $h \to 0$, we can insert the spontaneous magnetization (6.5.17) into (6.5.20) and obtain for the isothermal magnetic susceptibility
$$ \chi = \frac{1/kT_c}{\tau + m^2} = \begin{cases} \dfrac{1/k}{T - T_c} & T > T_c \\[2ex] \dfrac{1/k}{2(T_c - T)} & T < T_c \end{cases} ; \tag{6.5.21} $$

this is the Curie–Weiss law shown in Fig. 6.6.

Remark: We can understand the divergent susceptibility at $T_c$ by starting from the Curie law for paramagnetic spins (6.3.10a), adding the internal molecular field $\tilde J m$ to the field $h$, and then determining the magnetization from it:
$$ m = \frac{1}{kT}\bigl(h + \tilde J m\bigr) \;\to\; \frac{m}{h} = \frac{1/k}{T - T_c} . \tag{6.5.22} $$
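The Curie–Weiss law (6.5.21) can be checked against the full equation of state. The sketch below is an illustration under stated assumptions: units with $k = 1$, a bisection solver for the spontaneous magnetization, and the zero-field susceptibility obtained by differentiating (6.5.14) implicitly, $k\chi = (1 - m^2)/\bigl(T - T_c(1 - m^2)\bigr)$. The function names are hypothetical.

```python
import math

def spontaneous_m(T, Tc=1.0):
    """Positive solution of m = tanh(Tc*m/T) at h = 0 (zero for T >= Tc)."""
    if T >= Tc:
        return 0.0
    lo, hi = 1e-12, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        # tanh(Tc*m/T) - m is positive below the root and negative above it
        if math.tanh(Tc * mid / T) - mid > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def chi_mean_field(T, Tc=1.0):
    """k*chi at h = 0 from implicit differentiation of m = tanh((Tc*m + h)/T)."""
    m = spontaneous_m(T, Tc)
    return (1.0 - m * m) / (T - Tc * (1.0 - m * m))

# Compare with the Curie-Weiss forms 1/(T - Tc) and 1/(2(Tc - T))
print(chi_mean_field(1.1), 1.0 / 0.1)
print(chi_mean_field(0.999), 1.0 / (2 * 0.001))
```

Above $T_c$ the two expressions agree exactly; below $T_c$ the factor $1/2$ holds only to leading order in $|\tau|$, so the agreement improves as $T \to T_c$.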

Following these limiting cases, we solve (6.5.14) generally. We first discuss the graphical solution of this equation, referring to Fig. 6.7.

e) A graphical solution of the equation $m = \tanh\beta(h + \tilde J m)$

To find a graphical solution, it is expedient to introduce the auxiliary variable $y = m + \frac{h}{kT_c}$. Then one finds $m$ as a function of $h$ by determining the intersection of the line $y - \frac{h}{kT_c}$ with $\tanh\frac{T_c}{T} y$:
$$ m = y - \frac{h}{kT_c} = \tanh\frac{T_c}{T}\, y . $$
For $T \ge T_c$, Fig. 6.7a exhibits exactly one intersection for each value of $h$. This yields the monotonically varying curve for $T \ge T_c$ in Fig. 6.8. For $T < T_c$, from Fig. 6.7b the slope of $\tanh\frac{T_c}{T} y$ at $y = 0$ is greater than 1, and therefore we find three intersections for small absolute values of $h$, while the solution for high fields remains unique. This leads to the function for $T < T_c$ which is shown in Fig. 6.8.


Fig. 6.7. The graphical solution of Eq. (6.5.14).

Fig. 6.8. The magnetic equation of state in the molecular field approximation (6.5.23). The dotted vertical line on the $m$ axis represents the inhomogeneous state (6.5.28)

For small $h$, $m(h)$ is not uniquely determined. Particularly noticeable is the fact that the S-shaped curve $m(h)$ contains a section with negative slope, i.e. negative susceptibility. In order to clarify the stability of the solution, we need to consider the free energy. We first note that for large $h$, the magnetization approaches its saturation value (Fig. 6.7). In fact, one can immediately compute the function $h(m)$ from Eq. (6.5.14) analytically, since from
$$ \beta\bigl(\tilde J m + h\bigr) = \operatorname{arctanh} m \equiv \frac{1}{2}\log\frac{1+m}{1-m} $$
the equation of state
$$ h = -kT_c\, m + \frac{kT}{2}\log\frac{1+m}{1-m} \tag{6.5.23} $$
follows. Its shape is shown in Fig. 6.8 for $T \lessgtr T_c$ at the two values $T = 0.8\,T_c$ and $1.2\,T_c$ taken as examples, in agreement with the graphical construction.
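The explicit form (6.5.23) makes the S-shaped isotherm easy to tabulate. The following sketch is an illustration only (units $k = 1$, `h` standing for $h/k$, hypothetical function names). It also uses the observation, obtained by setting $\partial h/\partial m = -T_c + T/(1 - m^2)$ to zero, that the section of negative slope below $T_c$ is bounded by $m^2 = 1 - T/T_c$.

```python
import math

def h_of_m(m, T, Tc=1.0):
    """Equation of state (6.5.23) in units k = 1: returns h/k for given m."""
    return -Tc * m + 0.5 * T * math.log((1.0 + m) / (1.0 - m))

def negative_slope_limit(T, Tc=1.0):
    """|m| at which dh/dm vanishes (T < Tc only): m^2 = 1 - T/Tc."""
    return math.sqrt(1.0 - T / Tc)

# Isotherms at the two temperatures used as examples in Fig. 6.8
for T in (0.8, 1.2):
    print(T, [round(h_of_m(m, T), 4) for m in (-0.8, -0.4, 0.0, 0.4, 0.8)])
print(negative_slope_limit(0.8))
```

For $T = 0.8\,T_c$ the field $h(m)$ is negative for small positive $m$, which is the section of negative susceptibility discussed in the text; for $T = 1.2\,T_c$ the curve is monotonic.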


As mentioned above, for a given field $h$, the value of the magnetization below $T_c$ is not everywhere unique; e.g. for $h = 0$, the three values $0$ and $\pm m_0$ occur. In order to find out which parts of the equation of state are physically stable, we must investigate the free energy. The free energy in the molecular field approximation, $F = -kT\log Z_{\rm MFT}$, per lattice site and in units of the Boltzmann constant, is given from (6.5.13') by
$$ f(T,h) = \frac{F}{Nk} = \frac{1}{2} T_c m^2 - T\log\Bigl\{2\cosh\bigl[(T_c m + h/k)/T\bigr]\Bigr\} \approx \frac{1}{2}(T - T_c)\, m^2 + \frac{T_c}{12} m^4 - mh/k - T\log 2 . \tag{6.5.24} $$
We give here first the complete expression, and then the expansion in terms of $m$, $h$ and $T - T_c$, which applies in the neighborhood of the phase transition. Here, $m = m(h)$ must still be inserted. From (6.5.24), the heat capacity at vanishing applied field (for $T \approx T_c$) can be found:
$$ c_{h=0} = -NkT\Bigl(\frac{\partial^2 f}{\partial T^2}\Bigr)_{h=0} = \begin{cases} 0 & T > T_c \\[1ex] \dfrac{3}{2} Nk \dfrac{T}{T_c} & T < T_c \end{cases} ; $$
here, a jump of magnitude $\Delta c_{h=0} = \frac{3}{2} Nk$ is seen. We calculate directly the Helmholtz free energy
$$ a(T,m) = f + mh/k = \frac{1}{2} T_c m^2 - T\log\Bigl\{2\cosh\bigl[(T_c m + h/k)/T\bigr]\Bigr\} + mh/k , \tag{6.5.25} $$
in which $h = h(m)$ is to be inserted. From the determining equation for $m$ (6.5.14), it follows that
$$ T\log\Bigl\{2\cosh\bigl[(T_c m + h/k)/T\bigr]\Bigr\} = T\log 2 + T\log\Bigl[1 - \tanh^2\bigl((T_c m + h/k)/T\bigr)\Bigr]^{-1/2} = T\log 2 - \frac{T}{2}\log(1 - m^2) . $$
Combining this with (6.5.23) and inserting into (6.5.25), we obtain
$$ a(T,m) = -\frac{1}{2} T_c m^2 - T\log 2 + \frac{1}{2} T\log(1 - m^2) + \frac{Tm}{2}\log\frac{1+m}{1-m} \approx -T\log 2 + \frac{1}{2}(T - T_c)\, m^2 + \frac{T_c}{12} m^4 ; \tag{6.5.26} $$


Fig. 6.9. The Helmholtz free energy in the molecular ﬁeld approximation above and below Tc , for T = 0.8 Tc and T = 1.2 Tc .

here, the second expression holds near $T_c$. The Helmholtz free energy above and below $T_c$ is shown in Fig. 6.9. We first wish to point out the similarity of the free energy for $T < T_c$ with that of the van der Waals gas. For temperatures $T < T_c$, there is a region in $a(T,m)$ which violates the stability criterion (6.1.24a). The magnetization can be read off from Fig. 6.9 using
$$ h = k\Bigl(\frac{\partial a}{\partial m}\Bigr)_T , \tag{6.5.27} $$
by drawing a tangent with the slope $h$ to the function $a(T,m)$. Above $T_c$, this construction gives a unique answer; below $T_c$, however, it is unique only for a sufficiently strong applied field. We continue the discussion of the low-temperature phase and determine the reorientation of the magnetization on changing the direction of the applied magnetic field, starting with a magnetic field $h$ for which only a single value of the magnetization results from the tangent construction. Lowering the field causes $m$ to decrease until at $h = 0$, the value $m_0$ is obtained. Exactly the same tangent, namely that with slope zero, applies to the point $-m_0$. Regions of magnetization $m_0$ and $-m_0$ can therefore be present in equilibrium with one another. When a fraction $c$ of the body has the magnetization $-m_0$ and a fraction $1 - c$ has the magnetization $m_0$, then for $0 \le c \le 1$ the average magnetization is
$$ m = -c\, m_0 + (1 - c)\, m_0 = (1 - 2c)\, m_0 \tag{6.5.28} $$

in the interval between $-m_0$ and $m_0$. The free energy of this inhomogeneously magnetized object is $a(m_0)$ (dotted line in Fig. 6.9), and is thus lower than the part of the molecular-field solution which arches upwards and which corresponds to a homogeneous state in the coexistence region of the two states $+m_0$ and $-m_0$. In the interval $[-m_0, m_0]$, the system does not enter the homogeneous state with its higher free energy, but instead breaks up into domains^24 which according to Eq. (6.5.28) yield altogether the magnetization $m$. We remind the reader of the analogy to the Maxwell construction in the case of a van der Waals liquid.

For completeness, we compare the free energies of the magnetization states belonging to a small but nonzero $h$. Without loss of generality we can assume that $h$ is positive. Along with the positive magnetization, for small $h$ there are also two solutions of (6.5.27) with negative magnetizations. It is clear from Fig. 6.9 that the latter two have higher free energies than the solution with positive magnetization. For a positive (negative) magnetic field, the state with positive (negative) magnetization is thermodynamically stable. The S-shaped part of the equation of state (for $T < T_c$) in Fig. 6.8 is thus replaced by the dotted vertical line.

Finally, we give the entropy in the molecular field approximation:
$$ s = \frac{S}{Nk} = -\Bigl(\frac{\partial a}{\partial T}\Bigr)_m = -\Bigl[\frac{1+m}{2}\log\frac{1+m}{2} + \frac{1-m}{2}\log\frac{1-m}{2}\Bigr] ; \tag{6.5.29} $$
it depends only on the average magnetization $m$. The internal energy is given by
$$ e = \frac{E}{Nk} = a - mh/k + Ts = -\frac{1}{2} T_c m^2 - mh/k . \tag{6.5.30} $$
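The relations between (6.5.23), (6.5.26), (6.5.29) and (6.5.30) can be verified numerically. The sketch below is an illustrative check rather than part of the original derivation; it assumes units with $k = 1$ and uses hypothetical function names.

```python
import math

def helmholtz_a(T, m, Tc=1.0):
    """Helmholtz free energy per site, first form of Eq. (6.5.26), k = 1."""
    return (-0.5 * Tc * m * m - T * math.log(2.0)
            + 0.5 * T * math.log(1.0 - m * m)
            + 0.5 * T * m * math.log((1.0 + m) / (1.0 - m)))

def entropy_s(m):
    """Entropy per site, Eq. (6.5.29)."""
    p, q = 0.5 * (1.0 + m), 0.5 * (1.0 - m)
    return -(p * math.log(p) + q * math.log(q))

def field_h(T, m, Tc=1.0):
    """Equation of state (6.5.23), returning h/k."""
    return -Tc * m + 0.5 * T * math.log((1.0 + m) / (1.0 - m))

T, m, eps = 0.8, 0.3, 1e-6
# Tangent construction (6.5.27): h/k equals the slope of a(T, m) in m
slope = (helmholtz_a(T, m + eps) - helmholtz_a(T, m - eps)) / (2 * eps)
print(slope, field_h(T, m))
# Internal energy (6.5.30): a + T*s reduces to -Tc*m^2/2
print(helmholtz_a(T, m) + T * entropy_s(m), -0.5 * m * m)
```

At $m = 0$ the entropy takes its maximum value $\log 2$ per spin, as expected for two equally probable orientations.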

This can be seen more readily from (6.5.11) by taking the average value $\langle H\rangle$ with the density matrix (6.5.12). From $h = k\bigl(\frac{\partial a(T,m)}{\partial m}\bigr)_T$ it again follows that $m = \tanh\frac{T_c m + h/k}{T}$, i.e. we recover Eq. (6.5.14).

^24 The number of domains can be greater than just two. When there are only a few domains, the interface energy is negligible in comparison to the gain in volume energy; see problem 7.6. In reality, the dipole interaction, anisotropies and inhomogeneities in the crystal play a role in the formation of domains. They form in such a way that the energy including that of the magnetic field is minimized.

Remarks:

(i) The molecular field approximation can also be applied to other models, for example the Heisenberg model, and also to quite different cooperative phenomena. The results are completely analogous.

(ii) The effect of the remaining spins on an arbitrarily chosen spin is replaced in molecular field theory by a mean field. In the case of a short-range interaction, the real field configuration will deviate considerably from this mean value. The more long-ranged the interaction, the more spins contribute to the local field, and the more closely it thus approaches the average field. The molecular field approximation is therefore exact in the limit of long-range interactions (see also problem 6.13, the Weiss model). We note here the analogy between the molecular field theory and the Hartree-Fock theory of atoms and other many-body systems.

(iii) We want to point out another aspect of the molecular field approximation: its results do not depend at all on the dimensionality. This contradicts intuition and also exact calculations. In the case of short-range interactions, one-dimensional systems in fact do not undergo a phase transition; there are too few neighbors to lead to a cooperative ordering phenomenon.

(iv) In the next chapter, we shall turn to a detailed comparison of the gas-liquid transition and the ferromagnetic transition. We point out here in anticipation that the van der Waals liquid and the ferromagnet show quite similar behavior in the immediate vicinity of their critical points in the molecular field approximation; e.g. $(\rho_G - \rho_c) \sim (T_c - T)^{1/2}$ and $M_0 \sim (T_c - T)^{1/2}$, and likewise, the isothermal compressibility and the magnetic susceptibility both diverge as $(T_c - T)^{-1}$. This similarity is not surprising; in both cases, the interactions with the other gas atoms or spins are replaced by a mean field which is determined self-consistently from the ensuing equation of state.

(v) If one compares the critical power laws (6.5.17), (6.5.19), and (6.5.21) with experiments, with the exact solution of the two-dimensional Ising model, and with numerical results from computer simulations or series expansions, it is found that in fact qualitatively similar power laws hold, but the critical exponents are different from those found in the molecular field theory. The lower the dimensionality, the greater the deviations found. Instead of (6.5.17), (6.5.19), and (6.5.21), one finds generalized power laws:
$$ m_0 \sim |\tau|^{\beta} \qquad T < T_c , \tag{6.5.31a} $$
$$ m \sim h^{1/\delta} \qquad T = T_c , \tag{6.5.31b} $$
$$ \chi \sim |\tau|^{-\gamma} \qquad T \gtrless T_c , \tag{6.5.31c} $$
$$ c_h \sim |\tau|^{-\alpha} \qquad T \gtrless T_c . \tag{6.5.31d} $$

The critical exponents β, δ, γ and α which occur in these expressions in general diﬀer from their molecular ﬁeld values 1/2, 3, 1 and 0 (corresponding to the jump). For instance, in the two-dimensional Ising model, β = 1/8, δ = 15, γ = 7/4, and α = 0 (logarithmic). Remarkably, the values of the critical exponents do not depend on the lattice structure, but only on the dimensionality of the system. All Ising systems with short-range forces have the same critical exponents in d dimensions. Here, we have an example of the so called universality. The critical behavior depends on only a very few quantities, such as the dimensionality of the system, the number of components of the order parameter and the symmetry of the Hamiltonian. Heisenberg ferromagnets have diﬀerent critical exponents from Ising ferromagnets, but within these groups, they are all the same. With these remarks about the actual behavior in the neighborhood of a critical


point, we will close the discussion. In particular, we postpone the description of additional analogies between phase transitions to the next chapter. We now return to the molecular field approximation and use it to compute the magnetic susceptibility and the position-dependent spin correlation function.

6.5.3 Correlation Functions and Susceptibility

In this subsection, we shall consider the Ising model in the presence of a spatially varying applied magnetic field $h_l$. The Hamiltonian is then given by
$$ H = H_0 - \sum_l h_l\sigma_l = -\frac{1}{2}\sum_{l,l'} J(l-l')\,\sigma_l\sigma_{l'} - \sum_l h_l\sigma_l . \tag{6.5.32} $$

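The magnetization per spin now depends on the lattice site $l$:
$$ m_l = \langle\sigma_l\rangle \equiv \mathrm{Tr}\bigl(e^{-\beta H}\sigma_l\bigr)\big/\mathrm{Tr}\, e^{-\beta H} . \tag{6.5.33} $$
We first define the susceptibility
$$ \chi(x_l, x_{l'}) = \frac{\partial m_l}{\partial h_{l'}} , \tag{6.5.34} $$
which describes the response at the site $l$ to a change in the field at the site $l'$. The correlation function is defined as
$$ G(x_l, x_{l'}) \equiv \langle\sigma_l\sigma_{l'}\rangle - \langle\sigma_l\rangle\langle\sigma_{l'}\rangle = \bigl\langle(\sigma_l - \langle\sigma_l\rangle)(\sigma_{l'} - \langle\sigma_{l'}\rangle)\bigr\rangle . \tag{6.5.35} $$
The correlation function (6.5.35) is a measure of how strongly the deviations from the mean values at the sites $l$ and $l'$ are correlated with each other. Susceptibility and correlation function are related through the important fluctuation-response theorem
$$ \chi(x_l, x_{l'}) = \frac{1}{kT}\, G(x_l, x_{l'}) . \tag{6.5.36} $$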
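The fluctuation-response theorem (6.5.36) can be tested by brute force on a system small enough that the trace in (6.5.33) is summed exactly. The sketch below is an illustration under stated assumptions: a ring of a few Ising spins with nearest-neighbor coupling `J`, site-dependent fields `h`, units with $k = 1$, and a central finite difference for $\partial m_l/\partial h_{l'}$. All names are hypothetical.

```python
import itertools
import math

def ring_averages(J, h, T):
    """Exact <sigma_l> and G(l, l') for an Ising ring, summing all 2^n states."""
    n = len(h)
    Z = 0.0
    s = [0.0] * n
    ss = [[0.0] * n for _ in range(n)]
    for conf in itertools.product((-1, 1), repeat=n):
        bond = sum(conf[i] * conf[(i + 1) % n] for i in range(n))
        zeeman = sum(h[i] * conf[i] for i in range(n))
        w = math.exp((J * bond + zeeman) / T)   # Boltzmann weight e^{-H/T}
        Z += w
        for i in range(n):
            s[i] += w * conf[i]
            for j in range(n):
                ss[i][j] += w * conf[i] * conf[j]
    m = [x / Z for x in s]
    G = [[ss[i][j] / Z - m[i] * m[j] for j in range(n)] for i in range(n)]
    return m, G

def chi_numeric(J, h, T, l, lp, eps=1e-5):
    """Central-difference estimate of d m_l / d h_{l'}, Eq. (6.5.34)."""
    hp, hm = list(h), list(h)
    hp[lp] += eps
    hm[lp] -= eps
    return (ring_averages(J, hp, T)[0][l] - ring_averages(J, hm, T)[0][l]) / (2 * eps)

fields = [0.1, 0.0, 0.2, 0.0, -0.1]
m, G = ring_averages(1.0, fields, 2.0)
print(chi_numeric(1.0, fields, 2.0, 0, 2), G[0][2] / 2.0)
```

The two printed numbers agree up to the finite-difference error, for any pair of sites and also for nonuniform fields, since the theorem follows directly from differentiating (6.5.33).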

This theorem (6.5.36) can be derived by taking the derivative of (6.5.33) with respect to $h_{l'}$. For a translationally invariant system, we have
$$ \chi(x_l, x_{l'})\big|_{\{h_l = 0\}} = \chi(x_l - x_{l'}) \quad\text{and}\quad G(x_l, x_{l'})\big|_{\{h_l = 0\}} = G(x_l - x_{l'}) . \tag{6.5.37} $$
At small fields $h_l$, we find ($\Delta m_l \equiv m_l - m$)
$$ \Delta m_l = \sum_{l'} \chi(x_l - x_{l'})\, h_{l'} . \tag{6.5.38} $$


A periodic field
$$ h_l = h_q\, e^{i q x_l} \tag{6.5.39} $$
therefore gives rise to a magnetization of the form
$$ \Delta m_l = e^{i q x_l}\sum_{l'} \chi(x_l - x_{l'})\, e^{-i q (x_l - x_{l'})}\, h_q = \chi(q)\, e^{i q x_l}\, h_q , \tag{6.5.40} $$
where
$$ \chi(q) = \sum_{l'} \chi(x_l - x_{l'})\, e^{-i q (x_l - x_{l'})} = \frac{1}{kT}\sum_l G(x_l)\, e^{-i q x_l} \tag{6.5.41} $$
is the Fourier transform of the susceptibility, and after the second equals sign (6.5.36) has been inserted. In particular for $q = 0$, we find the following relation between the uniform susceptibility and the correlation function:
$$ \chi \equiv \chi(0) = \frac{1}{kT}\sum_l G(x_l) . \tag{6.5.42} $$

Since the correlation function (6.5.35) can never be greater than 1 ($|\sigma_l| = 1$), and is in no case divergent, the divergence of the uniform susceptibility, Eq. (6.5.21) (i.e. the susceptibility referred to a spatially uniform field), can only be due to the fact that the correlations at $T_c$ attain an infinitely long range.

6.5.4 The Ornstein–Zernike Correlation Function

We now want to calculate the correlation function introduced in the previous section within the molecular field approximation. As before, we denote the field by $h_l$, so that the mean value $m_l = \langle\sigma_l\rangle$ is also site dependent. In the molecular field approximation, the density matrix is given by
$$ \rho_{\rm MFT} = Z^{-1}\exp\Bigl[\beta\sum_l \sigma_l\Bigl(h_l + \sum_{l'} J(l-l')\,\langle\sigma_{l'}\rangle\Bigr)\Bigr] . \tag{6.5.43} $$
The Fourier transform of the exchange coupling, which we take to be short-ranged, can be written for small wavenumbers as
$$ \tilde J(k) \equiv \sum_l J(l)\, e^{-i k x_l} \approx \tilde J - k^2\,\frac{1}{6}\sum_l x_l^2 J(l) \equiv \tilde J - k^2 J . \tag{6.5.44} $$
Here, we have replaced the exponential function by its Taylor series. Due to the mirror symmetry of a cubic lattice, $J(-l) = J(l)$, and therefore there is no linear term in $k$. Furthermore, we have $\sum_l (k\cdot x_l)^2 J(l) = \frac{1}{3} k^2 \sum_l x_l^2 J(l)$.


The constant $J$ is defined by
$$ J = \frac{1}{6}\sum_l x_l^2 J(l) . \tag{6.5.45} $$
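As a concrete case of (6.5.44) and (6.5.45), consider a simple cubic lattice with coupling only between the six nearest neighbors. This is an illustrative assumption, not a case worked out in the text; `J1` denotes the nearest-neighbor coupling and `a` the lattice constant. Then $\tilde J(k) = 2J_1(\cos k_x a + \cos k_y a + \cos k_z a)$, $\tilde J = 6J_1$, and (6.5.45) gives $J = \frac{1}{6}\cdot 6a^2 J_1 = a^2 J_1$. The sketch compares the exact sum with the small-$k$ form.

```python
import math

def J_tilde_exact(k, J1=1.0, a=1.0):
    """Lattice sum (6.5.5) for nearest-neighbor coupling on a simple cubic lattice."""
    return 2.0 * J1 * sum(math.cos(ki * a) for ki in k)

def J_tilde_small_k(k, J1=1.0, a=1.0):
    """Small-k form (6.5.44): J_tilde - k^2 * J with J from (6.5.45)."""
    J_tilde0 = 6.0 * J1       # six nearest neighbors
    stiffness = J1 * a * a    # (1/6) * 6 * a^2 * J1
    k2 = sum(ki * ki for ki in k)
    return J_tilde0 - stiffness * k2

k = (0.01, 0.02, -0.015)
print(J_tilde_exact(k), J_tilde_small_k(k))
```

The difference between the two values is of fourth order in $k$, which is why the quadratic form suffices for the long-wavelength behavior discussed below.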

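Using the density matrix (6.5.43), we obtain for the mean value of $\sigma_l$, in analogy to (6.5.14) in Sect. 6.5.2, the result
$$ \langle\sigma_l\rangle = \tanh\Bigl[\beta\Bigl(h_l + \sum_{l'} J(l-l')\,\langle\sigma_{l'}\rangle\Bigr)\Bigr] . \tag{6.5.46} $$
We now take the derivative $\frac{\partial}{\partial h_{l'}}$ of the last equation (6.5.46), and finally set all the $h_l = 0$, obtaining for the susceptibility:
$$ \chi(x_l - x_{l'}) = \frac{1}{\cosh^2\bigl[\beta\sum_{l''} J(l-l'')\, m\bigr]}\Bigl(\beta\,\delta_{ll'} + \beta\sum_{l''} J(l-l'')\,\chi(x_{l''} - x_{l'})\Bigr) . \tag{6.5.47} $$
The Fourier-transformed susceptibility (6.5.41) is obtained from (6.5.47), recalling the convolution theorem:
$$ \chi(q) = \frac{1}{\cosh^2\beta\tilde J m}\Bigl(\beta + \beta\tilde J(q)\,\chi(q)\Bigr) . \tag{6.5.48} $$
Furthermore, using $\frac{1}{\cosh^2\beta\tilde J m} = 1 - \tanh^2\beta\tilde J m = 1 - m^2$, where we have inserted the determining equation for $m$, Eq. (6.5.16), we obtain the general result
$$ \chi(q) = \frac{\beta}{\dfrac{1}{1-m^2} - \beta\tilde J(q)} . \tag{6.5.49} $$
From this last equation, together with (6.5.15) and (6.5.44), we find in the neighborhood of $T_c$:
$$ \chi(q) = \frac{\beta}{1 - \dfrac{T_c}{T} + m_0^2 + \dfrac{J q^2}{kT}} \quad\text{for}\quad T \approx T_c \tag{6.5.50} $$
or also
$$ \chi(q) = \frac{1}{J\,(q^2 + \xi^{-2})} , \tag{6.5.50'} $$
where the correlation length
$$ \xi = \Bigl(\frac{J}{kT_c}\Bigr)^{1/2}\begin{cases} \tau^{-1/2} & T > T_c \\ (-2\tau)^{-1/2} & T < T_c \end{cases} \tag{6.5.51} $$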


has been introduced, with $\tau = (T - T_c)/T_c$. The susceptibility in real space is obtained by inverting the Fourier transform:
$$ \chi(x_l - x_{l'}) = \frac{1}{N}\sum_q \chi(q)\, e^{i q (x_l - x_{l'})} = \frac{V}{N}\int\frac{d^3q}{(2\pi)^3}\,\chi(q)\, e^{i q (x_l - x_{l'})} . \tag{6.5.52} $$
For the second equals sign it was assumed that the system is macroscopic, so that the sum over $q$ can be replaced by an integral (cf. (4.1.2b) and (4.1.14a) with $p/\hbar \to q$). To compute the susceptibility for large distances it suffices to make use of the result for $\chi(q)$ at small values of $q$ (Eq. (6.5.50')); then with the lattice constant $a$ we find
$$ \chi(x_l - x_{l'}) = \frac{a^3}{(2\pi)^3}\int d^3q\,\frac{e^{i q (x_l - x_{l'})}}{J\,(q^2 + \xi^{-2})} = \frac{a^3\, e^{-|x_l - x_{l'}|/\xi}}{4\pi J\,|x_l - x_{l'}|} . \tag{6.5.53} $$
From $\chi$ calculated in this way, we find the correlation function via (6.5.37):
$$ G(x) = kT\,\chi(x) = \frac{kT\, a^3\, e^{-|x|/\xi}}{4\pi J\,|x|} , \tag{6.5.53'} $$

which in this context is called the Ornstein–Zernike correlation function. The Ornstein–Zernike correlation function and its Fourier transform are shown in Fig. 6.10 and Fig. 6.11 for the temperatures $T = 1.01\,T_c$ and $T = T_c$. In these figures, the correlation length $\xi$ at $T = 1.01\,T_c$ is also indicated. The quantity $\xi_0$ is defined by $\xi_0 = (J/kT_c)^{1/2}$, according to (6.5.48). At large distances $\chi(x)$ decreases exponentially as $\frac{1}{|x|}\, e^{-|x|/\xi}$. The correlation length $\xi$ characterizes the typical length over which the spin fluctuations are correlated. For $|x| \gg \xi$, $G(x)$ is practically zero. At $T_c$, $\xi = \infty$, and $G(x)$ obeys the power law
$$ G(x) = \frac{kT_c\, v}{4\pi J\,|x|} \tag{6.5.54} $$
with the volume of the unit cell $v = a^3$. $\chi(q)$ varies as $1/q^2$ for $\xi^{-1} \ll q$, and for $q = 0$ it is identical with the Curie–Weiss susceptibility. On approaching $T_c$, $\chi(0)$ becomes larger and larger. We note further that the continuum theory and thus (6.5.50') and (6.5.52) apply only to the case when $|x| \gg a$.

Fig. 6.10. The Ornstein–Zernike correlation function for $T = 1.01\,T_c$ and for $T = T_c$. Distances are measured in units of $\xi_0 = (J/kT_c)^{1/2}$.

Fig. 6.11. The Fourier transform of the Ornstein–Zernike susceptibility for $T = 1.01\,T_c$ and for $T = T_c$. The reciprocal of the correlation length for $T = 1.01\,T_c$ is indicated by the arrow.

An important experimental tool for the investigation of magnetic phenomena is neutron scattering. The magnetic moment of the neutron interacts with the field produced by the magnetic moments in the solid and is therefore sensitive to magnetic structure and to static and dynamic fluctuations. The elastic scattering cross-section is proportional to the static susceptibility $\chi(q)$. Here, $q$ is the momentum transfer, $q = k_{\rm in} - k_{\rm out}$, where $k_{\rm in(out)}$ are the wave numbers of the incident and scattered neutrons. The increase of $\chi(q)$ at small $q$ for $T \to T_c$ leads to intense forward scattering. This is termed critical opalescence near the Curie temperature, in analogy to the corresponding phenomenon in light scattering near the critical point of the gas-liquid transition.

The correlation length $\xi$ diverges at the critical point; the correlations become more and more long-ranged as $T_c$ is approached. Therefore, statistical fluctuations of the magnetic moments are correlated with each other over larger and larger regions. Furthermore, a field acting at the position $x$ induces a polarization not only at that position, but also up to a distance $\xi$, as a result of (6.5.37). The increase of the correlations can also be recognized in the spin configurations illustrated in Fig. 6.12. Here, 'snapshots' from a computer simulation of the Ising model are shown. White pixels represent $\sigma = +1$ and black pixels are for $\sigma = -1$. At twice the transition temperature, the spins are correlated only over very short distances (of a few lattice constants). At $T = 1.1\,T_c$, the increase of the correlation length is clearly recognizable.
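The Ornstein–Zernike expressions (6.5.50'), (6.5.51) and (6.5.53') are simple enough to tabulate directly. The following sketch is illustrative only: it assumes units with $k = 1$, takes the stiffness constant $J$ of (6.5.45) as the parameter `J_stiff`, and uses hypothetical function names.

```python
import math

def correlation_length(T, Tc=1.0, xi0=1.0):
    """Mean-field correlation length, Eq. (6.5.51); infinite at T = Tc."""
    tau = (T - Tc) / Tc
    if tau > 0.0:
        return xi0 / math.sqrt(tau)
    if tau < 0.0:
        return xi0 / math.sqrt(-2.0 * tau)
    return math.inf

def chi_q(q, T, Tc=1.0, J_stiff=1.0):
    """Ornstein-Zernike susceptibility (6.5.50'), with xi0 = sqrt(J/(k*Tc)), k = 1.

    At T = Tc this reduces to 1/(J q^2) and is undefined at q = 0.
    """
    xi = correlation_length(T, Tc, math.sqrt(J_stiff / Tc))
    return 1.0 / (J_stiff * (q * q + xi ** -2))

def G_oz(r, T, Tc=1.0, J_stiff=1.0, v=1.0):
    """Ornstein-Zernike correlation function (6.5.53') at distance r > 0."""
    xi = correlation_length(T, Tc, math.sqrt(J_stiff / Tc))
    return T * v * math.exp(-r / xi) / (4.0 * math.pi * J_stiff * r)

print(correlation_length(1.01), chi_q(0.0, 1.01))   # xi = 10 xi0, chi(0) = 1/(T - Tc)
print(G_oz(5.0, 1.01), G_oz(5.0, 1.0))               # exponential vs. power-law decay
```

At $q = 0$ the function `chi_q` reproduces the Curie–Weiss value $1/k(T - T_c)$ above $T_c$, which is the consistency between (6.5.50') and (6.5.21) noted in the text.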

Fig. 6.12. A ‘snapshot’ of the spin conﬁguration of a two-dimensional Ising model at T = 2 Tc , T = 1.1 Tc and T = Tc . White pixels represent σ = +1, and black pixels refer to σ = −1.


Along with very small clusters, both the black and the white clusters can be made out up to the correlation length $\xi(T = 1.1\,T_c)$. At $T = T_c$, $\xi = \infty$. In the figure, one sees two large white and black clusters. If the area viewed were to be enlarged, it would become clear that these are themselves located within an even larger cluster, which itself is only a member of a still larger cluster. There are thus correlated regions on all length scales. We observe here a scale invariance to which we shall return later. When we enlarge the unit of length, the larger clusters become smaller clusters, but since there are clusters up to infinitely large dimensions, the picture remains the same. The Ornstein–Zernike theory (6.5.51) and (6.5.53) reproduces the correct behavior qualitatively. The correlation length diverges however in reality as $\xi = \xi_0\tau^{-\nu}$, where in general $\nu \neq \frac{1}{2}$, and also the shape of $G(x)$ differs from (6.5.53') (see Chap. 7).

∗6.5.5 Continuum Representation

6.5.5.1 Correlation Functions and Susceptibilities

It is instructive to derive the results obtained in the preceding sections in a continuum representation. The formulas which occur in this derivation will also allow a direct comparison with the Ginzburg–Landau theory, which we will treat later (in Chap. 7). Critical anomalies occur at large wavelengths. In order to describe this region, it is sufficient and expedient to go to a continuum formulation:
$$ h_l \to h(x) , \quad \sigma_l \to \sigma(x) , \quad m_l \to m(x) , \quad \sum_l h_l\sigma_l \to \int\frac{d^3x}{v}\, h(x)\,\sigma(x) . \tag{6.5.55} $$
Here, $a$ is the lattice constant and $v = a^3$ is the volume of the unit cell. The sum over $l$ becomes an integral over $x$ in the limit $v \to 0$. The partial derivative becomes a functional derivative^25 ($v \to 0$)
$$ \frac{\delta m(x)}{\delta h(x')} = \frac{1}{v}\frac{\partial m_l}{\partial h_{l'}} \quad\text{etc., e.g.}\quad \frac{\delta h(x)}{\delta h(x')} = \delta(x - x') . \tag{6.5.56} $$
For the susceptibility and correlation function we thus obtain from (6.5.34)
$$ \chi(x - x') = v\,\frac{\delta m(x)}{\delta h(x')} = \frac{\partial m_l}{\partial h_{l'}} = \frac{1}{kT}\, G(x - x') . \tag{6.5.57} $$
For small $h(x)$, we find
$$ \Delta m(x) = \int\frac{d^3x'}{v}\,\chi(x - x')\, h(x') . \tag{6.5.58} $$

^25 The general definition of the functional derivative is to be found in W. I. Smirnov, A Course of Higher Mathematics, Vol. V, Pergamon Press, Oxford 1964 or in QM I, Sect. 13.3.1


A periodic field
$$ h(x') = h_q\, e^{i q x'} \tag{6.5.59} $$
induces a magnetization of the form
$$ \Delta m(x) = e^{i q x}\int\frac{d^3x'}{v}\,\chi(x - x')\, e^{-i q (x - x')}\, h_q = \chi(q)\, e^{i q x}\, h_q , \tag{6.5.60} $$
where
$$ \chi(q) = \int\frac{d^3y}{v}\,\chi(y)\, e^{-i q y} = \frac{1}{kT v}\int d^3y\, e^{-i q y}\, G(y) \tag{6.5.61} $$
is the Fourier transform of the susceptibility, and after the second equals sign, we have made use of (6.5.37). In particular, for $q = 0$, we find the following relation between the uniform susceptibility and the correlation function:
$$ \chi \equiv \chi(0) = \frac{1}{kT v}\int d^3y\, G(y) . \tag{6.5.62} $$

6.5.5.2 The Ornstein–Zernike Correlation Function

As before, the field $h(x)$ and with it also the mean value $\langle\sigma(x)\rangle$ are position dependent. The density matrix in the molecular field approximation and in the continuum representation is given by:
$$ \rho_{\rm MFT} = Z^{-1}\exp\Bigl[\beta\int\frac{d^3x}{v}\,\sigma(x)\Bigl(h(x) + \int\frac{d^3x'}{v}\, J(x - x')\,\langle\sigma(x')\rangle\Bigr)\Bigr] . \tag{6.5.63} $$
The Fourier transform of the exchange coupling for small wavenumbers assumes the form
$$ \tilde J(k) = \int\frac{d^3x}{v}\, J(x)\, e^{-i k\cdot x} \approx \tilde J - k^2\,\frac{1}{6}\int\frac{d^3x}{v}\, x^2 J(x) \equiv \tilde J - k^2 J , \tag{6.5.64} $$
where the exponential function has been replaced by its Taylor expansion. Owing to the spherical symmetry of the exchange interaction $J(x) \equiv J(|x|)$, there is no linear term in $k$ and we find $\int d^3x\,(k\cdot x)^2 J(x) = \frac{1}{3} k^2\int d^3x\, x^2 J(x)$. The constant $J$ is defined by $J = \frac{1}{6v}\int d^3x\, x^2 J(x)$. The inverse transform of (6.5.64) yields
$$ J(x) = v\bigl(\tilde J + J\nabla^2\bigr)\,\delta(x) . \tag{6.5.65} $$
For phenomena at small $k$ or large distances, the real position dependence of the exchange interaction can be replaced by (6.5.65). We insert this into (6.5.63) and obtain the mean value of $\sigma(x)$, analogously to (6.5.14) in Sect. 6.5.2:
$$ \langle\sigma(x)\rangle = \tanh\Bigl[\beta\Bigl(h(x) + \tilde J\,\langle\sigma(x)\rangle + J\nabla^2\langle\sigma(x)\rangle\Bigr)\Bigr] . \tag{6.5.66} $$
In the neighborhood of $T_c$, we can carry out an expansion similar to that in (6.5.16), with $m(x) \equiv \langle\sigma(x)\rangle$,
$$ \tau\, m(x) - \frac{J}{kT_c}\nabla^2 m(x) + \frac{1}{3}\, m(x)^3 = \frac{h(x)}{kT_c} , \tag{6.5.67} $$


with $\tau = (T - T_c)/T_c$, where the second term on the left-hand side occurs due to the spatial inhomogeneity of the magnetization. The equations of the continuum limit can be obtained from the corresponding equations of the discrete representation at any step, e.g. (6.5.67) follows from (6.5.46) by carrying out the substitutions $\langle\sigma_l\rangle = m_l \to m(x)$, $J(l) \to J(x) = v\bigl(\tilde J + J\nabla^2\bigr)\delta(x)$. Now we take the functional derivative $\frac{\delta}{\delta h(x')}$ of the last equation, (6.5.67),
$$ \Bigl[\tau - \frac{J}{kT_c}\nabla^2 + m_0^2\Bigr]\chi(x - x') = v\,\delta(x - x')/kT_c . \tag{6.5.68} $$
Since the susceptibility is calculated in the limit $h \to 0$, the spontaneous magnetization $m_0$, which is given by the molecular-field expressions (6.5.17a,b), appears on the left side. The solution of this differential equation, which also occurs in connection with the Yukawa potential, is given in three dimensions by
$$ \chi(x - x') = \frac{v\, e^{-|x - x'|/\xi}}{4\pi J\,|x - x'|} . \tag{6.5.69} $$
The Fourier transform is
$$ \chi(q) = \frac{1}{J\,(q^2 + \xi^{-2})} . \tag{6.5.70} $$
In this expression, we have introduced the correlation length:
$$ \xi = \Bigl(\frac{J}{kT_c}\Bigr)^{1/2}\begin{cases} \tau^{-1/2} & T > T_c \\ (-2\tau)^{-1/2} & T < T_c . \end{cases} \tag{6.5.71} $$

The results thus obtained agree with those of the previous section; for their discussion, we refer to that section.

∗6.6 The Dipole Interaction, Shape Dependence, Internal and External Fields

6.6.1 The Hamiltonian

In this section, we investigate the influence of the dipole interaction. The total Hamiltonian for the magnetic moments $\mu_l$ is given by
$$ H \equiv H_0(\{\mu_l\}) + H_d(\{\mu_l\}) - \sum_l \mu_l\cdot H_a . \tag{6.6.1} $$
$H_0$ contains the exchange interaction between the magnetic moments and $H_d$ represents the dipole interaction
$$ H_d = \frac{1}{2}\sum_{l,l'} A^{\alpha\beta}_{ll'}\,\mu_l^\alpha\mu_{l'}^\beta = \frac{1}{2}\sum_{l,l'}\Bigl(\frac{\delta_{\alpha\beta}}{|x_l - x_{l'}|^3} - \frac{3\,(x_l - x_{l'})_\alpha (x_l - x_{l'})_\beta}{|x_l - x_{l'}|^5}\Bigr)\mu_l^\alpha\mu_{l'}^\beta , \tag{6.6.2} $$
and $H_a$ is the externally applied magnetic field. The dipole interaction is long-ranged, in contrast to the exchange interaction; it decreases as the third power of the distance. Although the dipole interaction is in general considerably weaker than the exchange interaction (its interaction energy corresponds to a temperature of about^26 1 K), it plays an important role for some phenomena due to its long range and also due to its anisotropy. The goal of this section is to obtain predictions about the free energy and its derivatives for the Hamiltonian (6.6.1),
$$ F(T, H_a) = -kT\log\mathrm{Tr}\, e^{-H/kT} , \tag{6.6.3} $$

and to analyze the modifications which result from including the dipole interaction. Before we turn to the microscopic theory, we wish to derive some elementary consequences of classical magnetostatics for thermodynamics; their justification within the framework of statistical mechanics will be given at the end of this section.

6.6.2 Thermodynamics and Magnetostatics

6.6.2.1 The Demagnetizing Field

It is well known from electrodynamics$^{27}$ (magnetostatics) that in a magnetized body, in addition to the externally applied field $\mathbf{H}_a$, there is a demagnetizing field $\mathbf{H}_d$ which results from the dipole fields of the individual magnetic moments, so that the effective field in the interior of the magnet, $\mathbf{H}_i$,

$$ \mathbf{H}_i = \mathbf{H}_a + \mathbf{H}_d , \tag{6.6.4a} $$

is in general different from $\mathbf{H}_a$. The field $\mathbf{H}_d$ is uniform only in ellipsoids and their limiting shapes, and we will thus limit ourselves as usual to bodies of this type. For ellipsoids, the demagnetizing field has the form $\mathbf{H}_d = -D\mathbf{M}$, and thus the (macroscopic) field in the interior of the body is

$$ \mathbf{H}_i = \mathbf{H}_a - D\mathbf{M} . \tag{6.6.4b} $$

Here, $D$ is the demagnetizing tensor and $\mathbf{M}$ is the magnetization (per unit volume). When $\mathbf{H}_a$ is applied along one of the principal axes, $D$ can be interpreted as the appropriate demagnetizing factor in Eq. (6.6.4b). For $\mathbf{H}_a$ and therefore $\mathbf{M}$ parallel to the axis of a long cylindrical body, $D = 0$; for $\mathbf{H}_a$ and $\mathbf{M}$ perpendicular to an infinitely extended thin sheet, $D = 4\pi$; and for a sphere, $D = 4\pi/3$. The value of the internal field thus depends on the shape of the sample and the direction of the applied field.

26 See e.g. the estimate in N. W. Ashcroft and N. D. Mermin, Solid State Physics, Holt, Rinehart and Winston, New York, 1976, p. 673.

27 A. Sommerfeld, Electrodynamics, Academic Press, New York, 1952; R. Becker and F. Sauter, Theorie der Elektrizität, Vol. 1, 21st Edition, p. 52, Teubner, Stuttgart, 1973; R. Becker, Electromagnetic Fields and Interactions, Blaisdell, 1964; J. D. Jackson, Classical Electrodynamics, 2nd edition, John Wiley, New York, 1975.


6.6.2.2 Magnetic Susceptibilities

We now need to distinguish between the susceptibility relative to the applied field, $\chi_a(H_a) = \partial M / \partial H_a$, and the susceptibility relative to the internal field, $\chi_i(H_i) = \partial M / \partial H_i$. We consider for the moment only fields in the direction of the principal axes, so that we do not need to take the tensor character of the susceptibilities into account. We emphasize that the usual definition in electrodynamics is the second one. This is due to the fact that $\chi_i(H_i)$ is a pure materials property$^{28}$, and that owing to $\mathrm{curl}\, \mathbf{H}_i = \frac{4\pi}{c} \mathbf{j}$, the field $\mathbf{H}_i$ can be controlled in the core of a coil by varying the current density $\mathbf{j}$. Taking the derivative of Eq. (6.6.4b) with respect to $M$, one obtains the relation between the two susceptibilities:

$$ \frac{1}{\chi_i(H_i)} = \frac{1}{\chi_a(H_a)} - D . \tag{6.6.5a} $$

It is physically clear that the susceptibility $\chi_i(H_i)$ relative to the internal field $H_i$ acting in the interior of the body is a specific materials parameter which is independent of the shape, and that therefore the shape dependence of $\chi_a(H_a)$,

$$ \chi_a(H_a) = \frac{\chi_i(H_i)}{1 + D\chi_i(H_i)} , \tag{6.6.5b} $$

results from the occurrence of $D$ in (6.6.5b) and (6.6.4b).$^{29}$ If the field is not applied along one of the principal axes of the ellipsoid, one can derive the tensor relation by taking the derivative of the component $\alpha$ of (6.6.4b) with respect to $M_\beta$:

$$ \left(\chi_i^{-1}\right)_{\alpha\beta} = \left(\chi_a^{-1}\right)_{\alpha\beta} - D_{\alpha\beta} . \tag{6.6.5c} $$

Relations of the type (6.6.5a–c) can be found in the classical thermodynamic literature.$^{30}$
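The shape dependence expressed by Eq. (6.6.5b) can be illustrated numerically. The following is only a minimal sketch: the function name `apparent_susceptibility` and the dictionary `DEMAG_FACTORS` are illustrative names introduced here, and the field is assumed to lie along a principal axis so that $D$ is a scalar (Gaussian units, with the three values of $D$ quoted in Sect. 6.6.2.1).

```python
import math


def apparent_susceptibility(chi_i: float, D: float) -> float:
    """Eq. (6.6.5b): chi_a = chi_i / (1 + D * chi_i) for a field along a principal axis."""
    return chi_i / (1.0 + D * chi_i)


# Demagnetizing factors quoted in the text (Gaussian units); the labels are illustrative.
DEMAG_FACTORS = {
    "long cylinder, field parallel to axis": 0.0,
    "sphere": 4.0 * math.pi / 3.0,
    "thin sheet, field perpendicular": 4.0 * math.pi,
}

if __name__ == "__main__":
    # A small and a large internal susceptibility, to compare the size of the correction.
    for chi_i in (1e-4, 1.0):
        for shape, D in DEMAG_FACTORS.items():
            chi_a = apparent_susceptibility(chi_i, D)
            print(f"chi_i = {chi_i:g}, {shape}: chi_a = {chi_a:.6g}")
```

For a small internal susceptibility the three shapes give nearly the same apparent susceptibility, whereas for $\chi_i$ of order 1 the demagnetization correction is substantial.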

28 In the literature on magnetism, $\chi_i(H_i)$ is called the true susceptibility and $\chi_a(H_a)$ the apparent susceptibility. E. Kneller, Ferromagnetismus, Springer, Berlin, 1962, p. 97.

29 When $\chi_i \lesssim 10^{-4}$, as in many practical situations, the demagnetization correction can be neglected. On the other hand, there are also cases in which the shape of the object can become important. In paramagnetic salts, $\chi_i$ increases at low temperatures according to Curie's law, and it can become of the order of 1; in superconductors, $4\pi\chi_i = -1$ (perfect diamagnetism or Meissner effect).

30 R. Becker and W. Döring, Ferromagnetismus, Springer, Berlin, 1939, p. 8; A. B. Pippard, Elements of Classical Thermodynamics, Cambridge at the University Press, 1964, p. 66.


6.6.2.3 Free Energies and Specific Heats

Starting from the free energy $F(T, \mathbf{H}_a)$ with the differential

$$ dF = -S\,dT - V \mathbf{M} \cdot d\mathbf{H}_a , \tag{6.6.6} $$

we can define a new free energy by means of a Legendre transformation

$$ \hat F(T, \mathbf{H}_i) = F(T, \mathbf{H}_a) + \frac{V}{2} M^\alpha D^{\alpha\beta} M^\beta . \tag{6.6.7a} $$

The differential of this free energy is, using (6.6.4b), given by

$$ d\hat F(T, \mathbf{H}_i) = -S\,dT - V \mathbf{M} \cdot d\mathbf{H}_i . \tag{6.6.7b} $$

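Since the entropy $S(T, \mathbf{H}_i)$ and the magnetization $\mathbf{M}(T, \mathbf{H}_i)$ as functions of the internal field must be independent of the shape of the sample, all the derivatives of $\hat F(T, \mathbf{H}_i)$ are shape independent. Therefore, the free energy $\hat F(T, \mathbf{H}_i)$ is itself shape independent. From (6.6.6) and (6.6.7b), it follows that

$$ S = -\left(\frac{\partial F}{\partial T}\right)_{H_a} = -\left(\frac{\partial \hat F}{\partial T}\right)_{H_i} \tag{6.6.8} $$

and

$$ M = -\frac{1}{V}\left(\frac{\partial F}{\partial H_a}\right)_T = -\frac{1}{V}\left(\frac{\partial \hat F}{\partial H_i}\right)_T . \tag{6.6.9} $$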

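The specific heat can also be defined for a constant internal field

$$ C_{H_i} = \frac{T}{V}\left(\frac{\partial S}{\partial T}\right)_{H_i} \tag{6.6.10a} $$

and for a constant applied (external) field

$$ C_{H_a} = \frac{T}{V}\left(\frac{\partial S}{\partial T}\right)_{H_a} . \tag{6.6.10b} $$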

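Using the Jacobian as in Sect. 3.2.4, one can readily obtain the following relations

$$ C_{H_a} = C_{H_i}\, \frac{1}{1 + D\chi_{iT}} , \tag{6.6.11a} $$

$$ C_{H_i} = C_{H_a} + T\, \frac{\left(\frac{\partial M}{\partial T}\right)_{H_a} D \left(\frac{\partial M}{\partial T}\right)_{H_a}}{1 - D\chi_{aT}} \tag{6.6.11b} $$

and

$$ C_{H_a} = C_{H_i} - T\, \frac{\left(\frac{\partial M}{\partial T}\right)_{H_i} D \left(\frac{\partial M}{\partial T}\right)_{H_i}}{1 + D\chi_{iT}} , \tag{6.6.11c} $$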

where the index $T$ indicates the isothermal susceptibility. The shape independence of $\chi_i(H_i)$ and $\hat F(T, H_i)$, which is plausible for the physical reasons given above, has also been derived using perturbation-theoretical methods.$^{31}$ For a vanishingly small field, the shape independence could be proven without resorting to perturbation theory.$^{32}$

31 P. M. Levy, Phys. Rev. 170, 595 (1968); H. Horner, Phys. Rev. 172, 535 (1968)

32 R. B. Griffiths, Phys. Rev. 176, 655 (1968)


6.6.2.4 The Local Field

Along with the internal field, one occasionally also requires the local field $\mathbf{H}_{loc}$. It is the field present at the position of a magnetic moment. One obtains it by imagining a sphere to be centered on the lattice site under consideration, which is large compared to the unit cell but small compared to the overall ellipsoid (see Fig. 6.13). We obtain for the local field:$^{27}$

$$ \mathbf{H}_{loc} = \mathbf{H}_a + \phi \mathbf{M} \tag{6.6.12a} $$

$$ \text{with} \quad \phi = \phi_0 + \frac{4\pi}{3} - D . \tag{6.6.12b} $$

Here, $\phi_0$ is the sum of the dipole fields of the average moments within the fictitious sphere. The medium outside the imaginary sphere can be treated as a continuum, and its contribution is that of a solid polarized ellipsoid ($-D$), minus that of a polarized sphere ($4\pi/3$). For a cubic lattice, $\phi_0$ vanishes for reasons of symmetry.$^{27}$ One can also introduce a free energy

$$ \hat{\hat F}(T, \mathbf{H}_{loc}) = F(T, \mathbf{H}_a) - \frac{1}{2} V \mathbf{M} \phi \mathbf{M} \tag{6.6.13a} $$

with the differential

$$ d\hat{\hat F} = -S\,dT - V \mathbf{M} \cdot d\mathbf{H}_{loc} . \tag{6.6.13b} $$

Owing to (6.6.12a,b), (6.6.7a), and (6.6.13a), it follows that

$$ \hat{\hat F}(T, \mathbf{H}_{loc}) = \hat F(T, \mathbf{H}_i) - \frac{1}{2} V \mathbf{M} \left(\phi_0 + \frac{4\pi}{3}\right) \mathbf{M} , \tag{6.6.14} $$

so that $\hat{\hat F}$ differs from $\hat F$ only by a term which is independent of the external shape and is itself therefore shape-independent. One can naturally also define susceptibilities at constant $H_{loc}$ and derive relations corresponding to equations (6.6.5a–c) and (6.6.11a–c), in which essentially $H_i$ is replaced by $H_{loc}$ and $D$ by $\phi$.

Fig. 6.13. The definition of the local field. An ellipsoid of volume V and a fictitious sphere of volume V0 (schematic, not to scale)

6.6.3 Statistical–Mechanical Justification

In this subsection, we will give a microscopic justification of the thermodynamic results obtained in the preceding section and derive Hamiltonians for the calculation of the shape-independent free energies $\hat F(T, H_i)$ and $\hat{\hat F}(T, H_{loc})$ of equations (6.6.7a) and (6.6.13a). The magnetic moments will be represented by their mean values and fluctuations. The dipole interaction will be decomposed into a short-range and a long-range part. For the interactions of the fluctuations, the long-range part can be neglected. The starting point is the Hamiltonian (6.6.1), into which we introduce the fluctuations around (deviations from) the mean value $\langle\mu^\alpha_l\rangle$,

$$ \delta\mu^\alpha_l \equiv \mu^\alpha_l - \langle\mu^\alpha_l\rangle : \tag{6.6.15} $$

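$$ \begin{aligned} H &= H_0(\{\mu_l\}) + \frac{1}{2}\sum_{l,l'} A^{\alpha\beta}_{ll'}\, \delta\mu^\alpha_l\, \delta\mu^\beta_{l'} + \frac{1}{2}\sum_{l,l'} A^{\alpha\beta}_{ll'} \langle\mu^\alpha_l\rangle \langle\mu^\beta_{l'}\rangle + \sum_{l,l'} A^{\alpha\beta}_{ll'}\, \delta\mu^\alpha_l \langle\mu^\beta_{l'}\rangle - \sum_l \mu^\alpha_l H^\alpha_a \\ &= H_0(\{\mu_l\}) + \frac{1}{2}\sum_{l,l'} A^{\alpha\beta}_{ll'}\, \delta\mu^\alpha_l\, \delta\mu^\beta_{l'} - \frac{1}{2}\sum_{l,l'} A^{\alpha\beta}_{ll'} \langle\mu^\alpha_l\rangle \langle\mu^\beta_{l'}\rangle - \sum_l \mu^\alpha_l \left(H^\alpha_a + \langle H^\alpha_{d,l}\rangle\right) \end{aligned} \tag{6.6.16} $$

with the thermal average of the field at the lattice point $l$ due to the remaining dipoles:

$$ \langle H^\alpha_{d,l}\rangle = -\sum_{l'} A^{\alpha\beta}_{ll'} \langle\mu^\beta_{l'}\rangle . \tag{6.6.17} $$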

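For ellipsoids in an external magnetic field, the magnetization is uniform, $\langle\mu^\beta_l\rangle = \frac{V}{N} M^\beta$; likewise the dipole field (demagnetizing field):

$$ \langle H^\alpha_{d,l}\rangle = H^\alpha_{loc} - H^\alpha_a \equiv (\phi_0 + D_0 - D)^{\alpha\beta} M^\beta . \tag{6.6.18} $$

In going from (6.6.17) to (6.6.18), the dipole sum

$$ \phi^{\alpha\beta} = -\frac{V}{N} \sum_{l'} A^{\alpha\beta}_{ll'} = (\phi_0 + D_0 - D)^{\alpha\beta} \tag{6.6.19} $$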


was decomposed into a discrete sum over the subvolume V0 (the Lorentz sphere) and the region V − V0 , in which a continuum approximation can be applied: ∂ ∂ 1 (D0 − D)αβ = − d3 x ∂x α ∂xβ |x| V −V0

(6.6.20) ∂ 1 ∂ 1 − . = δαβ dfα dfα ∂xβ |x| ∂xβ |x| S1 S2 The ﬁrst surface integral extends over the surface of the Lorentz sphere and the second over the (external) surface of the ellipsoid (sample). With this, we can write the Hamiltonian in the form 1 αβ α β α α 1 H = H0 ({µl }) + A δµl δµl − µl Hloc + V Mα φαβ Mβ . 2 ll 2 l,l

l

(6.6.21)

Since the long-range property of the dipole interaction plays no role in the interaction between the fluctuations $\delta\mu_l$, the first two terms in the Hamiltonian are shape-independent. The sample shape enters only in the local field $H_{loc}$ and in the fourth term on the right-hand side. Comparison with (6.6.13a) shows that the free energy $\hat{\hat F}(T, H_{loc})$, which, apart from its dependence on $H_{loc}$, is shape independent, can be determined by computation of the partition function with the first three terms of (6.6.21). If the dipole interaction between the fluctuations is completely neglected,$^{33}$ one obtains the approximate effective Hamiltonian

$$ \hat{\hat H} = H_0(\{\mu_l\}) - \sum_l \boldsymbol{\mu}_l \cdot \mathbf{H}_{loc} , \tag{6.6.22} $$

in which the dipole interaction expresses itself only in the demagnetizing field.

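The exact treatment of the second term, $\frac{1}{2}\sum_{l,l'} A^{\alpha\beta}_{ll'}\, \delta\mu^\alpha_l\, \delta\mu^\beta_{l'}$, in (6.6.21) is carried out as follows: since the expectation value based on approximate application of the Ornstein–Zernike theory decreases as $\langle\delta\mu_l\, \delta\mu_{l'}\rangle \approx e^{-r_{ll'}/\xi}/r_{ll'}$, and $A_{ll'} \sim 1/r^3_{ll'}$, the interaction of the fluctuations is negligible at large distances. The shape of the sample thus plays no role in this term in the limit $V \to \infty$ with the shape kept unchanged. One can thus replace $A^{\alpha\beta}_{ll'}$ by

$$ A^{\sigma\,\alpha\beta}_{ll'} = \frac{\partial}{\partial x_\alpha} \frac{\partial}{\partial x_\beta} \frac{e^{-\sigma|\mathbf{x}|}}{|\mathbf{x}|} , \tag{6.6.23} $$

with the cutoff length $\sigma^{-1}$, or more precisely

$$ \frac{1}{2}\sum_{l,l'} A^{\alpha\beta}_{ll'}\, \delta\mu^\alpha_l\, \delta\mu^\beta_{l'} = \lim_{\sigma\to 0} \lim_{V\to\infty} \frac{1}{2}\sum_{l,l'} A^{\sigma\,\alpha\beta}_{ll'}\, \delta\mu^\alpha_l\, \delta\mu^\beta_{l'} . \tag{6.6.24} $$

33 J. H. van Vleck, J. Chem. Phys. 5, 320 (1937), Eq. (36).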


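Inserting $\delta\mu_l = \mu_l - \langle\mu_l\rangle$, we obtain for the right-hand side of (6.6.24)

$$ \begin{aligned} &\lim_{\sigma\to 0} \lim_{V\to\infty} \frac{1}{2}\sum_{l,l'} A^{\sigma\,\alpha\beta}_{ll'} \left( \mu^\alpha_l \mu^\beta_{l'} - 2\mu^\alpha_l \langle\mu^\beta_{l'}\rangle + \langle\mu^\alpha_l\rangle \langle\mu^\beta_{l'}\rangle \right) \\ &\quad = \lim_{\sigma\to 0} \lim_{V\to\infty} \frac{1}{2}\sum_{l,l'} A^{\sigma\,\alpha\beta}_{ll'}\, \mu^\alpha_l \mu^\beta_{l'} + \sum_l (\phi_0 + D_0)^{\alpha\beta} M^\beta \mu^\alpha_l - \frac{V}{2} (\phi_0 + D_0) M^2 . \end{aligned} \tag{6.6.25} $$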

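In the order: first the thermodynamic limit $V \to \infty$, then $\sigma \to 0$, the first term in (6.6.25) is shape-independent. Since in the second and third terms, the sum over $l'$ is cut off by $e^{-|\mathbf{x}_l - \mathbf{x}_{l'}|\sigma}$, the contribution $-D$ due to the external boundary of the ellipsoid does not appear here. Inserting (6.6.24) and (6.6.25) into (6.6.21), we find the Hamiltonian in final form$^{34}$

$$ H = \hat H - \frac{V}{2} \mathbf{M} D \mathbf{M} \tag{6.6.26a} $$

with

$$ \hat H = H_0(\{\mu_l\}) + \int \frac{d^3q}{(2\pi)^3}\, v_a\, A^{\alpha\beta}_{\mathbf q}\, \mu^\alpha_{\mathbf q} \mu^\beta_{-\mathbf q} - \sum_l \mu^\alpha_l H^\alpha_i . \tag{6.6.26b} $$

Here, the Fourier transforms

$$ \mu^\alpha_{\mathbf q} = \frac{1}{\sqrt N} \sum_l e^{-i\mathbf{q}\cdot\mathbf{x}_l} \mu^\alpha_l , \tag{6.6.27a} $$

$$ A^{\alpha\beta}_{\mathbf q} = \sum_{l \neq 0} A^{\alpha\beta}_{l0}\, e^{-i\mathbf{q}\cdot(\mathbf{x}_l - \mathbf{x}_0)} \tag{6.6.27b} $$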

and the internal field $\mathbf{H}_i = \mathbf{H}_a - D\mathbf{M}$ have been introduced. The Fourier transform (6.6.27b) can be evaluated using the Ewald method$^{35}$; for cubic lattices, it yields$^{36}$

$$ A^{\alpha\beta}_{\mathbf q} = \frac{1}{v_a} \left\{ \frac{4\pi}{3} \left( \delta^{\alpha\beta} - \frac{3 q^\alpha q^\beta}{q^2} \right) + \alpha_1 q^\alpha q^\beta + \left( \alpha_2 q^2 - \alpha_3 (q^\alpha)^2 \right) \delta^{\alpha\beta} + O\!\left( q^4, (q^\alpha)^4, (q^\alpha)^2 (q^\beta)^2 \right) \right\} , \tag{6.6.27b′} $$

where $v_a$ is the volume of the primitive unit cell and the $\alpha_i$ are constants which depend on the lattice structure. The first two terms in $\hat H$, Eq. (6.6.26b), are shape-independent. The sample shape enters only into the internal field $H_i$ and in the last term of (6.6.26a). Comparison of Eq. (6.6.26a) with Eq. (6.6.7a) shows that the shape-independent free energy $\hat F(T, H_i)$ can be calculated from the partition function derived from the Hamiltonian $\hat H$, Eq. (6.6.26b). We note in particular the nonanalytic behavior of the term $q^\alpha q^\beta / q^2$ in the limit $q \to 0$; it is caused by the $1/r^3$-dependence of the dipole interaction. Due to this term, the longitudinal and transverse wavenumber-dependent susceptibilities (with respect to the wavevector) are different from each other.$^{37}$ We recall that the short-ranged exchange interaction can be expanded as a Taylor series in $q$:

$$ H_0 = -\frac{1}{2} \int d^3q\, \tilde J(\mathbf q)\, \boldsymbol{\mu}_{\mathbf q} \cdot \boldsymbol{\mu}_{-\mathbf q} , \qquad \tilde J(\mathbf q) = \tilde J - J q^2 + O(q^4) . \tag{6.6.28} $$

In addition to the effects of the demagnetizing field and the resulting shape dependence, which we have treated in detail, the dipole interaction, even though it is in general much weaker than the exchange interaction, has a number of important consequences owing to its long range and its anisotropic character:$^{37}$
(i) It changes the values of the critical exponents in the neighborhood of ferromagnetic phase transitions;
(ii) it can stabilize magnetic order in systems of low dimensionality, which otherwise would not occur due to the large thermal fluctuations;
(iii) the total magnetic moment $\boldsymbol{\mu} = \sum_l \boldsymbol{\mu}_l$ is no longer conserved. This has important consequences for the dynamics; and
(iv) the dipole interaction is important in nuclear magnetism, where it is larger than or comparable to the indirect exchange interaction.

We can now include the dipole interactions in the results of Sects. 6.1 to 6.5 in the following manner:
(i) If we neglect the dipole interaction between the fluctuations of the magnetic moments $\delta\boldsymbol{\mu}_l = \boldsymbol{\mu}_l - \langle\boldsymbol{\mu}_l\rangle$ as an approximation, we can take the spatially uniform part of the dipole fields into account by replacing the field $H$ by the local field $H_{loc}$.
(ii) If, in addition to the exchange interactions possibly present, we also include the dipole interaction between the fluctuations, then according to (6.6.26a,b), the complete Hamiltonian contains the internal field $H_i$. The field $H$ must therefore be replaced by $H_i$; furthermore, the shape-dependent term $-\frac{V}{2} \mathbf{M} D \mathbf{M}$ enters into the Hamiltonian $H$, Eq. (6.6.26a), and, via the term $\hat H$, also the shape-independent part of the dipole interaction, i.e. Eq. (6.6.27b′).

34 See also W. Finger, Physica 90 B, 251 (1977).

35 P. P. Ewald, Ann. Phys. 54, 57 (1917); ibid., 54, 519 (1917); ibid., 64, 253 (1921)

36 M. H. Cohen and F. Keffer, Phys. Rev. 99, 1135 (1955); A. Aharony and M. E. Fisher, Phys. Rev. B 8, 3323 (1973)

37 E. Frey and F. Schwabl, Advances in Physics 43, 577 (1994)

6.6.4 Domains

The spontaneous magnetization per spin, $m_0(T)$, is shown in Fig. 6.5. The total magnetic moment of a uniformly magnetized sample without an external field would be $N m_0(T)$, and its spontaneous magnetization per unit volume $M_0(T) = N m_0(T)/V$, where $N$ is the overall number of magnetic moments. In fact, as a rule the magnetic moment is smaller or even zero. This results from the fact that a sample in general breaks up into domains with different directions of magnetization. Within each domain, $|\mathbf{M}(\mathbf{x}, T)| = M_0(T)$. Only when an external field is applied do the domains which are oriented parallel to the field direction grow at the cost of the others, and reorientation occurs until finally $N m_0(T)$ has been reached. The spontaneous magnetization is therefore also called the saturation magnetization. We want to illustrate domain formation, making use of two examples.

(i) One possible domain structure in a ferromagnetic bar below $T_c$ is shown in Fig. 6.14. One readily sees that for the configuration with 45° walls throughout the sample,

$$ \mathrm{div}\, \mathbf{M} = 0 . \tag{6.6.29} $$

Then it follows from the basic equations of magnetostatics

$$ \mathrm{div}\, \mathbf{H}_i = -4\pi\, \mathrm{div}\, \mathbf{M} \tag{6.6.30a} $$

$$ \mathrm{curl}\, \mathbf{H}_i = 0 \tag{6.6.30b} $$

that, in the interior of the sample,

$$ \mathbf{H}_i = 0 , \tag{6.6.31} $$

and thus also $\mathbf{B} = 4\pi\mathbf{M}$ in the interior. From the continuity conditions it follows that $\mathbf{B} = \mathbf{H} = 0$ outside the sample. The domain configuration is therefore energetically more favorable than a uniformly magnetized sample.

(ii) Domain structures also express themselves in a measurement of the total magnetic moment $\mathcal{M}$ of a sphere. The calculated magnetization $M = \mathcal{M}/V$ as a function of the applied field is indicated by the curves in Fig. 6.15.

Fig. 6.14. The domain structure in a prism-shaped sample (with 45° walls)


Fig. 6.15. The magnetization within a sphere as a function of the external ﬁeld Ha , T1 < T2 < Tc ; D is the demagnetizing factor.

Let the magnetization within a uniformly magnetized region as a function of the internal field $H_i = H_a - DM$ be given by the function $M = M(H_i)$. As long as the overall magnetization of the sphere is less than the saturation magnetization, the domains have a structure such that $H_i = 0$, and therefore, $M = \frac{1}{D} H_a$ must hold.$^{38}$ For $H_a = D M_{spont}$, the sample is finally uniformly magnetized, corresponding to the saturation magnetization. For $H_a > D M_{spont}$, $M$ can be calculated from $M = M(H_a - DM)$.
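The two regimes just described can be sketched in a few lines. This is only an illustration under an explicit assumption: above saturation, the material law is taken to be the linear form $M(H_i) = M_{spont} + \chi H_i$, which is a hypothetical choice made here so that $M = M(H_a - DM)$ can be solved in closed form; the text itself leaves $M(H_i)$ unspecified. The function name `sphere_magnetization` is likewise illustrative.

```python
import math


def sphere_magnetization(H_a: float, M_spont: float, chi: float = 0.0,
                         D: float = 4.0 * math.pi / 3.0) -> float:
    """Magnetization of a sample with demagnetizing factor D in an applied field H_a >= 0.

    Below saturation the domains arrange themselves so that H_i = 0, hence M = H_a / D.
    Above saturation, M = M(H_a - D*M) is solved for the ASSUMED linear law
    M(H_i) = M_spont + chi * H_i (a hypothetical model, not taken from the text).
    """
    if H_a < 0:
        raise ValueError("this sketch only treats H_a >= 0")
    if H_a <= D * M_spont:
        return H_a / D                      # domain regime, H_i = 0
    # M = M_spont + chi * (H_a - D*M)  =>  M = (M_spont + chi*H_a) / (1 + chi*D)
    return (M_spont + chi * H_a) / (1.0 + chi * D)


if __name__ == "__main__":
    D = 4.0 * math.pi / 3.0
    for H_a in (0.0, 0.5 * D, D, 1.5 * D):
        print(f"H_a = {H_a:.3f}: M = {sphere_magnetization(H_a, M_spont=1.0, chi=0.05):.4f}")
```

With the assumed linear law, the two branches join continuously at $H_a = D M_{spont}$, as the curves in Fig. 6.15 require.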

6.7 Applications to Related Phenomena

In this section, we discuss consequences of the results of this chapter on magnetism for other areas of physics: polymer physics, negative temperatures and the melting curve of ³He.

6.7.1 Polymers and Rubber-like Elasticity

Polymers are long chain molecules which are built up of similar links, the monomers. The number of monomers is typically $N \approx 100{,}000$. Examples of polymers are polyethylene, (CH₂)$_N$, polystyrene, (C₈H₈)$_N$, and rubber, (C₅H₈)$_N$, where the number of monomers is $N > 100{,}000$ (see Fig. 6.16).

Fig. 6.16. The structures of polyethylene and polystyrene

38 S. Arajs and R. V. Calvin, J. Appl. Phys. 35, 2424 (1964).

To find a description of the mechanical and thermal properties we set up the following simple model (see Fig. 6.17): the starting point in space of monomer 1 is denoted by $\mathbf{X}_1$, and that of a general monomer $i$ by $\mathbf{X}_i$. The position (orientation) of the $i$th monomer is then given by the vector $\mathbf{S}_i \equiv \mathbf{X}_{i+1} - \mathbf{X}_i$:

$$ \mathbf{S}_1 = \mathbf{X}_2 - \mathbf{X}_1 , \ \ldots , \ \mathbf{S}_i = \mathbf{X}_{i+1} - \mathbf{X}_i , \ \ldots , \ \mathbf{S}_N = \mathbf{X}_{N+1} - \mathbf{X}_N . \tag{6.7.1} $$

We now assume that aside from the chain linkage of the monomers there are no interactions at all between them, and that they can freely assume any arbitrary orientation, i.e. $\langle \mathbf{S}_i \cdot \mathbf{S}_j \rangle = 0$ for $i \neq j$. The length of a monomer is denoted by $a$, i.e. $\mathbf{S}_i^2 = a^2$.

Fig. 6.17. A polymer, composed of a chain of monomers

Since the line connecting the two ends of the polymer can be represented in the form

$$ \mathbf{X}_{N+1} - \mathbf{X}_1 = \sum_{i=1}^{N} \mathbf{S}_i , \tag{6.7.2} $$

it follows that

$$ \langle \mathbf{X}_{N+1} - \mathbf{X}_1 \rangle = 0 . \tag{6.7.3} $$

Here, we average independently over all possible orientations of the $\mathbf{S}_i$. The last equation means that the coiled polymer chain is oriented randomly in space, but makes no statement about its typical dimensions. A suitable measure of the mean square length is

$$ \left\langle (\mathbf{X}_{N+1} - \mathbf{X}_1)^2 \right\rangle = \left\langle \Big( \sum_i \mathbf{S}_i \Big)^2 \right\rangle = a^2 N . \tag{6.7.4} $$

We define the so-called radius of gyration

$$ R \equiv \sqrt{\left\langle (\mathbf{X}_{N+1} - \mathbf{X}_1)^2 \right\rangle} = a N^{1/2} , \tag{6.7.5} $$

which characterizes the size of the polymer coil that grows as the square root of the number of monomers.
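The scaling $\langle (\mathbf{X}_{N+1} - \mathbf{X}_1)^2 \rangle = a^2 N$ can be checked with a small Monte Carlo sketch of the freely jointed chain. The function names, the number of sampled chains and the seed are illustrative choices made here, not part of the text; each monomer direction is drawn uniformly on the unit sphere, in line with the assumption $\langle \mathbf{S}_i \cdot \mathbf{S}_j \rangle = 0$ for $i \neq j$.

```python
import math
import random


def random_unit_vector(rng: random.Random) -> tuple:
    """Draw a direction uniformly distributed on the unit sphere."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)


def mean_square_end_to_end(n_monomers: int, a: float = 1.0,
                           n_chains: int = 2000, seed: int = 0) -> float:
    """Estimate <(X_{N+1} - X_1)^2> for freely jointed chains of n_monomers links of length a."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_chains):
        x = y = z = 0.0
        for _ in range(n_monomers):
            sx, sy, sz = random_unit_vector(rng)
            x += a * sx
            y += a * sy
            z += a * sz
        total += x * x + y * y + z * z
    return total / n_chains


if __name__ == "__main__":
    for n in (10, 100, 400):
        msq = mean_square_end_to_end(n)
        print(f"N = {n}: <R^2>/(a^2 N) = {msq / n:.3f}, R = {math.sqrt(msq):.2f}")
```

Within statistical error, the ratio $\langle R^2 \rangle / (a^2 N)$ should stay close to 1 for all chain lengths, which is the content of Eq. (6.7.4).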


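In order to study the elastic properties, we allow a force to act on the ends of the polymer, i.e. the force $\mathbf{F}$ acts on $\mathbf{X}_{N+1}$ and the force $-\mathbf{F}$ on $\mathbf{X}_1$ (see Fig. 6.17). Under the influence of this tensile force, the energy depends on the positions of the two ends:

$$ H = -(\mathbf{X}_{N+1} - \mathbf{X}_1) \cdot \mathbf{F} = -\left[ (\mathbf{X}_{N+1} - \mathbf{X}_N) + (\mathbf{X}_N - \mathbf{X}_{N-1}) + \ldots + (\mathbf{X}_2 - \mathbf{X}_1) \right] \cdot \mathbf{F} = -\mathbf{F} \cdot \sum_{i=1}^{N} \mathbf{S}_i . \tag{6.7.6} $$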

Polymers under tension can therefore be mapped onto the problem of a paramagnet in a magnetic field, Sect. 6.3. The force corresponds to the applied magnetic field in the paramagnetic case, and the length of the polymer chain to the magnetization. Thus, the thermal average of the distance vector between the ends of the chain is

$$ \mathbf{L} = \left\langle \sum_{i=1}^{N} \mathbf{S}_i \right\rangle = N a \left( \coth\frac{aF}{kT} - \frac{kT}{aF} \right) \frac{\mathbf{F}}{F} . \tag{6.7.7} $$

We have used the Langevin function for classical moments in this expression, Eq. (6.3.12b), and multiplied by the unit vector in the direction of the force, $\mathbf{F}/F$. If $aF$ is small compared to $kT$, we find (corresponding to Curie's law)

$$ \mathbf{L} = \frac{N a^2}{3kT} \mathbf{F} . \tag{6.7.8} $$
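Equations (6.7.7) and (6.7.8) can be compared numerically. The sketch below uses illustrative function names and works in units in which $a$, $kT$ and $F$ are plain numbers; the series branch for small arguments is an implementation detail added here to avoid the cancellation between $\coth x$ and $1/x$.

```python
import math


def langevin(x: float) -> float:
    """Langevin function coth(x) - 1/x; a short series is used near x = 0."""
    if abs(x) < 1e-4:
        return x / 3.0 - x ** 3 / 45.0
    return 1.0 / math.tanh(x) - 1.0 / x


def chain_extension(force: float, n_monomers: int, a: float, kT: float) -> float:
    """Mean end-to-end length along the force, Eq. (6.7.7): L = N a (coth(aF/kT) - kT/(aF))."""
    return n_monomers * a * langevin(a * force / kT)


def chain_extension_weak_force(force: float, n_monomers: int, a: float, kT: float) -> float:
    """Weak-force (Curie-like) limit, Eq. (6.7.8): L = N a^2 F / (3 kT)."""
    return n_monomers * a * a * force / (3.0 * kT)


if __name__ == "__main__":
    N, a, kT = 1000, 1.0, 1.0
    for F in (0.01, 0.1, 1.0, 10.0):
        full = chain_extension(F, N, a, kT)
        weak = chain_extension_weak_force(F, N, a, kT)
        print(f"aF/kT = {a * F / kT:g}: L = {full:.2f}, linear approximation = {weak:.2f}")
```

For $aF \ll kT$ the two expressions agree, while for large forces the full expression saturates at the contour length $Na$, which is the behavior shown in Fig. 6.18.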

For the change in the length, we obtain from the previous equation

$$ \frac{\partial L}{\partial F} \sim \frac{1}{T} \tag{6.7.9a} $$

and

$$ \frac{\partial L}{\partial T} = -\frac{N a^2}{3kT^2} |\mathbf{F}| . \tag{6.7.9b} $$

The length change per unit force or the elastic constant decreases with increasing temperature according to (6.7.9a). A still more spectacular result is that for the expansion coefficient $\partial L / \partial T$: rubber contracts when its temperature is increased! This is in complete contrast to crystals, which as a rule expand with increasing temperature. The reason for the elastic behavior of rubber is easy to see: the higher the temperature, the more dominant is the entropy term in the free energy, $F = E - TS$, which strives towards a minimum. The entropy increases, i.e. the polymer becomes increasingly disordered or coiled and therefore pulls together. The general dependence of the length on $a|\mathbf{F}|/kT$ is shown in Fig. 6.18.

Fig. 6.18. The length of a polymer under the influence of a tensile force F.

Remark: In the model considered here, we have not taken into account that a monomer has a limited freedom of orientation, since each position can be occupied by at most one monomer. In a theory which takes this effect into account, the dependence $R = aN^{1/2}$ in Eq. (6.7.5) is replaced by $R = aN^{\nu}$. The exponent $\nu$ has a significance analogous to that of the exponent of the correlation length in phase transitions, and the degree of polymerization (chain length) $N$ corresponds to the reciprocal distance from the critical point, $\tau^{-1}$. The properties of polymers, in which the volume already occupied is excluded, correspond to a random motion in which the path cannot lead to a point already passed through (self-avoiding random walk). The properties of both these phenomena follow from the $n$-component $\phi^4$ model (see Sect. 7.4.5) in the limit $n \to 0$.$^{39}$ An approximate formula for $\nu$ is due to Flory: $\nu_{\mathrm{Flory}} = 3/(d + 2)$.

6.7.2 Negative Temperatures

In isolated systems whose energy levels are bounded above and below, thermodynamic states with negative absolute temperatures can be established. Examples of such systems with energy levels that are bounded towards higher energies are two-level systems or paramagnets in an external magnetic field $h$. We consider a paramagnet consisting of $N$ spins of quantum number $S = 1/2$ with an applied field along the $z$ direction. Considering the quantum numbers of the Pauli spin matrices $\sigma_l = \pm 1$, the Hamiltonian has the following diagonal structure

$$ H = -h \sum_l \sigma_l . \tag{6.7.10} $$

The magnetization per lattice site is defined by $m = \langle\sigma\rangle$ and is independent of the lattice position $l$. The entropy is given by

$$ S(m) = -kN \left[ \frac{1+m}{2} \log\frac{1+m}{2} + \frac{1-m}{2} \log\frac{1-m}{2} \right] = -k \left[ N_+ \log\frac{N_+}{N} + N_- \log\frac{N_-}{N} \right] , \tag{6.7.11} $$

and the internal energy $E$ depends on the magnetization via

$$ E = -Nhm = -h(N_+ - N_-) , \tag{6.7.12} $$

39 P.-G. de Gennes, Scaling Concepts in Polymer Physics, Cornell University Press, Ithaca, 1979.

with $N_\pm = N(1 \pm m)/2$. These expressions follow immediately from the treatment in the microcanonical ensemble (Sect. 2.5.2.2) and can also be obtained from Sect. 6.3 by elimination of $T$ and $B$. For $m = 1$ (all spins parallel to the field $h$), the energy is $E = -Nh$; for $m = -1$ (all spins antiparallel to $h$), the energy is $E = Nh$. The entropy is given in Fig. 2.9 as a function of the energy. It is maximal for $E = 0$, i.e. in the state of complete disorder. The temperature is obtained by taking the derivative of the entropy with respect to the energy:

$$ T = \frac{1}{\left(\frac{\partial S}{\partial E}\right)_h} = \frac{2h}{k} \left[ \log\frac{1+m}{1-m} \right]^{-1} . \tag{6.7.13} $$

It is shown as a function of the energy in Fig. 2.10. In the interval $0 < m \leq 1$, i.e. $-1 \leq E/Nh < 0$, the temperature is positive, as usual. For $m < 0$, that is when the magnetization is oriented antiparallel to the magnetic field, the absolute temperature becomes negative, i.e. $T < 0$! With increasing energy, the temperature $T$ goes from 0 to $\infty$, then through $-\infty$, and finally to $-0$. Negative temperatures thus belong to higher energies, and are therefore "hotter" than positive temperatures. In a state with a negative temperature, more spins are in the excited state than in the ground state. One can also see that negative temperatures are in fact hotter than positive ones by bringing two such systems into thermal contact. Take system 1 to have the positive temperature $T_1 > 0$ and system 2 the negative temperature $T_2 < 0$. We assume that the exchange of energy takes place quasistatically; then the total entropy is $S = S_1(E_1) + S_2(E_2)$ and the (constant) total energy is $E = E_1 + E_2$. From the increase of entropy, it follows with $\frac{dE_2}{dt} = -\frac{dE_1}{dt}$ that

$$ 0 < \frac{dS}{dt} = \frac{\partial S_1}{\partial E_1} \frac{dE_1}{dt} + \frac{\partial S_2}{\partial E_2} \frac{dE_2}{dt} = \left( \frac{1}{T_1} - \frac{1}{T_2} \right) \frac{dE_1}{dt} . \tag{6.7.14} $$

Since the factor in brackets, $\frac{1}{T_1} + \frac{1}{|T_2|}$, is positive, $\frac{dE_1}{dt} > 0$ must also hold; this means that energy flows from subsystem 2 at a negative temperature into subsystem 1. We emphasize that the energy dependence of $S(E)$ represented in Fig. 2.9 and the negative temperatures which result from it are a direct consequence of the boundedness of the energy levels. If the energy levels were not bounded from above, then a finite energy input could not lead to an infinite temperature or even beyond it. We also note that the specific heat of this spin system (with $N$ spins) is given by

$$ C = Nk \left( \frac{2h}{kT} \right)^2 \frac{e^{2h/kT}}{\left( 1 + e^{2h/kT} \right)^2} \tag{6.7.15} $$

and vanishes both at $T = \pm 0$ and at $T = \pm\infty$.

We now discuss two examples of negative temperatures:

(i) Nuclear spins in a magnetic field: The first experiment of this kind was carried out by Purcell and Pound$^{40}$ in a nuclear magnetic resonance experiment using the nuclear spins of ⁷Li in LiF. The spins were first oriented at the temperature $T$ by the field $H$. Then the direction of $H$ was so quickly reversed that the nuclear spins could not follow it, that is faster than a period of the nuclear spin precession. The spins are then in a state with the negative temperature $-T$. The mutual interaction of the spins is characterized by their spin-spin relaxation time of $10^{-5}$ to $10^{-6}$ sec. This interaction is important, since it allows the spin system to reach internal equilibrium; it is however negligible for the energy levels in comparison to the Zeeman energy. For nuclear spins, the interaction with the lattice in this material is so slow (the spin-lattice relaxation time is 1 to 10 min) that the spin system can be regarded as completely isolated for times in the range of seconds. The state of negative temperature is maintained for some minutes, until the magnetization reverses through interactions with the lattice and the temperature returns to its initial value of $T$. In dilute gases, a state of spin inversion with a lifetime of days can be established.

(ii) Lasers (pulsed lasers, ruby lasers): By means of irradiation with light, the atoms of the laser medium are excited (Fig. 6.19). The excited electron drops into a metastable state. When more electrons are in this excited state than in the ground state, i.e. when a population inversion has been established, the state is described by a negative temperature.

Fig. 6.19. Examples of negative temperatures: (a) nuclear spins in a magnetic field H, which is rotated by 180° (b) a ruby laser. The "pump" raises electrons into an excited state. The electron can fall into a metastable state by emission of a photon. When a population inversion is established, the temperature is negative

40 E. M. Purcell and R. V. Pound, Phys. Rev. 81, 279 (1951)

∗6.7.3 The Melting Curve of ³He

Fig. 6.20. The melting curve of ³He at low temperatures

The anomalous behavior of the melting curve of ³He (Fig. 6.20) is related to the magnetic properties of solid ³He.$^{41}$ As we already discussed in connection

with the Clausius–Clapeyron equation, dP S S − SL = VS . This leads to the Pomeranchuk eﬀect, already mentioned in Sect. 3.8.2. The above estimate of Tmin yields a value which is a exp factor of 2 smaller than the experimental result, Tmin = 0.3 K. This results from the value of SL , which is too large. Compared to an ideal gas, there are correlations in an interacting Fermi liquid which, as can be understood intuitively, lead to a lowering of its entropy and to a larger value of Tmin .

Before the discovery of the two superfluid phases of ³He, the existence of a maximum in the melting curve below $10^{-3}$ K was theoretically discussed.$^{41}$ It was expected due to the $T^3$-dependence of the specific heat in the antiferromagnetically ordered phase and the linear specific heat of the Fermi liquid. This picture however changed with the discovery of the superfluid phases of ³He (see Fig. 4.10). The specific heat of the liquid behaves at low temperatures like $e^{-\Delta/kT}$, with a constant $\Delta$ (energy gap), and therefore the melting curve rises for $T \to 0$ and has the slope 0 at $T = 0$.

Literature

A.I. Akhiezer, V.G. Bar'yakhtar and S.V. Peletminskii, Spin Waves, North Holland, Amsterdam, 1968
N.W. Ashcroft and N.D. Mermin, Solid State Physics, Holt, Rinehart and Winston, New York, 1976
R. Becker and W. Döring, Ferromagnetismus, Springer, Berlin, 1939
W.F. Brown, Magnetostatic Principles in Ferromagnetism, North Holland, Amsterdam, 1962
F. Keffer, Spin Waves, Encyclopedia of Physics, Vol. XVIII/2, p. 1. Ferromagnetism, ed. S. Flügge (Springer, Berlin, Heidelberg, New York 1966)
Ch. Kittel, Introduction to Solid State Physics, 3rd ed., John Wiley, 1967
Ch. Kittel, Thermal Physics, John Wiley, New York, 1969
D.C. Mattis, The Theory of Magnetism, Harper and Row, New York, 1965
A.B. Pippard, Elements of Classical Thermodynamics, Cambridge at the University Press, 1964
H.E. Stanley, Introduction to Phase Transitions and Critical Phenomena, Clarendon Press, Oxford, 1971
J.H. van Vleck, The Theory of Magnetic and Electric Susceptibilities, Oxford University Press, 1932
D. Wagner, Introduction to the Theory of Magnetism, Pergamon Press, Oxford, 1972

Problems for Chapter 6

6.1 Derive (6.1.24c) for the Hamiltonian of (6.1.25), by taking the second derivative of $A(T, M) = -kT \log \mathrm{Tr}\, e^{-\beta H} + HM$ with respect to $T$ for fixed $M$.

6.2 The classical paramagnet: Consider a system of $N$ non-interacting, classical magnetic moments, $\boldsymbol{\mu}_i$ ($\sqrt{\boldsymbol{\mu}_i^2} = m$) in a magnetic field $\mathbf{H}$, with the Hamiltonian $H = -\sum_{i=1}^{N} \boldsymbol{\mu}_i \cdot \mathbf{H}$. Calculate the classical partition function, the free energy, the entropy, the magnetization, and the isothermal susceptibility. Refer to the suggestions following Eq. (6.3.12b).

6.3 The quantum-mechanical paramagnet, in analogy to the main text:
(a) Calculate the entropy and the internal energy of an ideal paramagnet as a function of $T$. Show that for $T \to \infty$, $S = Nk \ln(2J + 1)$, and discuss the temperature dependence in the vicinity of $T = 0$.
(b) Compute the heat capacities $C_H$ and $C_M$ for a non-interacting spin-1/2 system.

6.4 The susceptibility and mean square deviation of harmonic oscillators: Consider a quantum-mechanical harmonic oscillator with a charge $e$ in an electric field $E$

$$ H = \frac{p^2}{2m} + \frac{m\omega^2}{2} x^2 - eEx . $$

Show that the dielectric susceptibility is given by

$$ \chi = \frac{\partial \langle ex \rangle}{\partial E} = \frac{e^2}{m\omega^2} $$

and that the mean square deviation takes the form

$$ \langle x^2 \rangle = \frac{\hbar}{2\omega m} \coth\frac{\beta\hbar\omega}{2} , $$

from which it follows that

$$ \chi = \frac{2 \tanh\frac{\beta\hbar\omega}{2}}{\hbar\omega}\, e^2 \langle x^2 \rangle . $$

Compare these results with the paramagnetism of non-coupled magnetic moments! Take account of the difference between rigid moments and induced moments, and the resulting different temperature dependences of the susceptibility. Take the classical limit $\beta\hbar\omega \ll 1$.


6.5 Consider a solid with $N$ degrees of freedom, which are each characterized by two energy levels at $\Delta$ and $-\Delta$. Show that

$$ E = -N\Delta \tanh\frac{\Delta}{kT} , \qquad C = \frac{dE}{dT} = Nk \left( \frac{\Delta}{kT} \right)^2 \frac{1}{\cosh^2\frac{\Delta}{kT}} $$

holds. How does the specific heat behave for $T \gg \Delta/k$ and for $T \ll \Delta/k$?

6.6 When the system described in 6.5 is disordered, so that all values of $\Delta$ within the interval $0 \leq \Delta \leq \Delta_0$ occur with equal probabilities, show that then the specific heat for $kT \ll \Delta_0$ is proportional to $T$. Hint: The internal energy of this system can be found from problem 6.5 by averaging over all values of $\Delta$. This serves as a model for the linear specific heat of glasses at low temperatures.

6.7 Demonstrate the validity of the fluctuation-response theorem, Eq. (6.5.35).

6.8 Two defects are introduced into a ferromagnet at the sites $x_1$ and $x_2$, and produce there the magnetic fields $h_1$ and $h_2$. Calculate the interaction energy of these defects for $|x_1 - x_2| > \xi$. For which signs of the $h_i$ is there an attractive interaction of the defects?
Suggestions: The energy in the molecular field approximation is
\[ \bar{E} = \sum_{l,l'} \langle S_l \rangle \langle S_{l'} \rangle J(l - l') . \]
For each individual defect, $\langle S_l \rangle_{1,2} = G(x_l - x_{1,2})\, h_{1,2}$, where $G$ is the Ornstein–Zernike correlation function. For two defects which are at a considerable distance apart, $\langle S_l \rangle$ can be approximated as a linear superposition of the single-defect averages. The interaction energy can be obtained by calculating $\bar{E}$ for this linear superposition and subtracting the energies of the single defects.

6.9 The one-dimensional Ising model: Calculate the partition function $Z_N$ for a one-dimensional Ising model with $N$ spins obeying the Hamiltonian
\[ \mathcal{H} = -\sum_{i=1}^{N-1} J_i S_i S_{i+1} . \]
Hint: Prove the recursion relation $Z_{N+1} = 2 Z_N \cosh(J_N/kT)$.
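The recursion relation in the hint can be tested against a brute-force sum over all spin configurations for short chains. This is a minimal sketch, assuming an open chain with $S_i = \pm 1$ and $Z_1 = 2$ for a single free spin; the function names are hypothetical and the check is not a substitute for the proof.

```python
import itertools
import math

def partition_function(couplings, kT=1.0):
    """Brute-force Z for an open Ising chain with bonds J_1 ... J_{N-1}.

    The chain has N = len(couplings) + 1 spins, each taking the values +1 or -1.
    """
    n_spins = len(couplings) + 1
    z = 0.0
    for spins in itertools.product((-1, 1), repeat=n_spins):
        energy = -sum(j * spins[i] * spins[i + 1] for i, j in enumerate(couplings))
        z += math.exp(-energy / kT)
    return z

def partition_function_recursive(couplings, kT=1.0):
    """Z built up bond by bond from Z_{N+1} = 2 Z_N cosh(J_N / kT)."""
    z = 2.0  # Z_1: one spin without any bond
    for j in couplings:
        z *= 2.0 * math.cosh(j / kT)
    return z

if __name__ == "__main__":
    bonds = [0.3, -1.2, 0.7, 2.0]
    print(partition_function(bonds, kT=1.5))
    print(partition_function_recursive(bonds, kT=1.5))
```

The two printed numbers should agree to rounding error for any choice of bond strengths, including mixed signs.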

6.10 (a) Calculate the two-spin correlation function $G_{i,n} := \langle S_i S_{i+n} \rangle$ for the one-dimensional Ising model in problem 6.9. Hint: The correlation function can be found by taking the appropriate derivatives of the partition function with respect to the interactions. Observe that $S_i^2 = 1$. Result: $G_{i,n} = \tanh^n(J/kT)$ for $J_i = J$.
(b) Determine the behavior of the correlation length defined by $G_{i,n} = e^{-n/\xi}$ for $T \to 0$.
(c) Calculate the susceptibility from the fluctuation-response theorem:
\[ \chi = \frac{(g\mu_B)^2}{kT} \sum_{i}^{N} \sum_{j}^{N} \langle S_i S_j \rangle . \]
Hint: Consider how many terms with $|i - j| = 0$, $|i - j| = 1$, $|i - j| = 2$ etc. occur in the double sum. Compute the geometric series which appear. Result:
\[ \chi = \frac{(g\mu_B)^2}{kT} \left\{ N \left( \frac{1 + \alpha}{1 - \alpha} \right) - \frac{2\alpha \left( 1 - \alpha^N \right)}{(1 - \alpha)^2} \right\} ; \qquad \alpha = \tanh\frac{J}{kT} . \]
(d) Show that in the thermodynamic limit ($N \to \infty$), $\chi \propto \xi$ for $T \to 0$, and thus $\gamma/\nu = 1$.
(e) Plot $\chi^{-1}$ in the thermodynamic limit as a function of temperature.
(f) How can one obtain from this the susceptibility of an antiferromagnetically coupled linear chain? Plot and discuss $\chi$ as a function of temperature.
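The closed-form result quoted in part (c) can be compared with the explicit double sum over $\langle S_i S_j \rangle = \alpha^{|i-j|}$. The sketch below works in units of $(g\mu_B)^2/kT$ and assumes the result of part (a) for uniform couplings; the function names are hypothetical.

```python
import math

def chi_sum(n_spins, alpha):
    """Double sum over alpha**|i-j|, i.e. chi in units of (g mu_B)^2 / kT."""
    return sum(alpha ** abs(i - j) for i in range(n_spins) for j in range(n_spins))

def chi_closed(n_spins, alpha):
    """Closed form N(1+alpha)/(1-alpha) - 2 alpha (1 - alpha^N)/(1-alpha)^2."""
    return (n_spins * (1.0 + alpha) / (1.0 - alpha)
            - 2.0 * alpha * (1.0 - alpha ** n_spins) / (1.0 - alpha) ** 2)

if __name__ == "__main__":
    alpha = math.tanh(0.8)  # J/kT = 0.8, an arbitrary illustrative value
    for n in (1, 2, 5, 10):
        print(n, chi_sum(n, alpha), chi_closed(n, alpha))
```

Negative values of `alpha` (antiferromagnetic coupling, part (f)) can be passed to both functions in the same way.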

6.11 Show that in the molecular-field approximation for the Ising model, the internal energy $E$ is given by
\[ E = \left( -\frac{1}{2} kT_c m^2 - hm \right) N \]
and the entropy $S$ by
\[ S = kN \left[ -\frac{T_c}{T} m^2 - \frac{1}{kT} hm + \log\left( 2 \cosh\frac{kT_c m + h}{kT} \right) \right] . \]
Inserting the equation of state, show also that
\[ S = -kN \left( \frac{1+m}{2} \log\frac{1+m}{2} + \frac{1-m}{2} \log\frac{1-m}{2} \right) . \]
Finally, expand $a(T, m) = e - Ts + mh$ up to the 4th power in $m$.

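6.12 An improvement of the molecular field theory for an Ising spin system can be obtained as follows (Bethe–Peierls approximation): the interaction of a spin $\sigma_0$ with its $z$ neighbors is treated exactly. The remaining interactions are taken into account by means of a molecular field $h'$, which acts only on the $z$ neighbors. The Hamiltonian is then given by:
\[ \mathcal{H} = -h' \sum_{j=1}^{z} \sigma_j - J \sum_{j=1}^{z} \sigma_0 \sigma_j - h \sigma_0 . \]
The applied field $h$ acts directly on the central spin and is likewise included in $h'$. $h'$ is determined self-consistently from the condition $\langle \sigma_0 \rangle = \langle \sigma_j \rangle$.
(a) Show that the partition function $Z(h', T)$ has the form
\[ Z = \left[ 2 \cosh\left( \frac{h'}{kT} + \frac{J}{kT} \right) \right]^z e^{h/kT} + \left[ 2 \cosh\left( \frac{h'}{kT} - \frac{J}{kT} \right) \right]^z e^{-h/kT} = Z_+ + Z_- . \]
(b) Calculate the average values $\langle \sigma_0 \rangle$ and $\langle \sigma_j \rangle$, for simplicity with $h = 0$. Result:
\[ \langle \sigma_0 \rangle = (Z_+ - Z_-)/Z , \]
\[ \langle \sigma_j \rangle = \frac{1}{z} \sum_{j=1}^{z} \langle \sigma_j \rangle = \frac{1}{z} \frac{\partial}{\partial \left( \frac{h'}{kT} \right)} \log Z = \frac{1}{Z} \left[ Z_+ \tanh\left( \frac{h'}{kT} + \frac{J}{kT} \right) + Z_- \tanh\left( \frac{h'}{kT} - \frac{J}{kT} \right) \right] . \]
(c) The equation $\langle \sigma_0 \rangle = \langle \sigma_j \rangle$ has a nonzero solution below $T_c$:
\[ \frac{h'}{kT (z - 1)} = \frac{1}{2} \log\frac{\cosh\left( \frac{J}{kT} + \frac{h'}{kT} \right)}{\cosh\left( \frac{J}{kT} - \frac{h'}{kT} \right)} . \]
Determine $T_c$ and $h'$ by expanding the equation in terms of $\frac{h'}{kT}$. Result:
\[ \tanh\frac{J}{kT_c} = 1/(z - 1) , \]
\[ \left( \frac{h'}{kT} \right)^2 = 3\, \frac{\cosh^3(J/kT)}{\sinh(J/kT)} \left\{ \tanh\frac{J}{kT} - \frac{1}{z - 1} + \ldots \right\} . \]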
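The quoted result for $T_c$ is easy to evaluate for different coordination numbers. The sketch below assumes the condition $\tanh(J/kT_c) = 1/(z-1)$ stated above; the comparison value $kT_c = zJ$ for simple molecular field theory is an assumption about the convention (one coupling $J$ per bond to each of $z$ neighbors), and the function names are hypothetical.

```python
import math

def bethe_peierls_tc(z, coupling=1.0):
    """k*Tc from tanh(J / kTc) = 1/(z - 1); returns 0.0 for z = 2 (no transition)."""
    if z < 2:
        raise ValueError("need at least z = 2 neighbors")
    if z == 2:
        return 0.0  # tanh(J/kTc) = 1 is only reached for Tc -> 0
    return coupling / math.atanh(1.0 / (z - 1))

def molecular_field_tc(z, coupling=1.0):
    """k*Tc = z*J of simple molecular field theory (assumed convention)."""
    return z * coupling

if __name__ == "__main__":
    for z in (2, 3, 4, 6, 8, 12):
        print(z, bethe_peierls_tc(z), molecular_field_tc(z))
```

Under these assumptions the Bethe–Peierls value lies below the molecular-field value for every $z$, and it vanishes for the linear chain ($z = 2$), in line with the absence of a finite-temperature transition found in problem 6.10.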

6.13 In the so-called Weiss model, each of the $N$ spins interacts equally strongly with every other spin:
\[ \mathcal{H} = -\frac{1}{2} J \sum_{l,l'} \sigma_l \sigma_{l'} - h \sum_l \sigma_l . \]
Here, $J = \hat{J}/N$. This model can be solved exactly; show that it yields the result of molecular field theory.

6.14 Magnons (= spin waves) in ferromagnets. The Heisenberg Hamiltonian, which gives a satisfactory description of certain ferromagnets, is given by
\[ \mathcal{H} = -\frac{1}{2} \sum_{l,l'} J(|\mathbf{x}_l - \mathbf{x}_{l'}|)\, \mathbf{S}_l \cdot \mathbf{S}_{l'} , \]
where $l$ and $l'$ are nearest neighbors on a cubic lattice. By applying the Holstein–Primakoff transformation,
\[ S_l^z = S - n_l , \qquad S_l^+ = \sqrt{2S}\, \varphi(n_l)\, a_l , \qquad S_l^- = \sqrt{2S}\, a_l^\dagger\, \varphi(n_l) , \]
($S_l^\pm = S_l^x \pm i S_l^y$) with $\varphi(n_l) = \sqrt{1 - n_l/2S}$, $n_l = a_l^\dagger a_l$ and $[a_l, a_{l'}^\dagger] = \delta_{ll'}$, as well as $[a_l, a_{l'}] = 0$, the spin operators are transformed into Bose operators.
(a) Show that the commutation relations for the spin operators are fulfilled.
(b) Represent the Heisenberg Hamiltonian up to second order (harmonic approximation) in the Bose operators $a_l$ by expanding the square roots in the above transformation in a Taylor series.
(c) Diagonalize $\mathcal{H}$ (by means of a Fourier transformation) and determine the magnon dispersion relations.
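Part (a) can be cross-checked numerically for a single site by writing the Holstein–Primakoff operators as small matrices. This is a sketch under stated assumptions: the boson-number basis is restricted to $n = 0, \ldots, 2S$ (the operators do not lead out of this subspace because $\varphi(2S) = 0$), the spin components are measured in units in which $[S^+, S^-] = 2S^z$ and $[S^z, S^+] = S^+$, and all function names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(a, b):
    """[a, b] = ab - ba for square matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def max_abs_diff(a, b):
    """Largest absolute difference between corresponding matrix elements."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def holstein_primakoff(spin):
    """Return (Sz, S+, S-) in the boson-number basis n = 0 .. 2S."""
    dim = int(round(2 * spin)) + 1
    # a|n> = sqrt(n)|n-1>, so the only nonzero elements are a[n-1][n] = sqrt(n)
    a = [[math.sqrt(col) if row == col - 1 else 0.0 for col in range(dim)] for row in range(dim)]
    a_dag = [[a[col][row] for col in range(dim)] for row in range(dim)]
    # phi(n) = sqrt(1 - n/2S) is diagonal and vanishes at n = 2S
    phi = [[math.sqrt(max(0.0, 1.0 - row / (2.0 * spin))) if row == col else 0.0
            for col in range(dim)] for row in range(dim)]
    sz = [[spin - row if row == col else 0.0 for col in range(dim)] for row in range(dim)]
    pref = math.sqrt(2.0 * spin)
    s_plus = [[pref * x for x in r] for r in matmul(phi, a)]
    s_minus = [[pref * x for x in r] for r in matmul(a_dag, phi)]
    return sz, s_plus, s_minus

if __name__ == "__main__":
    sz, sp, sm = holstein_primakoff(1.5)
    two_sz = [[2.0 * x for x in r] for r in sz]
    print(max_abs_diff(commutator(sp, sm), two_sz))
    print(max_abs_diff(commutator(sz, sp), sp))
```

Both printed deviations should vanish up to rounding error; the analytic proof asked for in (a) is of course more general than this single-site check.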

6.15 (a) Show that a magnon lowers the $z$-component of the total spin operator $S^z \equiv \sum_l S_l^z$ by $\hbar$.
(b) Calculate the temperature dependence of the magnetization.
(c) Show that in a one- and a two-dimensional spin lattice, there can be no ferromagnetic order at finite temperatures!


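6.16 Assume a Heisenberg model in an external field $\mathbf{H}$,
\[ \mathcal{H} = -\frac{1}{2} \sum_{l,l'} J(l - l')\, \mathbf{S}_l \cdot \mathbf{S}_{l'} - \boldsymbol{\mu} \cdot \mathbf{H} , \qquad \boldsymbol{\mu} = -\frac{g\mu_B}{\hbar} \sum_l \mathbf{S}_l . \]
Show that the isothermal susceptibilities $\chi_\parallel$ (parallel to $\mathbf{H}$) and $\chi_\perp$ (perpendicular to $\mathbf{H}$) are not negative.
Suggestions: Include an additional field $\Delta\mathbf{H}$ in the Hamiltonian and take the derivative with respect to this field. For $\chi_\parallel$, i.e. $\Delta\mathbf{H} \parallel \mathbf{H}$, the assertion follows as in Sect. 3.3 for the compressibility. For an arbitrarily oriented $\Delta\mathbf{H}$, it is expedient to use the expansion given in Appendix C.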

6.17 Denote the specific heat at constant magnetization by $c_M$, and at constant field by $c_H$. Show that relation (6.1.22c) holds for the isothermal and the adiabatic susceptibility. Volume changes of the magnetic material are to be neglected here.

6.18 A paramagnetic material obeys the Curie law
\[ M = c\, \frac{H}{T} , \]
where $c$ is a constant. Show, keeping in mind $T\,dS = dE - H\,dM$, that
\[ dT_{\mathrm{ad}} = \frac{c}{c_H} \frac{H}{T}\, dH \]
for an adiabatic change (keeping the volume constant). $c_H$ is the specific heat at constant magnetic field.

6.19 A paramagnetic substance obeys the Curie law $M = \frac{c}{T} H$ ($c$ const.) and its internal energy $E$ is given by $E = aT^4$ ($a > 0$, const.).
(a) What quantity of heat $\delta Q$ is released on isothermal magnetization if the magnetic field is increased from 0 to $H_1$?
(b) How does the temperature change if the field is now reduced adiabatically from $H_1$ to 0?

6.20 Prove the relationships between the shape-dependent and the shape-independent specific heat (6.6.11a), (6.6.11b) and (6.6.11c).

6.21 Polymers in a restricted geometry: Consider a polymer which is in a cone-shaped box (as shown). Why does the polymer move towards the larger opening? (no calculation necessary!)

7. Phase Transitions, Scale Invariance, Renormalization Group Theory, and Percolation

This chapter builds upon the results of the two preceding chapters dealing with the ferromagnetic phase transition and the gas-liquid transition. We start with some general considerations on symmetry breaking and phase transitions. Then a variety of phase transitions and critical points are discussed, and analogous behavior is pointed out. Subsequently, we deal in detail with critical behavior and give its phenomenological description in terms of static scaling theory. In the section that follows, we discuss the essential ideas of renormalization group theory on the basis of a simple model, and use it to derive the scaling laws. Finally, we introduce the Ginzburg–Landau theory; it provides an important cornerstone for the various approximation methods in the theory of critical phenomena.

The first, introductory section of this chapter exhibits the richness and variety of phase-transition phenomena and tries to convey the fascination of this field to the reader. It represents a departure from the main thrust of this book, since it offers only phenomenological descriptions without statistical, theoretical treatment. All of these manifold phenomena connected with phase transitions can be described by a single unified theory, the renormalization group theory, whose theoretical efficacy is so great that it is also fundamental to the quantum field theory of elementary particles.

7.1 Phase Transitions and Critical Phenomena

7.1.1 Symmetry Breaking, the Ehrenfest Classification

The fundamental laws of Nature governing the properties of matter (Maxwell's electrodynamics, the Schrödinger equation of a many-body system) exhibit a number of distinct symmetry properties. They are invariant with respect to spatial and temporal translations, and with respect to rotations and inversions. The states which exist in Nature do not, in general, display the full symmetry of the underlying natural principles. A solid is invariant only with respect to the discrete translations and rotations of its point group. Matter can furthermore exist in different states of aggregation or phases, which differ in their symmetry and as a result in their thermal, mechanical, and electromagnetic properties. The external conditions (pressure $P$, temperature $T$, magnetic field $H$, electric field $E$, ...) determine in which of the possible phases a chemical substance with particular internal interactions will present itself. If the external forces or the temperature are changed, at particular values of these quantities the system can undergo a transition from one phase to another: a phase transition takes place.

The Ehrenfest Classification: as is clear from the examples of phase transitions already treated, the free energy (or some other suitable thermodynamic potential) is a non-analytic function of a control parameter at the phase transition. The following classification of phase transitions, due to Ehrenfest, is commonly used: a phase transition of $n$-th order is defined by the property that at least one of the $n$-th derivatives of its thermodynamic potential is discontinuous, while all the lower derivatives are continuous at the transition. When one of the first derivatives shows a discontinuity, we speak of a first-order phase transition; when the first derivatives vary continuously but the second derivatives exhibit discontinuities or singularities, we speak of a second-order phase transition (or critical point), or of a continuous phase transition.

Understanding which phases a particular material adopts under particular conditions is certainly among the most interesting topics in the physics of condensed matter. Due to the differing properties of different phases, this question is also of importance for materials applications. Furthermore, the behavior of matter in the vicinity of phase transitions is also of fundamental interest. Here, we wish to indicate two aspects in particular: why is it that, despite the short range of the interactions, one observes long-range correlations of the fluctuations in the vicinity of a critical point $T_c$ and long-range order below $T_c$? And secondly, what is the influence of the internal symmetry of the order parameter? Fundamental questions of this type are of importance far beyond the field of condensed-matter physics. Renormalization group theory was originally developed in the framework of quantum field theory. In connection with critical phenomena, it was formulated by Wilson¹ in such a way that the underlying structure of nonlinear field theories became apparent, and that also allowed systematic and detailed calculations. This decisive breakthrough led not only to an enormous increase in the knowledge and deeper understanding of condensed matter, but also had important repercussions for the quantum-field theoretical applications of renormalization group theory in elementary particle physics.

¹ K. G. Wilson, Phys. Rev. B 4, 3174, 3184 (1971)

∗7.1.2 Examples of Phase Transitions and Analogies

We begin by describing the essential features of phase transitions, referring to Chaps. 5 and 6, where the analogy and the common features between

the liquid-gas transition and the ferromagnetic phase transition were already mentioned, and here we take up their analysis. In Fig. 7.1a,b, the phase diagrams of a liquid and a ferromagnet are shown. The two ferromagnetic ordering possibilities for an Ising ferromagnet (spin "up" and spin "down") correspond to the liquid and the gaseous phases. The critical point corresponds to the Curie temperature. As a result of the symmetry of the Hamiltonian for $H = 0$ with respect to the operation $\sigma_l \to -\sigma_l$ for all $l$, the phase boundary is situated symmetrically in the $H$-$T$ plane.

Fig. 7.1a,b. Phase diagrams of (a) a liquid ($P$-$T$) and (b) a ferromagnet ($H$-$T$). (Triple point = T.P., critical point = C.P.)

Ferromagnetic order is characterized by the order parameter $m$ at $H = 0$. It is zero above $T_c$ and $\pm m_0$ below $T_c$, as shown in the $M$-$T$ diagram in Fig. 7.1d. The corresponding quantity for the liquid can be seen in the $V$-$T$ diagram of Fig. 7.1c. Here, the order parameter is $(\rho_L - \rho_c)$ or $(\rho_G - \rho_c)$. In everyday life, we usually observe the liquid-gas transition at constant pressure far below $P_c$. On heating, the density changes discontinuously as a function of the temperature. Therefore, the vaporization transition is usually considered to be a first-order phase transition and the critical point is the end point of the vaporization curve, at which the difference between the gas and the liquid ceases to exist. The analogy between the gas-liquid and the ferromagnetic transitions becomes clearer if one investigates the liquid in a so-called Natterer tube². This is a sealed tube in which the substance thus has a fixed, given density. If one chooses the amount of material so that the density is equal to the critical density $\rho_c$, then above $T_c$ there is a fluid phase, while on cooling, this phase splits up into a denser liquid phase separated from the less dense gas phase by a meniscus. This corresponds to cooling a ferromagnet at $H = 0$. Above $T_c$, the disordered paramagnetic state is present, while below it, the sample splits up into (at least two) negatively and positively oriented ferromagnetic phases.³

Fig. 7.1c,d. The order parameter for (c) the gas-liquid transition (below, two Natterer tubes are illustrated), and for (d) the ferromagnetic transition

Fig. 7.1e,f shows the isotherms in the $P$-$V$ and $M$-$H$ diagrams. The similarity of the isotherms becomes clear if the second picture is rotated by 90°. In ferromagnets, the particular symmetry again expresses itself. Since the phase boundary curve in the $P$-$T$ diagram of the liquid is slanted, the horizontal sections of the isotherms in the $P$-$V$ diagram are not congruent. Finally, Fig. 7.1g,h illustrates the surface of the equation of state.

The behavior in the immediate vicinity of a critical point is characterized by power laws with critical exponents which are summarized for ferromagnets and liquids in Table 7.1. As in Chaps. 5 and 6, $\tau = \frac{T - T_c}{T_c}$. The critical exponents $\beta$, $\gamma$, $\delta$, $\alpha$ for the order parameter, the susceptibility, the critical isotherm, and the specific heat are the goal of theory and experiment. Additional analogies will be seen later in connection with the correlation functions and the scattering phenomena which follow from them.

² See the reference in Sect. 3.8.

³ In Ising systems there are two magnetization directions; in Heisenberg systems without an applied field, the magnetization can be oriented in any arbitrary direction, since the Hamiltonian (6.5.2) is rotationally invariant.

Fig. 7.1e,f. The isotherms (e) in the $P$-$V$ and (f) in the $M$-$H$ diagram

Fig. 7.1g,h. The surface of the equation of state for a liquid (g) and for a ferromagnet (h)

The general definition of the value of a critical exponent of a function $f(T - T_c)$ which is not a priori a pure power law is given by
\[ \text{exponent} = \lim_{T \to T_c} \frac{d \log f(T - T_c)}{d \log(T - T_c)} . \tag{7.1.1} \]
When $f$ has the form $f = a + (T - T_c)$, one finds:
\[ \frac{d \log(a + T - T_c)}{d \log(T - T_c)} = \frac{1}{a + (T - T_c)} \cdot \frac{1}{\frac{d \log(T - T_c)}{d(T - T_c)}} = \frac{T - T_c}{a + (T - T_c)} \longrightarrow 0 . \]
When $f$ is logarithmically divergent, the following expression holds:
\[ \frac{d \log\log(T - T_c)}{d \log(T - T_c)} = \frac{1}{\log(T - T_c)} \longrightarrow 0 . \]
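The logarithmic derivative in (7.1.1) can be estimated numerically, which makes the three cases easy to compare. The sketch below is illustrative only: `t` stands for $T - T_c > 0$, the derivative is approximated by a central difference in $\log t$, and the function names and sample amplitudes are hypothetical.

```python
import math

def local_exponent(f, t, rel_step=1e-4):
    """Central-difference estimate of d log|f| / d log t at distance t = T - Tc > 0."""
    up, down = t * (1.0 + rel_step), t / (1.0 + rel_step)
    numerator = math.log(abs(f(up))) - math.log(abs(f(down)))
    return numerator / (math.log(up) - math.log(down))

def power_law(t):
    """Pure power law with exponent 1/2 (arbitrary amplitude)."""
    return 3.0 * t ** 0.5

def constant_plus_linear(t):
    """f = a + (T - Tc) with a = 2: tends to a constant at Tc."""
    return 2.0 + t

def log_divergent(t):
    """Logarithmically divergent function."""
    return math.log(t)

if __name__ == "__main__":
    for t in (1e-2, 1e-4, 1e-6, 1e-8):
        print(t,
              local_exponent(power_law, t),
              local_exponent(constant_plus_linear, t),
              local_exponent(log_divergent, t))
```

The second column should stay at the power-law exponent, while the third and fourth drift towards zero as `t` decreases, the logarithmic case only very slowly, as $1/\log(T - T_c)$ suggests.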


In these two cases, the value of the critical exponent is zero. The first case occurs for the specific heat in the molecular field approximation, the second for the specific heat of the two-dimensional Ising model. The reason for introducing critical exponents even for such cases can be seen from the scaling laws which will be treated in the next section. To distinguish between the different meanings of the exponent zero (discontinuity and logarithm), one can write $0_d$ and $0_{\log}$.

Table 7.1. Ferromagnets and Liquids: Critical Exponents

| Quantity | Ferromagnet | Liquid | Critical behavior | Range |
|---|---|---|---|---|
| Order parameter | $M$ | $(V_{G,L} - V_c)$ or $(\rho_{G,L} - \rho_c)$ | $(-\tau)^{\beta}$ | $T < T_c$ |
| Isothermal susceptibility | Magnetic susceptibility $\chi_T = \left( \frac{\partial M}{\partial H} \right)_T$ | Isothermal compressibility $\kappa_T = -\frac{1}{V} \left( \frac{\partial V}{\partial P} \right)_T$ | $\propto \lvert\tau\rvert^{-\gamma}$ | $T \gtrless T_c$ |
| Critical isotherm ($T = T_c$) | $H = H(M)$ | $P = P(V - V_c)$ | $\sim M^{\delta}$, $\sim (V - V_c)^{\delta}$ | $T = T_c$ |
| Specific heat | $C_{M=0} = C_{H=0} = T \left( \frac{\partial S}{\partial T} \right)_H$ | $C_V = T \left( \frac{\partial S}{\partial T} \right)_V$ | $\propto \lvert\tau\rvert^{-\alpha}$ | $T \gtrless T_c$ |

We want to list just a few examples from among the multitude of phase transitions⁴. In the area of magnetic substances, one finds antiferromagnets (e.g. with two sublattices having opposite directions of magnetization $M_1$ and $M_2$), ferrimagnets, and helical phases. In an antiferromagnet with two sublattices, the order parameter is $N = M_1 - M_2$, the so-called staggered magnetization. In binary liquid mixtures, there are separation transitions, where the order parameter characterizes the concentration. In the case of structural phase transitions, the lattice structure changes at the transition, and the order parameter is given by the displacement field or the strain tensor. Examples are ferroelectrics⁵ and distortive transitions, where the order parameter is given by e.g. the electric polarization $P$ or the rotation angle $\varphi$ of a molecular group. Finally, there are transitions into macroscopic quantum states, i.e. superfluidity and superconductivity. Here, the order parameter is a complex field $\psi$, the macroscopic wavefunction, and the broken symmetry is the gauge invariance with respect to the phase of $\psi$. In the liquid-solid transition, the translational symmetry is broken and the order parameter is a component of the Fourier-transformed density. This transition line does not end in a critical point. Table 7.2 lists the order parameter and an example of a typical substance for some of these phase transitions.

⁴ We mention two review articles in which the literature up to 1966 is summarized: M. E. Fisher, The Theory of Equilibrium Critical Phenomena, p. 615; and P. Heller, Experimental Investigations of Critical Phenomena, p. 731, both in Reports on Progress in Physics XXX (1967).

⁵ In a number of structural phase transitions, the order parameter jumps discontinuously to a finite value at the transition temperature. In this case, according to Ehrenfest's classification, we are dealing with a first-order phase transition.

Table 7.2. Phase transitions (critical points), order parameters, and substances

| Phase transition | Order parameter | Substance |
|---|---|---|
| Paramagnet–ferromagnet (Curie temperature) | Magnetization $M$ | Fe |
| Paramagnet–antiferromagnet (Néel temperature) | Staggered magnetization $N = M_1 - M_2$ | RbMnF₃ |
| Gas-liquid (critical point) | Density $\rho - \rho_c$ | CO₂ |
| Separation of binary liquid mixtures | Concentration $c - c_c$ | Methanol–n-Hexane |
| Order–disorder transitions | Sublattice occupation $N_A - N_B$ | Cu-Zn |
| Paraelectric–ferroelectric | Polarization $P$ | BaTiO₃ |
| Distortive structural transitions | Rotation angle $\varphi$ | SrTiO₃ |
| Elastic phase transitions | Strain | KCN |
| He I–He II (lambda point) | Bose condensate $\Psi$ | ⁴He |
| Normal conductor–superconductor | Cooper-pair amplitude $\Delta$ | Nb₃Sn |

In general, the order parameter is understood to be a quantity which is zero above the critical point and finite below it, and which characterizes the structural or other changes which occur in the transition, such as the expectation value of lattice displacements or a component of the total magnetic moment. To clarify some concepts, we discuss at this point a generalized anisotropic, ferromagnetic Heisenberg model:
\[ \mathcal{H} = -\frac{1}{2} \sum_{l,l'} \left\{ J_\parallel(l - l')\, \sigma_l^z \sigma_{l'}^z + J_\perp(l - l') \left( \sigma_l^x \sigma_{l'}^x + \sigma_l^y \sigma_{l'}^y \right) \right\} - h \sum_l \sigma_l^z , \tag{7.1.2} \]
where $\boldsymbol{\sigma}_l = (\sigma_l^x, \sigma_l^y, \sigma_l^z)$ is the three-dimensional Pauli spin operator at lattice site $\mathbf{x}_l$ and $N$ is the number of lattice sites. This Hamiltonian contains the uniaxial ferromagnet for $J_\parallel(l - l') > J_\perp(l - l') \ge 0$, and for $J_\perp(l - l') = J_\parallel(l - l')$, it describes the isotropic Heisenberg model (6.5.2). In the former case, the


order parameter referred to the number of lattice sites ($h = 0$) is the single-component quantity $\frac{1}{N} \sum_l \langle \sigma_l^z \rangle$, i.e. the number of components $n$ is $n = 1$. In the latter case, the order parameter is $\frac{1}{N} \sum_l \langle \boldsymbol{\sigma}_l \rangle$, which can point in any arbitrary direction ($h = 0$!); here, the number of components is $n = 3$. For $J_\perp(l - l') > J_\parallel(l - l') \ge 0$, we find the so-called planar ferromagnet, in which the order parameter $\frac{1}{N} \sum_l \langle (\sigma_l^x, \sigma_l^y, 0) \rangle$ has two components, $n = 2$. A special case of the uniaxial ferromagnet is the Ising model (6.5.4), with $J_\perp(l - l') = 0$.

The uniaxial ferromagnet has the following symmetry elements: all rotations around the $z$-axis, the discrete symmetry $(\sigma_l^x, \sigma_l^y, \sigma_l^z) \to (\sigma_l^x, \sigma_l^y, -\sigma_l^z)$ and products thereof. Below $T_c$, the invariance with respect to this discrete symmetry is broken. In the planar ferromagnet, the (continuous) rotational symmetry around the $z$-axis, and in the case of the isotropic Heisenberg model, the O(3) symmetry, i.e. the rotational invariance around an arbitrary axis, is broken.

One could ask why e.g. for the Ising Hamiltonian without an external field, $\frac{1}{N} \sum_l \langle \sigma_l^z \rangle$ can ever be nonzero, since from the invariance operation $\{\sigma_l^z\} \to \{-\sigma_l^z\}$, it follows that $\frac{1}{N} \sum_l \langle \sigma_l^z \rangle = -\frac{1}{N} \sum_l \langle \sigma_l^z \rangle$. In a finite system, $\frac{1}{N} \sum_l \langle \sigma_l^z \rangle_h$ is analytic in $h$ for finite $h$, and
\[ \lim_{h \to 0} \frac{1}{N} \sum_l \langle \sigma_l^z \rangle_h = 0 . \tag{7.1.3} \]
For finite $N$, configurations with spins oriented opposite to the field also contribute to the partition function, and their weight increases with decreasing values of $h$. The mathematically precise definition of the order parameter is:
\[ \langle \sigma \rangle = \lim_{h \to 0} \lim_{N \to \infty} \frac{1}{N} \sum_l \langle \sigma_l^z \rangle_h ; \tag{7.1.4} \]
first, the thermodynamic limit $N \to \infty$ is taken, and then $h \to 0$. This quantity can be nonzero below $T_c$. For $N \to \infty$, states with the 'wrong' orientation have vanishing weights in the partition function for arbitrarily small but finite fields.

7.1.3 Universality

In the vicinity of critical points, the topology of the phase diagrams of such diverse systems as a gas-liquid mixture and a ferromagnetic material are astonishingly similar; see Fig. 7.1. Furthermore, experiments and computer simulations show that the critical exponents for the corresponding phase transitions for broad classes of physical systems are the same and depend only on the number of components and the symmetry of the order parameter, the spatial dimension and the character of the interactions, i.e. whether short-ranged or long-ranged (e.g. Coulomb, dipolar forces). This remarkable feature is termed universality. The microscopic details of these strongly interacting many-body


systems express themselves only in the prefactors (amplitudes) of the power laws, and even the ratios of these amplitudes are universal numbers.

The reason for this remarkable result lies in the divergence of the correlation length $\xi = \xi_0 \left( \frac{T - T_c}{T_c} \right)^{-\nu}$. On approaching $T_c$, $\xi$ becomes the only relevant length scale of the system, which at long distances dominates all of the microscopic scales. Although the phase transition is caused as a rule by short-range interactions of the microscopic constituents, due to the long-range fluctuations (see 6.12), the dependence on the microscopic details such as the lattice structure, the lattice constant, or the range of the interactions (as long as they are short-ranged) is secondary. In the critical region, the system behaves collectively, and only global features such as its spatial dimension and its symmetry play a role; this makes the universal behavior understandable.

The universality of critical phenomena is not limited to materials classes, but instead it extends beyond them. For example, the static critical behavior of the gas-liquid transition is the same as that of Ising ferromagnets. Planar ferromagnets behave just like ⁴He at the lambda point. Even without making use of renormalization group theory, these relationships can be understood with the aid of the following transformations⁶: the grand partition function of a gas can be approximately mapped onto that of a lattice gas which is equivalent to a magnetic Ising model (occupied/unoccupied cells correspond to spin up/down). The Hamiltonian of a Bose liquid can be mapped onto that of a planar ferromagnet. The gauge invariance of the Bose Hamiltonian corresponds to the two-dimensional rotational invariance of the planar ferromagnet.

⁶ See e.g. M. E. Fisher, op. cit., and problem 7.16.

7.2 The Static Scaling Hypothesis⁷

⁷ Although the so-called scaling theory of critical phenomena can be derived microscopically through renormalization group theory (see Sect. 7.3.4), it is expedient for the following reasons to first introduce it on a phenomenological basis: (i) as a motivation for the procedures of renormalization group theory; (ii) as an illustration of the structure of scaling considerations for physical situations where field-theoretical treatments based on renormalization group theory are not yet available (e.g. for many nonequilibrium phenomena). Scaling treatments, starting from critical phenomena and high-energy scaling in elementary particle physics, have acquired a great influence in the most diverse fields.

7.2.1 Thermodynamic Quantities and Critical Exponents

In this section, we discuss the analytic structure of the thermodynamic quantities in the vicinity of the critical point and draw from it typical conclusions about the critical exponents. This generally-applicable procedure will be demonstrated using the terminology of ferromagnetism. In the neighborhood of $T_c$, the equation of state according to Eq. (6.5.16) takes on the form
\[ \frac{h}{kT_c} = \tau m + \frac{1}{3} m^3 , \tag{7.2.1} \]
which can be rearranged as follows:
\[ \frac{1}{kT_c} \frac{h}{|\tau|^{3/2}} = \mathrm{sgn}(\tau)\, \frac{m}{|\tau|^{1/2}} + \frac{1}{3} \left( \frac{m}{|\tau|^{1/2}} \right)^3 . \]
Solving for $m$, we obtain the following dependence of $m$ on $\tau$ and $h$:
\[ m(\tau, h) = |\tau|^{1/2}\, m_\pm\!\left( \frac{h}{|\tau|^{3/2}} \right) \quad \text{for } T \gtrless T_c . \tag{7.2.2} \]
The functions $m_\pm$ for $T \gtrless T_c$ are determined by (7.2.1). In the vicinity of the critical point, the magnetization depends on $\tau$ and $h$ in a very special way: apart from the factor $|\tau|^{1/2}$, it depends only on the ratio $h/|\tau|^{3/2}$. The magnetization is a generalized homogeneous function of $\tau$ and $h$. This implies that (7.2.2) is invariant with respect to the scale transformation
\[ h \to h b^3 , \quad \tau \to \tau b^2 , \quad \text{and} \quad m \to m b . \]
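The scaling form (7.2.2) can be made concrete by solving the molecular-field equation of state (7.2.1) numerically and checking that the rescaled magnetization depends only on the ratio $h/|\tau|^{3/2}$. This is a sketch under stated assumptions: `h` is measured in units of $kT_c$, only $h > 0$ is treated, the root is found by plain bisection, and the function names are hypothetical.

```python
import math

def magnetization(tau, h):
    """Positive root m of h = tau*m + m**3/3 for h > 0 (h in units of k*Tc)."""
    def f(m):
        return tau * m + m ** 3 / 3.0 - h
    # For tau < 0 the physical root lies above m0 = sqrt(3|tau|), where f = -h < 0;
    # for tau >= 0 it lies above 0. On that side f increases monotonically.
    lo = math.sqrt(max(0.0, -3.0 * tau))
    hi = lo + 1.0
    while f(hi) < 0.0:
        hi *= 2.0
    for _ in range(200):  # bisection down to floating-point resolution
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def scaled_magnetization(tau, y):
    """m / |tau|**(1/2) evaluated at fixed y = h / |tau|**(3/2)."""
    h = y * abs(tau) ** 1.5
    return magnetization(tau, h) / math.sqrt(abs(tau))

if __name__ == "__main__":
    for tau in (1e-2, 1e-4, 1e-6, -1e-2, -1e-4, -1e-6):
        print(tau, [scaled_magnetization(tau, y) for y in (0.1, 1.0, 10.0)])
```

Rows with the same sign of `tau` should coincide: the printed values trace out the two branches $m_+(y)$ and $m_-(y)$ independently of how close `tau` is to zero.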

This scaling invariance of the physical properties expresses itself for example in the specific heat of ⁴He at the lambda point (Fig. 7.2).

Fig. 7.2. The specific heat at constant pressure, $c_P$, at the lambda transition of ⁴He. The shape of the specific heat stays the same on changing the temperature scale (1 K to $10^{-6}$ K)

We know from Chap. 6 and Table 7.1 that the real critical exponents differ from their molecular-field values in (7.2.2). It is therefore reasonable to extend the equation of state (7.2.2) to arbitrary critical exponents⁸:
\[ m(\tau, h) = |\tau|^{\beta}\, m_\pm\!\left( \frac{h}{|\tau|^{\delta\beta}} \right) ; \tag{7.2.3} \]
in this expression, $\beta$ and $\delta$ are critical exponents and the $m_\pm$ are called scaling functions. At the present stage, (7.2.3) remains a hypothesis; it is, however, possible to prove this hypothesis using renormalization group theory, as we shall demonstrate later in Sect. 7.3, for example in Eq. (7.3.40). For the present, we take (7.2.3) as given and ask what its general consequences are. The two scaling functions $m_\pm(y)$ must fulfill certain boundary conditions which follow from the critical properties listed in Eq. (6.5.31) and Table 7.1. The magnetization is always oriented parallel to $h$ when the applied field is nonzero and remains finite in the limit $h \to 0$ below $T_c$, while above $T_c$, it goes to zero:
\[ \lim_{y \to 0} m_-(y) = \mathrm{sgn}\, y , \qquad m_+(0) = 0 . \tag{7.2.4a} \]

⁸ In addition to being a natural generalization of molecular field theory, one can understand the scaling hypothesis (7.2.3) by starting with the fact that singularities are present only for $\tau = 0$ and $h = 0$. How strong the effects of the singularities will be depends on the distance from the critical point, $\tau$, and on $h/|\tau|^{\beta\delta}$, i.e. the ratio between the applied field and the field-equivalent of $\tau$; that is $h_\tau = m^{\delta} = |\tau|^{\beta\delta}$. As long as $h \ll h_\tau$, the system is effectively in the low-field limit and $m \approx |\tau|^{\beta} m_\pm(0)$. On the other hand, if $\tau$ becomes so small that $|\tau| \le h^{1/\beta\delta}$, then the influence of the applied field predominates. Any additional reduction of $\tau$ produces no further change: $m$ remains at the value which it had for $|\tau| = h^{1/\beta\delta}$, i.e. $h^{1/\delta} m_\pm(1)$. In the limit $\tau \to 0$, $m_\pm(y) \longrightarrow y^{1/\delta}$ must hold, so that the singular dependence on $\tau$ in $m(\tau, h)$ cancels out.

The thermodynamic functions are non-analytic precisely at $\tau = 0$, $h = 0$. For nonzero $h$, the magnetization is finite over the whole range of temperatures and remains an analytic function of $\tau$ even for $\tau = 0$; the $|\tau|^{\beta}$ dependence of (7.2.3) must be compensated by the function $m_\pm(h/|\tau|^{\delta\beta})$. Therefore, the two functions $m_\pm$ must behave as
\[ \lim_{y \to \infty} m_\pm(y) \propto y^{1/\delta} \tag{7.2.4b} \]
for large arguments. It follows from this that for $\tau = 0$, i.e. at the critical point, $m \sim h^{1/\delta}$. The scaling functions $m_\pm(y)$ are plotted in Fig. 7.3. Eq. (7.2.3), like the molecular-field version of the scaling law given above, requires that the magnetization must be a generalized homogeneous function of $\tau$ and $h$ and is therefore invariant with respect to scale transformations:
\[ h \to h b^{\beta\delta/\nu} , \quad \tau \to \tau b^{1/\nu} , \quad \text{and} \quad m \to m b^{\beta/\nu} . \]
The name scaling law is derived from this scale invariance. Equation (7.2.3) contains additional information about the thermodynamics; by integration,

we can determine the free energy and by taking suitable derivatives we can find the magnetic susceptibility and the specific heat. From these we obtain relations between the critical exponents.

Fig. 7.3. The qualitative behavior of the scaling functions $m_\pm$

For the susceptibility, we find the scaling law from Eq. (7.2.3):
\[ \chi \equiv \left( \frac{\partial m}{\partial h} \right)_T = |\tau|^{\beta - \delta\beta}\, m'_\pm\!\left( \frac{h}{|\tau|^{\delta\beta}} \right) , \tag{7.2.5} \]
and in the limit $h \to 0$, we thus have $\chi \propto |\tau|^{\beta - \delta\beta}$. It then follows that the critical exponent of the susceptibility, $\gamma$ (Eq. (6.5.31c)), is given by
\[ \gamma = -\beta(1 - \delta) . \tag{7.2.6} \]
The specific free energy is found through integration of (7.2.3):
\[ f - f_0 = -\int_{h_0}^{h} dh\, m(\tau, h) = -|\tau|^{\beta + \delta\beta} \int_{h_0/|\tau|^{\delta\beta}}^{h/|\tau|^{\delta\beta}} dx\, m_\pm(x) . \]
Here, $h_0$ must be sufficiently large so that the starting point for the integration lies outside the critical region. The free energy then takes on the following form:
\[ f(\tau, h) = |\tau|^{\beta + \delta\beta}\, \hat{f}_\pm\!\left( \frac{h}{|\tau|^{\beta\delta}} \right) + f_{\mathrm{reg}} . \tag{7.2.7} \]
In this expression, $\hat{f}$ is defined by the value resulting from the upper limit of the integral and $f_{\mathrm{reg}}$ is the non-singular part of the free energy. The specific heat at constant magnetic field is obtained by taking the second derivative of (7.2.7),
\[ c_h = -\frac{\partial^2 f}{\partial \tau^2} \sim A_\pm |\tau|^{\beta(1 + \delta) - 2} + B_\pm . \tag{7.2.8} \]
The $A_\pm$ in this expression are amplitudes and the $B_\pm$ come from the regular part. Comparison with the behavior of the specific heat as characterized by the critical exponent $\alpha$ (Eq. (6.5.31d)) yields

7.2 The Static Scaling Hypothesis

$$\alpha = 2 - \beta(1+\delta). \tag{7.2.9}$$

The relations between the critical exponents are termed scaling relations, since they follow from the scaling laws for the thermodynamic quantities. If we add (7.2.6) and (7.2.9), we obtain

$$\gamma + 2\beta = 2 - \alpha. \tag{7.2.10}$$

From (7.2.6) and (7.2.9), one can see that the remaining thermodynamic critical exponents are determined by $\beta$ and $\delta$.

7.2.2 The Scaling Hypothesis for the Correlation Function

In the molecular field approximation, we obtained the Ornstein–Zernike behavior in Eqns. (6.5.50) and (6.5.53′) for the wavevector-dependent susceptibility $\chi(q)$ and the correlation function $G(x)$:

$$\chi(q) = \frac{1}{\tilde J q^2}\,\frac{(q\xi)^2}{1+(q\xi)^2}, \qquad G(x) = \frac{kT_c v}{4\pi\tilde J}\,\frac{e^{-|x|/\xi}}{|x|} \qquad \text{with} \quad \xi = \xi_0\,\tau^{-1/2}. \tag{7.2.11}$$

The generalization of this law is ($q \ll a^{-1}$, $|x| \gg a$, $\xi \gg a$ with the lattice constant $a$):

$$\chi(q) = \frac{1}{q^{2-\eta}}\,\hat\chi(q\xi), \qquad G(x) = \frac{1}{|x|^{1+\eta}}\,\hat G(|x|/\xi), \qquad \xi = \xi_0\,\tau^{-\nu}, \tag{7.2.12a,b,c}$$

where the functions $\hat\chi(q\xi)$ and $\hat G(|x|/\xi)$ are still to be determined. In (7.2.12c), we assumed that the correlation length $\xi$ diverges at the critical point. This divergence is characterized by the critical exponent $\nu$. Just at $T_c$, $\xi = \infty$ and therefore there is no longer any finite characteristic length; the correlation function $G(x)$ can thus only fall off according to a power law $G(x) \sim \frac{1}{|x|^{1+\eta}}\hat G(0)$. The possibility of deviations from the $1/|x|$-behavior of the Ornstein–Zernike theory was taken into account by introducing the additional critical exponent $\eta$. In the immediate vicinity of $T_c$, $\xi$ is the only relevant length and therefore the correlation function also contains the factor $\hat G(|x|/\xi)$. Fourier transformation of $G(x)$ yields (7.2.12a) for the wavevector-dependent susceptibility, which for its part represents an evident generalization of the Ornstein–Zernike expression. We recall (from Sects. 5.4.4 and 6.5.5.2) that the increase of $\chi(q)$ for small $q$ on approaching $T_c$ leads to critical opalescence.

In (7.2.11) and (7.2.12b), a three-dimensional system was assumed. Phase transitions are of course highly interesting also in two dimensions, and furthermore it has proved fruitful in the theory of phase transitions to consider arbitrary dimensions (even non-integral dimensions). We therefore generalize the relations to arbitrary dimensions $d$:

$$G(x) = \frac{1}{|x|^{d-2+\eta}}\,\hat G(|x|/\xi). \tag{7.2.12b$'$}$$

Equations (7.2.12a) and (7.2.12c) remain valid also in $d$ dimensions, whereby of course the exponents $\nu$ and $\eta$ and the form of the functions $\hat G$ and $\hat\chi$ depend on the spatial dimension. From (7.2.12a) and (7.2.12b′) at the critical point we obtain

$$G(x) \propto \frac{1}{|x|^{d-2+\eta}} \qquad \text{and} \qquad \chi \propto \frac{1}{q^{2-\eta}} \qquad \text{for} \quad T = T_c. \tag{7.2.13}$$

Here, we have assumed that $\hat G(0)$ and $\hat\chi(\infty)$ are finite, which follows from the finite values of $G(x)$ at finite distances and of $\chi(q)$ at finite wavenumbers (and $\xi = \infty$). We now consider the limiting case $q \to 0$ for temperatures $T \neq T_c$. Then we find from (7.2.12a)

$$\chi = \lim_{q\to 0}\chi(q) \propto \frac{(q\xi)^{2-\eta}}{q^{2-\eta}} = \xi^{2-\eta}. \tag{7.2.14}$$

This dependence is obtained on the basis of the following arguments: for finite $\xi$, the susceptibility remains finite even in the limit $q \to 0$. Therefore, the factor $\frac{1}{q^{2-\eta}}$ in (7.2.12a) must be compensated by a corresponding dependence of $\hat\chi(q\xi)$, from which the relation (7.2.14) follows for the homogeneous susceptibility. Since its divergence is characterized by the critical exponent $\gamma$ according to (6.5.31c), it follows from (7.2.14) together with (7.2.12c) that there is an additional scaling relation

$$\gamma = \nu(2-\eta). \tag{7.2.15}$$

Relations of the type (7.2.3), (7.2.7), and (7.2.12b′) are called scaling laws, since they are invariant under the following scale transformations:

$$x \to x/b, \quad \xi \to \xi/b, \quad \tau \to \tau\,b^{1/\nu}, \quad h \to h\,b^{\beta\delta/\nu}, \quad m \to m\,b^{\beta/\nu}, \quad f_s \to f_s\,b^{(2-\alpha)/\nu}, \quad G \to G\,b^{d-2+\eta}, \tag{7.2.16}$$

where $f_s$ stands for the singular part of the (specific) free energy. If we in addition assume that these scale transformations are based on a microscopic elimination procedure by means of which the original system with lattice constant $a$ and $N$ lattice sites is mapped onto a new system with the same lattice constant $a$ but a reduced number $N b^{-d}$ of degrees of freedom, then we find

$$\frac{F_s(\tau,h)}{N} = b^{-d}\,\frac{F_s(\tau b^{1/\nu}, h b^{\beta\delta/\nu})}{N b^{-d}}, \tag{7.2.17}$$

which implies the hyperscaling relation

$$2 - \alpha = d\nu, \tag{7.2.18}$$

which also contains the spatial dimension $d$. According to equations (7.2.6), (7.2.9), (7.2.15), and (7.2.18), all the critical exponents are determined by two independent ones. For the two-dimensional Ising model one finds the exponents of the correlation function, $\nu = 1$ and $\eta = 1/4$, from the exponents quoted following Eq. (6.5.31d) and the scaling relations (7.2.15) and (7.2.18).
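The chain of scaling relations can be checked with exact rational arithmetic. The following is a minimal sketch, assuming that the thermodynamic exponents quoted for the two-dimensional Ising model are $\beta = 1/8$ and $\delta = 15$ (the well-known exact values; they are not restated in this section); the function name and the dictionary layout are illustrative choices.

```python
from fractions import Fraction as F


def exponents_from_beta_delta(beta, delta, d):
    """Derive alpha, gamma, nu, eta from beta and delta in d dimensions."""
    gamma = beta * (delta - 1)       # (7.2.6): gamma = -beta (1 - delta)
    alpha = 2 - beta * (1 + delta)   # (7.2.9)
    nu = (2 - alpha) / d             # hyperscaling relation (7.2.18)
    eta = 2 - gamma / nu             # (7.2.15): gamma = nu (2 - eta)
    return {"alpha": alpha, "gamma": gamma, "nu": nu, "eta": eta}


# Assumed input: exact 2D Ising values beta = 1/8, delta = 15.
beta_2d, delta_2d = F(1, 8), F(15)
ising_2d = exponents_from_beta_delta(beta_2d, delta_2d, d=2)

# Consistency check of (7.2.10): gamma + 2 beta = 2 - alpha.
lhs = ising_2d["gamma"] + 2 * beta_2d
rhs = 2 - ising_2d["alpha"]
print(ising_2d, lhs == rhs)
```

With these inputs the sketch reproduces $\nu = 1$ and $\eta = 1/4$ as stated above; any other pair $(\beta, \delta)$ can be passed in the same way.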

7.3 The Renormalization Group

7.3.1 Introductory Remarks

The term ‘renormalization’ of a theory refers to a certain reparametrization with the goal of making the renormalized theory more easily dealt with than the original version. Historically, renormalization was developed by Stückelberg and Feynman in order to remove the divergences from quantum field theories such as quantum electrodynamics. Instead of the bare parameters (masses, coupling constants), the Lagrange function is expressed in terms of physical masses and coupling coefficients, so that ultraviolet divergences due to virtual transitions occur only within the connection between the bare and the physical quantities, leaving the renormalized theory finite. The renormalization procedure is not unique; the renormalized quantities can for example depend upon a cutoff length scale, up to which certain virtual processes are taken into account. Renormalization group theory studies the dependence on this length scale, which is also called the “flow parameter”. The name “renormalization group” comes from the fact that two consecutive renormalization group transformations lead to a third such transformation.

In the field of critical phenomena, where one must explain the observed behavior at large distances (or in Fourier space at small wavenumbers), it is reasonable to carry out the renormalization procedure by a suitable elimination of the short-wavelength fluctuations. A partial evaluation of the partition function in this manner is easier to carry out than the calculation of the complete partition function, and can be done using approximation methods. As a result of the elimination step, the remaining degrees of freedom are subject to modified, effective interactions. Quite generally, one can expect the following advantages from such a renormalization group transformation:

(i) The new coupling constants could be smaller. By repeated applications of the renormalization procedure, one could thus finally obtain a practically free theory, without interactions.
(ii) The successively iterated coupling coefficients, also called “parameter flow”, could have a fixed point, at which the system no longer changes under additional renormalization group transformations. Since the elimination of degrees of freedom is accompanied by a change of the underlying lattice spacing, or length scale, one can anticipate that the fixed points are under certain circumstances related to critical points. Furthermore, it can be hoped that the flow in the vicinity of these fixed points can yield information about the universal physical quantities in the neighborhood of the critical points.

The scenario described under (i) will in fact be found for the one-dimensional Ising model, and that described under (ii) for the two-dimensional Ising model. The renormalization group method brings to bear the scale invariance in the neighborhood of a critical point. In the case of so-called real-space transformations (in contrast to transformation in Fourier space), one eliminates certain degrees of freedom which are defined on a lattice, and thus carries out a partial trace operation on the partition function. The lattice constant of the resulting system is then readjusted and the internal variables are renormalized in such a manner that the new Hamiltonian corresponds to the original one in its form. By comparison, one defines effective, scale-independent coupling constants, whose flow behavior is then investigated. We first study the one-dimensional Ising model and then the two-dimensional. Finally, the general structure of such transformations will be discussed with the derivation of scaling laws. A brief schematic treatment of continuous field-theoretical formulations will be undertaken following the Ginzburg–Landau theory.

7.3.2 The One-Dimensional Ising Model, Decimation Transformation

We will first illustrate the renormalization group method using the one-dimensional Ising model, with the ferromagnetic exchange constant $J$ in zero applied field, as an example. The Hamiltonian is

$$H = -J\sum_l \sigma_l\sigma_{l+1}, \tag{7.3.1}$$

where $l$ runs over all the sites in the one-dimensional chain; see Fig. 7.4. We introduce the abbreviation $K = J/kT$ into the partition function for $N$ spins with periodic boundary conditions $\sigma_{N+1} = \sigma_1$,

$$Z_N = {\rm Tr}\, e^{-H/kT} = \sum_{\{\sigma_l = \pm 1\}} e^{K\sum_l \sigma_l\sigma_{l+1}}. \tag{7.3.2}$$

The decimation procedure consists in partially evaluating the partition function, by carrying out the sum over every second spin in the first step. In Fig. 7.4, the lattice sites for which the trace is taken are marked with a cross.

Fig. 7.4. An Ising chain; the trace is carried out over all the lattice points which are marked with a cross. The result is a lattice with its lattice constant doubled

A typical term in the partition function is then

$$\sum_{\sigma_l = \pm 1} e^{K\sigma_l(\sigma_{l-1}+\sigma_{l+1})} = 2\cosh K(\sigma_{l-1}+\sigma_{l+1}) = e^{2g + K'\sigma_{l-1}\sigma_{l+1}}, \tag{7.3.3}$$

with coefficients $g$ and $K'$ which are still to be determined. Here, we have taken the sum over $\sigma_l = \pm 1$ after the first equals sign. Since $\cosh K(\sigma_{l-1}+\sigma_{l+1})$ depends only on whether $\sigma_{l-1}$ and $\sigma_{l+1}$ are parallel or antiparallel, the result can in any case be brought into the form given after the second equals sign. The coefficients $g$ and $K'$ can be determined either by expansion of the exponential function or, still more simply, by comparing the two expressions for the possible orientations. If $\sigma_{l-1} = -\sigma_{l+1}$, we find

$$2 = e^{2g - K'}, \tag{7.3.4a}$$

and if $\sigma_{l-1} = \sigma_{l+1}$, the result is

$$2\cosh 2K = e^{2g + K'}. \tag{7.3.4b}$$

From the product of (7.3.4a) and (7.3.4b) we obtain $4\cosh 2K = e^{4g}$, and from the quotient, $\cosh 2K = e^{2K'}$; thus the recursion relations are:

$$K' = \frac{1}{2}\log\cosh 2K \tag{7.3.5a}$$

$$g = \frac{1}{2}\left(\log 2 + K'\right). \tag{7.3.5b}$$
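The decimation identity (7.3.3) together with the recursion relations (7.3.5a,b) can be verified by brute force over the four orientations of the two neighboring spins. The sketch below is only an illustration; the function names and the test coupling are arbitrary choices.

```python
import math
from itertools import product


def decimate_1d(K):
    """Return (K', g) for one decimation step of the Ising chain."""
    K_new = 0.5 * math.log(math.cosh(2.0 * K))   # (7.3.5a)
    g = 0.5 * (math.log(2.0) + K_new)            # (7.3.5b)
    return K_new, g


def max_identity_error(K):
    """Largest deviation between both sides of (7.3.3) over all neighbor states."""
    K_new, g = decimate_1d(K)
    err = 0.0
    for s_left, s_right in product((-1, 1), repeat=2):
        # Left side: explicit trace over the eliminated spin.
        lhs = sum(math.exp(K * s * (s_left + s_right)) for s in (-1, 1))
        # Right side: renormalized two-spin Boltzmann factor.
        rhs = math.exp(2.0 * g + K_new * s_left * s_right)
        err = max(err, abs(lhs - rhs))
    return err


print(decimate_1d(0.7), max_identity_error(0.7))
```

The deviation should vanish up to floating-point rounding for any finite $K$, and the new coupling $K'$ comes out smaller than $K$.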

Repeating this decimation procedure a total of $k$ times, we obtain from (7.3.5a,b) for the $k$th step the following recursion relation:

$$K^{(k)} = \frac{1}{2}\log\cosh 2K^{(k-1)} \tag{7.3.6a}$$

$$g(K^{(k)}) = \frac{1}{2}\log 2 + \frac{1}{2}K^{(k)}. \tag{7.3.6b}$$

The decimation produces another Ising model with an interaction between nearest neighbors having a coupling constant $K^{(k)}$. Furthermore, a spin-independent contribution $g(K^{(k)})$ to the energy is generated; in the $k$th step, it is given by (7.3.6b).

In a transformation of this type, it is expedient to determine the fixed points which in the present context will prove to be physically relevant. Fixed points are those points $K^*$ which are invariant with respect to the transformation, i.e. here $K^* = \frac{1}{2}\log(\cosh 2K^*)$. This equation has two solutions,

$$K^* = 0 \quad (T = \infty) \qquad \text{and} \qquad K^* = \infty \quad (T = 0). \tag{7.3.7}$$

The recursion relation (7.3.6a) is plotted in Fig. 7.5. Starting with the initial value $K_0$, one obtains $K'(K_0)$, and by a reflection in the line $K' = K$, $K'(K'(K_0))$, and so forth. One can see that the coupling constant decreases continually; the system moves towards the fixed point $K^* = 0$, i.e. a noninteracting system. Therefore, for a finite $K_0$, we never arrive at an ordered state: there is no phase transition. Only for $K = \infty$, i.e. for a finite exchange interaction $J$ and $T = 0$, do the spins order.
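The flow towards $K^* = 0$ can be followed numerically by iterating (7.3.6a). This is a small sketch; the starting value $K_0 = 2$ and the number of steps are arbitrary assumptions made for illustration.

```python
import math


def flow_1d(K0, steps):
    """Iterate K -> (1/2) log cosh 2K and return the list of couplings."""
    Ks = [K0]
    for _ in range(steps):
        Ks.append(0.5 * math.log(math.cosh(2.0 * Ks[-1])))
    return Ks


trajectory = flow_1d(2.0, 12)
for k, K in enumerate(trajectory):
    print(k, K)
```

For large couplings each step lowers $K$ by roughly $\frac{1}{2}\log 2$, and once $K$ is small the recursion behaves like $K' \approx K^2$, so the approach to the noninteracting fixed point becomes very fast.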

Fig. 7.5. The recursion relation for the one-dimensional Ising model with interactions between nearest neighbors (heavy solid curve), the line $K' = K$ (dashed), and the iteration steps (thin lines with arrows)

Making use of this renormalization group (RG) transformation, we can calculate the partition function and the free energy. The partition function for all together $N$ spins with the coupling constant $K$, using (7.3.3), is

$$Z_N(K) = e^{N g(K')}\, Z_{N/2}(K') = e^{N g(K') + \frac{N}{2} g(K'')}\, Z_{N/2^2}(K''), \tag{7.3.8}$$

and, after the $n$th step,

$$Z_N(K) = \exp\left[ N\sum_{k=1}^{n} \frac{1}{2^{k-1}}\, g\big(K^{(k)}\big) + \log Z_{N/2^n}\big(K^{(n)}\big) \right]. \tag{7.3.9}$$

The reduced free energy per lattice site and $kT$ is defined by

$$\tilde f = -\frac{1}{N}\log Z_N(K). \tag{7.3.10}$$


As we have seen, the interactions become weaker as a result of the renormalization group transformation, which gives rise to the following possible application: after several steps the interactions have become so weak that perturbation-theory methods can be used, or the interaction can be altogether neglected. Setting $K^{(n)} \approx 0$, from (7.3.9) we obtain the approximation:

$$\tilde f^{(n)}(K) = -\sum_{k=1}^{n} \frac{1}{2^{k-1}}\, g\big(K^{(k)}\big) - \frac{1}{2^n}\log 2, \tag{7.3.11}$$

since the free energy per spin of a field-free spin-1/2 system without interactions is $-\log 2$. Fig. 7.6 shows $\tilde f^{(n)}(K)$ for $n = 1$ to 5. We can see how quickly this approximate solution approaches the exact reduced free energy $\tilde f(K) = -\log(2\cosh K)$. The one-dimensional Ising model can be exactly solved by elementary methods (see problem 6.9), as well as by using the transfer matrix method, cf. Appendix F.
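The convergence of the approximations (7.3.11) towards the exact result can be reproduced in a few lines. The sketch below assumes the coupling $K = 1$ purely as an example; the function names are illustrative.

```python
import math


def reduced_free_energy_approx(K, n):
    """Approximation (7.3.11): n decimation steps, then K^(n) is set to zero."""
    total = 0.0
    K_k = K
    for k in range(1, n + 1):
        K_k = 0.5 * math.log(math.cosh(2.0 * K_k))   # (7.3.6a)
        g_k = 0.5 * math.log(2.0) + 0.5 * K_k        # (7.3.6b)
        total += g_k / 2 ** (k - 1)
    return -total - math.log(2.0) / 2 ** n


def reduced_free_energy_exact(K):
    """Exact reduced free energy of the Ising chain, -log(2 cosh K)."""
    return -math.log(2.0 * math.cosh(K))


K = 1.0
errors = [
    abs(reduced_free_energy_approx(K, n) - reduced_free_energy_exact(K))
    for n in range(1, 6)
]
print(errors)
```

The remaining error after $n$ steps equals $2^{-n}\log\cosh K^{(n)}$, which explains why it drops so quickly once $K^{(n)}$ has become small.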

Fig. 7.6. The reduced free energy of the one-dimensional Ising model. $\tilde f$ is the exact free energy, $\tilde f^{(1)}$, $\tilde f^{(2)}$, . . . are the approximations (7.3.11)

7.3.3 The Two-Dimensional Ising Model

The application of the decimation procedure to the two-dimensional Ising model is still more interesting, since this model exhibits a phase transition at a finite temperature $T_c > 0$. We consider the square lattice rotated by 45° which is illustrated in Fig. 7.7, with a lattice constant of one. The Hamiltonian multiplied by $\beta$, $\mathcal{H} = \beta H$, is

$$\mathcal{H} = -\sum_{\rm n.n.} K\sigma_i\sigma_j, \tag{7.3.12}$$

Fig. 7.7. A square spin lattice, rotated by 45°. The lattice sites are indicated by points. In the decimation transformation, the spins at the sites which are also marked by a cross are eliminated. $K$ is the interaction between nearest neighbors and $L$ is the interaction between next-nearest neighbors

where the sum runs over all pairs of nearest neighbors (n.n.) and $K = J/kT$. When in the partial evaluation of the partition function the trace is taken over the spins marked by crosses, we obtain a new square lattice of lattice constant $\sqrt{2}$. How do the coupling constants transform? We pick out one of the spins with a cross, $\sigma$, denote its neighbors as $\sigma_1$, $\sigma_2$, $\sigma_3$, and $\sigma_4$, and evaluate their contribution to the partition function:

$$\sum_{\sigma=\pm 1} e^{K(\sigma_1+\sigma_2+\sigma_3+\sigma_4)\sigma} = e^{\log(2\cosh K(\sigma_1+\sigma_2+\sigma_3+\sigma_4))} = e^{A' + \frac{1}{2}K'(\sigma_1\sigma_2 + \ldots + \sigma_3\sigma_4) + L'(\sigma_1\sigma_3+\sigma_2\sigma_4) + M'\sigma_1\sigma_2\sigma_3\sigma_4}. \tag{7.3.13}$$

This transformation (taking a partial trace) yields a modified interaction between nearest neighbors, $K'$ (here, the elimination of two crossed spins contributes); in addition, new interactions between the next-nearest neighbors (such as $\sigma_1$ and $\sigma_3$) and a four-spin interaction are generated:

$$\mathcal{H}' = A' + K'\sum_{\rm n.n.}\sigma_i\sigma_j + L'\sum_{\rm n.n.n.}\sigma_i\sigma_j + \ldots \tag{7.3.12$'$}$$

The coefficients $A'$, $K'$, $L'$ and $M'$ can readily be found from (7.3.13) as functions of $K$, by using $\sigma_i^2 = 1$, $i = 1, \ldots, 4$ (see problem 7.2):

$$A'(K) = \log 2 + \frac{1}{8}\left\{\log\cosh 4K + 4\log\cosh 2K\right\}, \tag{7.3.14}$$

$$K'(K) = \frac{1}{4}\log\cosh 4K, \qquad L'(K) = \frac{1}{2}K'(K), \tag{7.3.13$'$}$$

$$M'(K) = \frac{1}{8}\left\{\log\cosh 4K - 4\log\cosh 2K\right\}.$$
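The coefficients can be checked against (7.3.13) by enumerating all 16 configurations of the four neighbors. The sketch below assumes that $\sigma_1, \ldots, \sigma_4$ are numbered consecutively around the eliminated spin, so that the nearest-neighbor pairs of the new lattice are (1,2), (2,3), (3,4), (4,1) and the next-nearest-neighbor pairs are (1,3), (2,4); this labeling and the function names are assumptions made for illustration.

```python
import math
from itertools import product


def decimation_coefficients_2d(K):
    """Return (A', K', L', M') as functions of the original coupling K."""
    lc4 = math.log(math.cosh(4.0 * K))
    lc2 = math.log(math.cosh(2.0 * K))
    A = math.log(2.0) + (lc4 + 4.0 * lc2) / 8.0
    K_new = lc4 / 4.0
    L_new = K_new / 2.0
    M_new = (lc4 - 4.0 * lc2) / 8.0
    return A, K_new, L_new, M_new


def max_exponent_mismatch(K):
    """Largest difference between the exponents on both sides of (7.3.13)."""
    A, K_new, L_new, M_new = decimation_coefficients_2d(K)
    worst = 0.0
    for s1, s2, s3, s4 in product((-1, 1), repeat=4):
        lhs = math.log(2.0 * math.cosh(K * (s1 + s2 + s3 + s4)))
        nn = s1 * s2 + s2 * s3 + s3 * s4 + s4 * s1   # assumed ring labeling
        nnn = s1 * s3 + s2 * s4
        rhs = A + 0.5 * K_new * nn + L_new * nnn + M_new * s1 * s2 * s3 * s4
        worst = max(worst, abs(lhs - rhs))
    return worst


coeffs = decimation_coefficients_2d(0.4406)
print(coeffs, max_exponent_mismatch(0.4406))
```

Evaluated at $K = 0.4406$, the four-spin coupling comes out much smaller in magnitude than the two pair couplings, which is the basis for the truncation used in the following.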

Putting the critical value $K_c = J/kT_c = 0.4406$ (exact result$^{9}$) into this relation as an estimate for the initial value $K$, we find $M' \ll L' \le K'$. In the first elimination step, the original Ising model is transformed into one with three interactions; in the next step we must take these into account and obtain still more interactions, and so on. In a quantitatively usable calculation it will thus be necessary to determine the recursion relations for an extended number of coupling constants. Here, we wish only to determine the essential structure of such recursion relations and to simplify them sufficiently so that an analytic solution can be found. Therefore, we neglect the coupling constant $M'$ and all the others which are generated by the elimination procedure, and restrict ourselves to $K'$ and $L'$ as well as their initial values $K$ and $L$. This is suggested by the smallness of $M'$ which we mentioned above. We now require the recursion relation including the coupling constant $L$, which acts between $\sigma_1$ and $\sigma_4$, etc. Thus, expanding (7.3.13′) up to second order in $K$ and taking note of the fact that an interaction $L$ between next-nearest neighbors in the original Hamiltonian appears as a contribution to the interactions of the nearest neighbors in the primed Hamiltonian, we find the following recursion relations on elimination of the crossed spins (Fig. 7.7):

$$K' = 2K^2 + L \tag{7.3.15a}$$

$$L' = K^2. \tag{7.3.15b}$$

9 The partition function of the Ising model on a square lattice without an external field was evaluated exactly by L. Onsager, Phys. Rev. 65, 117 (1944), using the transfer matrix method (see Appendix F).

These relations can be arrived at intuitively as follows: the spin $\sigma$ mediates an interaction of the order of $K$ times $K$, i.e. $K^2$, between $\sigma_1$ and $\sigma_3$, likewise the crossed spin just to the left of $\sigma$. This leads to $2K^2$ in $K'$. The interaction $L$ between next-nearest neighbors in the original model makes a direct contribution to $K'$. Spin $\sigma$ also mediates a diagonal interaction between $\sigma_1$ and $\sigma_4$, leading thus to the relation $L' = K^2$ in (7.3.15b). However, it should be clear that in contrast to the one-dimensional case, new coupling constants are generated in every elimination step. One cannot expect that these recursion relations, which have been restricted as an approximation to a reduced parameter space $(K, L)$, will yield quantitatively accurate results. They do contain all the typical features of this type of recursion relations. In Fig. 7.8, we have shown the recursion relations (7.3.15a,b)$^{10}$. Starting from values $(K, 0)$, the recursion relation is repeatedly applied, likewise for initial values $(0, L)$. The following picture emerges: for small initial values, the flow lines converge to $K = L = 0$, and for large initial values they converge to $K = L = \infty$. These two regions are separated by two lines, which meet at $K_c^* = \frac{1}{3}$ and $L_c^* = \frac{1}{9}$. Further on it will become clear that this fixed point is connected to the critical point.

10 For clarity we have drawn in only every other iteration step in Fig. 7.8. We will return to this point at the end of this section, after investigating the analytic behavior of the recursion relation.

Fig. 7.8. A flow diagram of Eq. (7.3.15a,b) (only every other point is indicated). Three fixed points can be recognized: $K^* = L^* = 0$, $K^* = L^* = \infty$ and $K_c^* = \frac{1}{3}$, $L_c^* = \frac{1}{9}$

We now want to investigate analytically the more important properties of the flow diagram which follows from the recursion relations (7.3.15a,b). As a first step, the fixed points must be determined from (7.3.15a,b), i.e. $K^*$ and $L^*$, which obey $K^* = 2K^{*2} + L^*$ and $L^* = K^{*2}$. These conditions give three fixed points

$$\text{(i)}\ K^* = L^* = 0, \qquad \text{(ii)}\ K^* = L^* = \infty, \qquad \text{and (iii)}\ K_c^* = \frac{1}{3},\ L_c^* = \frac{1}{9}. \tag{7.3.16}$$
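The fixed point (iii) and the two regimes of the flow can be reproduced by iterating (7.3.15a,b) directly. This is a minimal sketch; the two starting couplings 0.30 and 0.50 (with $L = 0$) are arbitrary examples on either side of the separating line.

```python
def rg_step(K, L):
    """One decimation step in the truncated (K, L) space, Eqs. (7.3.15a,b)."""
    return 2.0 * K * K + L, K * K


def trajectory(K, L, steps):
    """Return the list of (K, L) points visited in the given number of steps."""
    points = [(K, L)]
    for _ in range(steps):
        K, L = rg_step(K, L)
        points.append((K, L))
    return points


fixed_point = (1.0 / 3.0, 1.0 / 9.0)
image = rg_step(*fixed_point)          # should reproduce the fixed point

weak = trajectory(0.30, 0.0, 8)        # expected to run towards K = L = 0
strong = trajectory(0.50, 0.0, 4)      # expected to run towards large couplings
print(image, weak[-1], strong[-1])
```

Plotting the points of several such trajectories in the $(K, L)$ plane gives a picture of the kind shown in Fig. 7.8.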

The high-temperature fixed point (i) corresponds to a temperature $T = \infty$ (disordered phase), while the low-temperature fixed point (ii) corresponds to $T = 0$ (ordered low-temperature phase). The critical behavior can be related only to the non-trivial fixed point (iii), $(K_c^*, L_c^*) = (\frac{1}{3}, \frac{1}{9})$.

That the initial values of $K$ and $L$ which lead to the fixed point $(K_c^*, L_c^*)$ represent critical points can be seen in the following manner: the RG transformation leads to a lattice with its lattice constant increased by a factor of $\sqrt{2}$. The correlation length of the transformed system $\xi'$ is thus smaller by a factor of $\sqrt{2}$:

$$\xi' = \xi/\sqrt{2}. \tag{7.3.17}$$

However, at the fixed point, the coupling constants $K_c^*$, $L_c^*$ are invariant, so that for $\xi$ of the fixed point, we have $\xi' = \xi$, i.e. at the fixed point, it follows that $\xi = \xi/\sqrt{2}$, thus

$$\xi = \begin{cases} \infty & \text{or} \\ 0. \end{cases} \tag{7.3.18}$$

The value 0 corresponds to the high-temperature and to the low-temperature fixed points. At finite $K^*$, $L^*$, $\xi$ cannot be zero, but only $\infty$. Calculating back through the transformation shows that the correlation length at each point along the critical trajectory which leads to the fixed point is infinite. Therefore, all the points of the “critical trajectory”, i.e. the trajectory leading to the fixed point, are critical points of Ising models with nearest-neighbor and next-nearest-neighbor interactions.

In order to determine the critical behavior, we examine the behavior of the coupling constants in the vicinity of the “non-trivial” fixed point; to this end, we linearize the transformation equations (7.3.15a,b) around $(K_c^*, L_c^*)$ in the $l$th step:

$$\delta K_l = K_l - K_c^*, \qquad \delta L_l = L_l - L_c^*. \tag{7.3.19}$$

We thereby obtain the following linear recursion relation:

$$\begin{pmatrix} \delta K_l \\ \delta L_l \end{pmatrix} = \begin{pmatrix} 4K_c^* & 1 \\ 2K_c^* & 0 \end{pmatrix} \begin{pmatrix} \delta K_{l-1} \\ \delta L_{l-1} \end{pmatrix} = \begin{pmatrix} \frac{4}{3} & 1 \\ \frac{2}{3} & 0 \end{pmatrix} \begin{pmatrix} \delta K_{l-1} \\ \delta L_{l-1} \end{pmatrix}. \tag{7.3.20}$$

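The eigenvalues of the transformation matrix can be determined from $\lambda^2 - \frac{4}{3}\lambda - \frac{2}{3} = 0$, i.e.

$$\lambda_{1,2} = \frac{1}{3}\left(2 \pm \sqrt{10}\right) = \begin{cases} 1.7208 \\ -0.3874. \end{cases} \tag{7.3.21a}$$

The associated eigenvectors can be obtained from $\left(4 - (2 \pm \sqrt{10})\right)\delta K + 3\,\delta L = 0$, i.e.

$$\delta L = \frac{\pm\sqrt{10} - 2}{3}\,\delta K \quad \text{and thus} \quad e_1 = \left(1, \frac{\sqrt{10}-2}{3}\right) \quad \text{and} \quad e_2 = \left(1, -\frac{\sqrt{10}+2}{3}\right) \tag{7.3.21b}$$

with the scalar product $e_1 \cdot e_2 = \frac{1}{3}$.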
We now start from an Ising model with coupling constants K0 and L0 (including the division by kT ). We ﬁrst expand the deviations of the initial coupling constants K0 and L0 from the ﬁxed point in the basis of the eigenvectors (7.3.21):

∗ K0 Kc = + c1 e1 + c2 e2 , (7.3.22) L0 L∗c with expansion coeﬃcients c1 and c2 . The decimation procedure is repeated several times; after l transformation steps, we obtain the coupling constants Kl and Ll :

∗ Kl Kc = + λl1 c1 e1 + λl2 c2 e2 . (7.3.23) Ll L∗c If the Hamiltonian H diﬀers from H ∗ only by an increment in the direction e2 , the successive application of the renormalization group transformation leads to the ﬁxed point, since |λ2 | < 1 (see Fig. 7.9).

354

7. Phase Transitions, Renormalization Group Theory, and Percolation

Fig. 7.9. Flow diagram based on the recursion relation (7.3.22), which is linearized around the nontrivial ﬁxed point (FP)

Let us now consider the original nearest-neighbor Ising model with the coupling constant $K_0 \equiv \frac{J}{kT}$ and with $L_0 = 0$, and first determine the critical value $K_c$; this is the value of $K_0$ which leads to the fixed point. The condition for $K_c$, from the above considerations, is given by

$$\begin{pmatrix} K_c \\ 0 \end{pmatrix} = \begin{pmatrix} \frac{1}{3} \\ \frac{1}{9} \end{pmatrix} + 0 \cdot e_1 + c_2 \begin{pmatrix} 1 \\ -\frac{\sqrt{10}+2}{3} \end{pmatrix}. \tag{7.3.24}$$

These two linear equations have the solution

$$c_2 = \frac{1}{3(\sqrt{10}+2)}, \qquad \text{and therefore} \qquad K_c = \frac{1}{3} + \frac{1}{3(\sqrt{10}+2)} = 0.3979. \tag{7.3.25}$$
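The linearized estimate (7.3.25) can be compared with the critical coupling of the full nonlinear recursion (7.3.15a,b), which can be located by bisection on the starting value $K_0$ (with $L_0 = 0$). The sketch below is one possible way of doing this; the thresholds used to decide where a trajectory ends up, the bracketing interval and the iteration counts are assumptions chosen for illustration.

```python
import math


def critical_coupling_linear():
    """K_c from (7.3.24): the second component fixes c2, the first gives K_c."""
    c2 = 1.0 / (3.0 * (math.sqrt(10.0) + 2.0))
    return 1.0 / 3.0 + c2


def flows_to_strong_coupling(K, L=0.0, max_steps=500):
    """True if the nonlinear flow (7.3.15a,b) runs off to large couplings."""
    for _ in range(max_steps):
        if K > 10.0:                  # assumed threshold: flow has escaped
            return True
        if K < 1e-3 and L < 1e-3:     # assumed threshold: flow has died out
            return False
        K, L = 2.0 * K * K + L, K * K
    return K > 1.0 / 3.0              # undecided after max_steps: rough guess


def critical_coupling_nonlinear(lo=0.3, hi=0.5, iterations=40):
    """Bisect on K0 between a decaying (lo) and a growing (hi) trajectory."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if flows_to_strong_coupling(mid):
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


Kc_lin = critical_coupling_linear()
Kc_nl = critical_coupling_nonlinear()
print(Kc_lin, Kc_nl)
```

The bisection relies on the fact that, for positive couplings, a larger starting value leads to larger couplings at every later step, so that a single threshold separates the two regimes.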

For $K_0 = K_c$, the linearized RG transformation leads to the fixed point, i.e. this is the critical point of the nearest-neighbor Ising model, $K_c = \frac{J}{kT_c}$. From the nonlinear recursion relation (7.3.15a,b), we find for the critical point the slightly smaller value $K_c^{\rm n.l.} = 0.3921$. Both values differ from Onsager’s exact solution, which gives $K_c = 0.4406$, but they are much closer than the value from molecular field theory, $K_c = 0.25$.

For $K_0 = K_c$, only $c_2 \neq 0$, and the transformation leads to the fixed point. For $K_0 \neq K_c$, we also have $c_1 \propto (K_0 - K_c) = -\frac{J}{kT_c^2}(T - T_c) \cdots \neq 0$. This increases with each application of the RG transformation, and thus leads away from the fixed point $(K_c^*, L_c^*)$ (Fig. 7.9), so that the flow runs either to the low-temperature fixed point (for $T < T_c$) or to the high-temperature fixed point (for $T > T_c$). Now we may determine the critical exponent $\nu$ for the correlation length, beginning with the recursion relation

$$(K - K_c)' = \lambda_1 (K - K_c) \tag{7.3.26}$$

and writing $\lambda_1$ as a power of the new length scale

$$\lambda_1 = (\sqrt{2})^{y_1}. \tag{7.3.27}$$

For the exponent $y_1$ defined here, we find the value

$$y_1 = 2\,\frac{\log\lambda_1}{\log 2} = 1.566. \tag{7.3.28}$$

From $\xi' = \xi/\sqrt{2}$ (Eq. (7.3.17)), it follows that $(K - K_c)'^{\,-\nu} = (K - K_c)^{-\nu}/\sqrt{2}$, i.e.

$$(K - K_c)' = (\sqrt{2})^{1/\nu}(K - K_c). \tag{7.3.29}$$

Comparing this with the first relation (7.3.26), we obtain

$$\nu = \frac{1}{y_1} = 0.638. \tag{7.3.30}$$
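The numerical values in (7.3.28) and (7.3.30) follow directly from the relevant eigenvalue and the rescaling factor. A short sketch, with illustrative variable names:

```python
import math

lam1 = (2.0 + math.sqrt(10.0)) / 3.0   # relevant eigenvalue, Eq. (7.3.21a)
b = math.sqrt(2.0)                     # length rescaling per decimation step

y1 = math.log(lam1) / math.log(b)      # from lam1 = b**y1, Eq. (7.3.27)
nu = 1.0 / y1                          # Eq. (7.3.30)

print(y1, nu)
```

The same two lines apply to any real-space transformation once its relevant eigenvalue and its rescaling factor $b$ are known.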

This is, to be sure, quite a ways from 1, the known exact value of the two-dimensional Ising model, but nevertheless it is larger than 0.5, the value from the molecular-field approximation. A considerable improvement can be obtained by extending the recursion relation to several coupling coefficients.

Let us now consider the effect of a finite magnetic field $h$ (including the factor $\beta$). The recursion relation can again be established intuitively. The field $h$ acts directly on the remaining spins, as well as a (somewhat underestimated) additional field $Kh$ which is due to the orienting action of the field on the eliminated neighboring spins, so that all together we have

$$h' = h + Kh. \tag{7.3.31}$$

The fixed point value of this recursion relation is $h^* = 0$. Linearization around the fixed point yields

$$h' = (1 + K^*)h = \frac{4}{3}h; \tag{7.3.32}$$

thus the associated eigenvalue is

$$\lambda_h = \frac{4}{3}. \tag{7.3.33}$$

$K_0 - K_c$ (or $T - T_c$) and $h$ are called the relevant “fields”, since the eigenvalues $\lambda_1$ and $\lambda_h$ are larger than 1, and they therefore increase as a result of the renormalization group transformation and lead away from the fixed point. In contrast, $c_2$ is an “irrelevant field”, since $|\lambda_2| < 1$, and therefore $c_2$ becomes increasingly smaller with repeated RG transformations. Here, “fields” refers to fields in the usual sense, but also to coupling constants in the Hamiltonian. The structure found here is typical of models which describe critical points, and remains the same even when one takes arbitrarily many coupling constants into account in the transformation: there are two relevant fields ($T - T_c$ and $h$, the conjugate field to the order parameter), and all the other fields are irrelevant.

We add a remark concerning the flow diagram 7.9. There, owing to the negative sign of $\lambda_2$, only every other point is shown. This corresponds to a twofold application of the transformation and an increase of the lattice constant by a factor of 2, as well as $\lambda_1 \to \lambda_1^2$, $\lambda_2 \to \lambda_2^2$. Then the second eigenvalue $\lambda_2^2$ is also positive, since otherwise the trajectory would move along an oscillatory path towards the fixed point.

7.3.4 Scaling Laws

Although the decimation procedure described in Sect. 7.3.3 with only a few parameters does not give quantitatively satisfactory results and is also unsuitable for the calculation of correlation functions, it does demonstrate the general structure of RG transformations, which we shall now use as a starting point for deriving the scaling laws. A general RG transformation $\mathcal{R}$ maps the original Hamiltonian $\mathcal{H}$ onto a new one,

$$\mathcal{H}' = \mathcal{R}\mathcal{H}. \tag{7.3.34}$$

This transformation also implies the rescaling of all the lengths in the problem, and that $N' = N b^{-d}$ holds for the number of degrees of freedom $N$ in $d$ dimensions (here, $b = \sqrt{2}$ for the decimation transformation of Sect. 7.3.3). The fixed-point Hamiltonian is determined by

$$\mathcal{R}(\mathcal{H}^*) = \mathcal{H}^*. \tag{7.3.35}$$

For small deviations from the fixed-point Hamiltonian,

$$\mathcal{R}(\mathcal{H}^* + \delta\mathcal{H}) = \mathcal{H}^* + L\,\delta\mathcal{H},$$

we can expand in terms of the deviation $\delta\mathcal{H}$. From the expansion, we obtain the linearized recursion relation

$$L\,\delta\mathcal{H} = \delta\mathcal{H}'. \tag{7.3.36a}$$

The eigenoperators $\delta\mathcal{H}_1, \delta\mathcal{H}_2, \ldots$ of this linear transformation are determined by the eigenvalue equation

$$L\,\delta\mathcal{H}_i = \lambda_i\,\delta\mathcal{H}_i. \tag{7.3.36b}$$

A given Hamiltonian $\mathcal{H}$, which differs only slightly from $\mathcal{H}^*$, can be represented by $\mathcal{H}^*$ and the deviations from it:

$$\mathcal{H} = \mathcal{H}^* + \tau\,\delta\mathcal{H}_\tau + h\,\delta\mathcal{H}_h + \sum_{i\ge 3} c_i\,\delta\mathcal{H}_i, \tag{7.3.37}$$

where $\delta\mathcal{H}_\tau$ and $\delta\mathcal{H}_h$ denote the two relevant perturbations with

$$|\lambda_\tau| = b^{y_\tau} > 1, \qquad |\lambda_h| = b^{y_h} > 1; \tag{7.3.38}$$

they are related to the temperature variable $\tau = \frac{T - T_c}{T_c}$ and the external field $h$, while $|\lambda_j| = b^{y_j} < 1$ and thus $y_j < 0$ for $j \ge 3$ are connected with the irrelevant perturbations.$^{11}$ The coefficients $\tau$, $h$, and $c_j$ are called scaling fields. For the Ising model, $\delta\mathcal{H}_h = \sum_l \sigma_l$. Denoting the initial values of the fields by $c_i$, we find that the free energy transforms after $l$ steps to

$$F_N(c_i) = F_{N/b^{dl}}(c_i\lambda_i^l). \tag{7.3.39a}$$

For the free energy per spin,

$$f(c_i) = \frac{1}{N}F_N(c_i), \tag{7.3.39b}$$

we then find in the linear approximation

$$f(\tau, h, c_3, \ldots) = b^{-dl} f\left(\tau b^{y_\tau l}, h b^{y_h l}, c_3 b^{y_3 l}, \ldots\right). \tag{7.3.40}$$

11 Compare the discussion following Eq. (7.3.33). The (only) irrelevant field there is denoted by $c_2$. In the following, we assume that $\lambda_i \ge 0$.

Here, we have left off an additive term which has no influence on the following derivation of the scaling law; it is, however, important for the calculation of the free energy. The scaling parameter $l$ can now be chosen in such a way that $|\tau| b^{y_\tau l} = 1$, which makes the first argument of $f$ equal to $\pm 1$. Then we find

$$f(\tau, h, c_3, \ldots) = |\tau|^{d/y_\tau}\, \hat f_\pm\left(h|\tau|^{-y_h/y_\tau}, c_3|\tau|^{|y_3|/y_\tau}, \ldots\right), \tag{7.3.40$'$}$$

where $\hat f_\pm(x, y, \ldots) = f(\pm 1, x, y, \ldots)$ and $y_\tau, y_h > 0$, $y_3, \ldots < 0$. Close to $T_c$, the dependence on the irrelevant fields $c_3, \ldots$ can be neglected, and Eq. (7.3.40′) then takes on precisely the scaling form (Eq. 7.2.7), with the conventional exponents

$$\beta\delta = y_h/y_\tau \tag{7.3.41a}$$

and

$$2 - \alpha = \frac{d}{y_\tau}. \tag{7.3.41b}$$

Taking the derivative with respect to $h$ yields

$$\beta = \frac{d - y_h}{y_\tau} \qquad \text{and} \qquad \gamma = \frac{2y_h - d}{y_\tau}. \tag{7.3.41c,d}$$
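The relations (7.3.41a–d) translate two RG eigenvalue exponents into the conventional thermodynamic exponents. The sketch below takes $\gamma = (2y_h - d)/y_\tau$, so that $\gamma = \beta(\delta - 1)$ from (7.2.6) holds, and evaluates the formulas for $d = 2$ with $y_\tau = 1$ and $y_h = 15/8$. These two values are an assumption brought in from the exact solution of the two-dimensional Ising model, not a result of the truncated decimation scheme of Sect. 7.3.3; the function name is illustrative.

```python
from fractions import Fraction as F


def exponents_from_rg(d, y_tau, y_h):
    """Thermodynamic exponents from the relevant RG exponents y_tau and y_h."""
    beta_delta = y_h / y_tau          # (7.3.41a)
    alpha = 2 - d / y_tau             # (7.3.41b)
    beta = (d - y_h) / y_tau          # (7.3.41c)
    gamma = (2 * y_h - d) / y_tau     # (7.3.41d), equals beta * (delta - 1)
    delta = beta_delta / beta
    return {"alpha": alpha, "beta": beta, "gamma": gamma, "delta": delta}


# Assumed inputs: exact 2D Ising values y_tau = 1, y_h = 15/8.
ising_2d = exponents_from_rg(d=2, y_tau=F(1), y_h=F(15, 8))
print(ising_2d)
```

Using rational numbers keeps the check of the scaling relation $\gamma + 2\beta = 2 - \alpha$ exact rather than subject to rounding.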

We have thus derived the scaling law, Eq. (7.2.7), within the RG theory for fixed points with just one relevant field, along with the applied magnetic field and the irrelevant operators. Furthermore, the dependence on the irrelevant fields $c_3, \ldots$ gives rise to corrections to the scaling laws, which must be taken into account for temperatures outside the asymptotic region.

In order to make the connection between $y_\tau$ and the exponent $\nu$, we recall that $l$ iterations reduce the correlation length to $\xi' = b^{-l}\xi$, which implies that $(\tau b^{y_\tau l})^{-\nu} = b^{-l}\tau^{-\nu}$ and, as a result,

$$\nu = \frac{1}{y_\tau} \tag{7.3.41e}$$

(cf. Eq. (7.3.30) for the two-dimensional Ising model). From the existence of a fixed-point Hamiltonian with two relevant operators, the scaling form of the free energy can be derived, and it is also possible to calculate the critical exponents. Even the form of the scaling functions $\hat f$ and $\hat m$ can be computed with perturbation-theoretical methods, since the arguments are finite. A similar procedure can be applied to the correlation function, Eq. (7.2.12b′). At this point it is important to renormalize the spin variable, $\sigma' = b^{\zeta}\sigma$, whereby it is found that setting the value

$$\zeta = (d - 2 + \eta)/2 \tag{7.3.41f}$$

guarantees the validity of (7.2.13) at the critical point.

Fig. 7.10. The critical hypersurface. A trajectory within the critical hypersurface is shown as a dashed curve. The full curve is a trajectory near the critical hypersurface. The coupling coefficients of a particular physical system as a function of the temperature are indicated by the long-dashed curve

We add a few remarks about the generic structure of the flow diagram in the vicinity of a critical fixed point (Fig. 7.10). In the multidimensional space of the coupling coefficients, there is a direction (the relevant direction) which leads away from the fixed point (we assume that $h = 0$). The other eigenvectors of the linearized RG transformation span the critical hypersurface. Further away from the fixed point, this hypersurface is no longer a plane, but instead is curved. The trajectories from each point on the critical hypersurface lead to the critical fixed point. When the initial point is close to but not precisely on the critical hypersurface, the trajectory at first runs parallel to the hypersurface until the relevant portion has become sufficiently large so that finally the trajectory leaves the neighborhood of the critical hypersurface and heads off to either the high-temperature or the low-temperature fixed point. For a given physical system (ferromagnet, liquid, . . .), the parameters $\tau, c_3, \ldots$ depend on the temperature (the long-dashed curve in Fig. 7.10). The temperature at which this curve intersects the critical hypersurface is the transition temperature $T_c$. From this discussion, the universality properties should be apparent. All systems which belong to a particular part of the parameter space, i.e. to the region of attraction of a given fixed point, are described by the same power laws in the vicinity of the critical hypersurface of the fixed point.


7.3.5 General RG Transformations in Real Space

A general RG transformation in real space maps a particular spin system $\{\sigma\}$ with the Hamiltonian $H\{\sigma\}$, defined on a lattice, onto a new spin system with fewer degrees of freedom (by $N'/N = b^{-d}$) and a new Hamiltonian $H'\{\sigma'\}$. It can be represented by a transformation $T\{\sigma',\sigma\}$, such that

$$e^{-G-H'\{\sigma'\}} = \sum_{\{\sigma\}} T\{\sigma',\sigma\}\,e^{-H\{\sigma\}} \tag{7.3.42}$$

with the conditions

$$\sum_{\{\sigma'\}} H'\{\sigma'\} = 0 \tag{7.3.43a}$$

and

$$\sum_{\{\sigma'\}} T\{\sigma',\sigma\} = 1\,, \tag{7.3.43b}$$

which guarantee that

$$e^{-G}\,{\rm Tr}_{\{\sigma'\}}\,e^{-H'\{\sigma'\}} = {\rm Tr}_{\{\sigma\}}\,e^{-H\{\sigma\}} \tag{7.3.44a}$$

is fulfilled (${\rm Tr}_{\{\sigma\}} \equiv \sum_{\{\sigma\}}$). This yields a relation between the free energy $F$ of the original lattice and the free energy $F'$ of the primed lattice:

$$F' + G = F\,. \tag{7.3.44b}$$

The constant $G$ is independent of the configuration of the $\{\sigma'\}$ and is determined by equation (7.3.43a). Important examples of such transformations are decimation transformations, as well as linear and nonlinear block-spin transformations. The simplest realization consists of

$$T\{\sigma',\sigma\} = \prod_{i'\in\Omega'} \frac12\bigl(1 + \sigma'_{i'}\,t_{i'}(\sigma)\bigr)\,, \tag{7.3.45}$$

where $\Omega$ denotes the lattice sites of the initial lattice and $\Omega'$ those of the new lattice, and the function $t_{i'}(\sigma)$ determines the nature of the transformation.

α) Decimation Transformation (Fig. 7.11)

$$t_{i'}\{\sigma\} = \zeta\,\sigma_{i'}\,,\qquad b=\sqrt2\,,\qquad \zeta = b^{(d-2+\eta)/2}\,,$$

where $\zeta$ rescales the amplitude of the remaining spins. Then,

$$\langle\sigma'_{x'}\sigma'_{0'}\rangle = \zeta^2\,\langle\sigma_x\sigma_0\rangle\,. \tag{7.3.46a}$$
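As a minimal check of the trace identity (7.3.44a), the following sketch decimates every second spin of a one-dimensional Ising ring. This is an assumption made for simplicity: it is not the square-lattice decimation of Fig. 7.11, the scale factor is $b = 2$, and no spin rescaling $\zeta$ is applied. The names `ring_partition_function` and `decimate` are illustrative.

```python
import itertools
import math

def ring_partition_function(n_sites, coupling):
    """Tr exp(-H) for H = -K * sum_i s_i s_{i+1} on a periodic chain (brute force)."""
    total = 0.0
    for spins in itertools.product((-1, 1), repeat=n_sites):
        bonds = sum(spins[i] * spins[(i + 1) % n_sites] for i in range(n_sites))
        total += math.exp(coupling * bonds)
    return total

def decimate(coupling):
    """Sum out one spin s between neighbours s1, s3:
    sum_s exp(K*s*(s1+s3)) = exp(g + K'*s1*s3).  Returns (K', g)."""
    k_new = 0.5 * math.log(math.cosh(2.0 * coupling))
    g = math.log(2.0) + 0.5 * math.log(math.cosh(2.0 * coupling))
    return k_new, g

K, N = 0.7, 8
K_new, g = decimate(K)
G = -(N // 2) * g                      # constant G: one factor exp(g) per removed spin
lhs = math.exp(-G) * ring_partition_function(N // 2, K_new)
rhs = ring_partition_function(N, K)
```

The renormalized Hamiltonian contains only the pair term $-K'\sum \sigma'\sigma'$, whose sum over all configurations vanishes, so condition (7.3.43a) is satisfied and the whole configuration-independent part sits in $G$.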


Fig. 7.11. A decimation transformation

β) Linear Block-Spin Transformation (on a triangular lattice, Fig. 7.12)

$$t_{i'}\{\sigma\} = p\,(\sigma_{i'}^1 + \sigma_{i'}^2 + \sigma_{i'}^3)\,,\qquad b=\sqrt3\,,\qquad p = \frac13\,(\sqrt3)^{\eta/2} = 3^{-1+\eta/4}\,. \tag{7.3.46b}$$

Fig. 7.12. A block-spin transformation

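γ) Nonlinear Block-Spin Transformation

$$t_{i'}\{\sigma\} = p\,(\sigma_{i'}^1 + \sigma_{i'}^2 + \sigma_{i'}^3) + q\,\sigma_{i'}^1\sigma_{i'}^2\sigma_{i'}^3\,. \tag{7.3.46c}$$

An important special case is

$$p = -q = \frac12\,,\qquad \sigma'_{i'} = {\rm sgn}(\sigma_{i'}^1 + \sigma_{i'}^2 + \sigma_{i'}^3)\,.$$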
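That the special case $p = -q = 1/2$ reproduces the majority rule can be verified by enumerating the eight configurations of a three-spin block. The helper names below are illustrative only.

```python
import itertools

def block_spin(s1, s2, s3, p=0.5, q=-0.5):
    """Nonlinear block-spin function t = p*(s1+s2+s3) + q*s1*s2*s3, Eq. (7.3.46c)."""
    return p * (s1 + s2 + s3) + q * s1 * s2 * s3

def sign(x):
    """Sign of x as -1, 0 or +1."""
    return (x > 0) - (x < 0)

# Tabulate t for all 2^3 block configurations.
table = {s: block_spin(*s) for s in itertools.product((-1, 1), repeat=3)}
majority_rule_holds = all(t == sign(sum(s)) for s, t in table.items())
```

Since the sum of three Ising spins is never zero, the sign is always well defined and the block spin is again an Ising variable.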

These so-called real-space renormalization procedures were introduced by Niemeijer and van Leeuwen$^{12}$. The simplified variant given in Sect. 7.3.3 is from$^{13}$. The block-spin transformation for a square Ising lattice is described in$^{14}$. For a detailed discussion with additional references, we refer to the article by Niemeijer and van Leeuwen$^{15}$.

12 Th. Niemeijer and J. M. J. van Leeuwen, Phys. Rev. Lett. 31, 1411 (1973).
13 K. G. Wilson, Rev. Mod. Phys. 47, 773 (1975).
14 M. Nauenberg and B. Nienhuis, Phys. Rev. Lett. 33, 344 (1974).
15 Th. Niemeijer and J. M. J. van Leeuwen, in Phase Transitions and Critical Phenomena Vol. 6, Eds. C. Domb and M. S. Green, p. 425, Academic Press, London 1976.

∗7.4 The Ginzburg–Landau Theory

7.4.1 Ginzburg–Landau Functionals

The Ginzburg–Landau theory is a continuum description of phase transitions. Experience and the preceding theoretical considerations in this chapter show that the microscopic details such as the lattice structure, the precise form of the interactions, etc. are unimportant for the critical behavior, which manifests itself at distances which are much greater than the lattice constant. Since we are interested only in the behavior at small wavenumbers, we can go to a macroscopic continuum description, roughly analogous to the transition from microscopic electrodynamics to continuum electrodynamics. In setting up the Ginzburg–Landau functional, we will make use of an intuitive justification; a microscopic derivation is given in Appendix E (see also problem 7.15).

We start with a ferromagnetic system consisting of Ising spins ($n = 1$) on a $d$-dimensional lattice. The generalization to arbitrary dimensions is interesting for several reasons. First, it contains the physically relevant dimensions, three and two. Second, it may be seen that certain approximation methods are exact above four dimensions; this gives us the possibility of carrying out perturbation expansions around the dimension four (Sect. 7.4.5). Instead of the spins $S_l$ on the lattice, we introduce a continuum magnetization

$$m(x) = \frac{1}{\tilde N a_0^d}\sum_l g(x-x_l)\,S_l\,. \tag{7.4.1}$$

Here, $g(x - x_l)$ is a weighting function, which is equal to one within a cell with $\tilde N$ spins and is zero outside it. The linear dimension of this cell, $a_c$, is supposed to be much larger than the lattice constant $a_0$ but much smaller than the length $L$ of the crystal, i.e. $a_0 \ll a_c \ll L$. The function $g(x - x_l)$ is assumed to vary continuously from the value 1 to 0, so that $m(x)$ is a continuous function of $x$; see Fig. 7.13.

Fig. 7.13. The weighting function g(y) along one of the d cartesian coordinates
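A minimal sketch of the coarse-graining (7.4.1) in $d = 1$ follows. It assumes a sharp box weight $g$ (one inside a cell, zero outside) and non-overlapping cells, which is a simplification of the smooth $g$ of Fig. 7.13; the spin configuration is random and serves only as test input.

```python
import random

def coarse_grain(spins, cell_size, a0=1.0):
    """Cell-averaged magnetization density, Eq. (7.4.1), for d = 1 and a box weight."""
    n_cells = len(spins) // cell_size
    return [
        sum(spins[c * cell_size:(c + 1) * cell_size]) / (cell_size * a0)
        for c in range(n_cells)
    ]

rng = random.Random(0)
spins = [rng.choice((-1, 1)) for _ in range(1200)]   # arbitrary Ising configuration
N_tilde, a0, h = 40, 1.0, 0.3                        # spins per cell, lattice constant, field
m = coarse_grain(spins, N_tilde, a0)

# Zeeman term on the lattice and in the continuum description:
zeeman_lattice = sum(h * s for s in spins)
zeeman_continuum = sum(h * m_cell * N_tilde * a0 for m_cell in m)   # integral of h*m(x)
```

With this piecewise-constant $m(x)$ the integral over each cell is just the cell volume $\tilde N a_0$ times the cell value, so the two Zeeman expressions agree exactly.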

Making use of

$$\int d^dx\,g(x-x_l) = \tilde N a_0^d$$

and of the definition (7.4.1), we can rewrite the Zeeman term as follows:

$$\sum_l hS_l = h\sum_l \frac{1}{\tilde N a_0^d}\int d^dx\,g(x-x_l)\,S_l = \int d^dx\,h\,m(x)\,. \tag{7.4.2}$$

From the canonical density matrix for the spins, we obtain the probability density for the configurations $m(x)$. Generally, we have

$$P[m(x)] = \Bigl\langle \delta\Bigl(m(x) - \frac{1}{\tilde N a_0^d}\sum_l g(x-x_l)\,S_l\Bigr)\Bigr\rangle\,. \tag{7.4.3}$$

For $P[m(x)]$, we write

$$P[m(x)] \propto e^{-F[m(x)]/kT}\,, \tag{7.4.4}$$

in which the Ginzburg–Landau functional $F[m(x)]$ enters; it is a kind of Hamiltonian for the magnetization $m(x)$. The tendency towards ferromagnetic ordering due to the exchange interaction must express itself in the form of the functional $F[m(x)]$:

$$F[m(x)] = \int d^dx\,\Bigl\{a\,m^2(x) + \frac b2\,m^4(x) + c\,\bigl(\nabla m(x)\bigr)^2 - h\,m(x)\Bigr\}\,. \tag{7.4.5}$$

In the vicinity of $T_c$, only configurations of $m(x)$ with small absolute values should be important, and therefore the Taylor expansion (7.4.5) should be allowed. Before we turn to the coefficients in (7.4.5), we make a few remarks about the significance of this functional. Due to the averaging (7.4.1), short-wavelength variations of $S_l$ do not contribute to $m(x)$. The long-wavelength variations, however, with wavelengths larger than the cell size $a_c$, are reflected fully in $m(x)$. The partition function of the magnetic system therefore has the form

$$Z = Z_0(T)\int D[m(x)]\,e^{-F[m(x)]/kT}\,. \tag{7.4.6}$$

Here, the functional integral $\int D[m(x)]\ldots$ refers to a sum over all the possible configurations of $m(x)$ with the probability density $e^{-F[m(x)]/kT}$. One can represent $m(x)$ by means of a Fourier series, obtaining the sum over all configurations by integration over all the Fourier components. The factor $Z_0(T)$ is due to the (short-wavelength) configurations of the spin system, which do not contribute to $m(x)$. The evaluation of the functional integral which occurs in the partition function (7.4.6) is of course a highly nontrivial problem and will be carried out in the following Sections 7.4.2 and 7.4.5 using approximation methods. The free energy is

$$F = -kT\log Z\,. \tag{7.4.7}$$
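To see how the individual terms of the functional (7.4.5) act, one can evaluate it on a grid. The sketch below assumes $d = 1$, periodic boundary conditions, and a forward-difference gradient; the parameter values and the cosine modulation are hypothetical and chosen only for illustration.

```python
import math

def gl_functional(m, dx, a, b, c, h=0.0):
    """Discretized Ginzburg-Landau functional, Eq. (7.4.5), on a periodic 1D grid."""
    n = len(m)
    total = 0.0
    for i in range(n):
        grad = (m[(i + 1) % n] - m[i]) / dx          # forward difference for dm/dx
        total += (a * m[i]**2 + 0.5 * b * m[i]**4 + c * grad**2 - h * m[i]) * dx
    return total

n, L = 400, 20.0
dx = L / n
a, b, c = -1.0, 1.0, 0.5                              # a < 0: ordered side
m0 = math.sqrt(-a / b)                                # uniform minimum of a*m^2 + b/2*m^4
uniform = [m0] * n
wavy = [m0 * (1 + 0.1 * math.cos(2 * math.pi * 3 * i / n)) for i in range(n)]

F_uniform = gl_functional(uniform, dx, a, b, c)
F_wavy = gl_functional(wavy, dx, a, b, c)
```

The modulated configuration costs more than the uniform one both through the local terms and through the gradient term with $c > 0$, so its weight $e^{-F/kT}$ is smaller.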


We now come to the coefficients in the expansion (7.4.5). First of all, this expansion took into account the fact that $F[m(x)]$ has the same symmetry as the microscopic spin Hamiltonian, i.e. aside from the Zeeman term, $F[m(x)]$ is an even function of $m(x)$. Owing to (7.4.2), the field $h$ expresses itself only in the Zeeman term, $-\int d^dx\,h\,m(x)$, and the coefficients $a$, $b$, $c$ are independent of $h$. For reasons of stability, large values of $m(x)$ must have a small statistical weight, which requires that $b > 0$. If for some system $b \le 0$, the expansion must be extended to higher orders in $m(x)$. These circumstances occur in first-order phase transitions and at tricritical points. The ferromagnetic exchange interaction has a tendency to orient the spins uniformly. This leads to the term $c\,\nabla m\,\nabla m$ with $c > 0$, which suppresses inhomogeneities in the magnetization. Finally, we come to the values of $a$. For $h = 0$ and a uniform $m(x) = m$, the probability weight $e^{-\beta F}$ is shown in Fig. 7.14.

Fig. 7.14. The probability density $e^{-\beta F}$ as a function of a uniform magnetization. (a) For $a > 0$ ($T > T_c^0$) and (b) for $a < 0$ ($T < T_c^0$)

When $a > 0$, then the most probable configuration is $m = 0$; when $a < 0$, then the most probable configuration is $m \neq 0$. Thus, $a$ must change its sign,

$$a = a'\,(T - T_c^0)\,, \tag{7.4.8}$$

in order for the phase transition to occur. Due to the nonlinear terms and to fluctuations, the real $T_c$ will differ from $T_c^0$. The coefficients $b$ and $c$ are finite at $T_c^0$. If one starts from a Heisenberg model instead of from an Ising model, the replacements

$$S_l \to \mathbf S_l \quad\text{and}\quad m(x)\to\mathbf m(x)\,,\qquad m^4(x)\to\bigl(\mathbf m(x)^2\bigr)^2\,,\qquad (\nabla m)^2 \to \nabla_\alpha\mathbf m\,\nabla_\alpha\mathbf m \tag{7.4.9}$$

must be made, leading to Eq. (7.4.10). Ginzburg–Landau functionals can be introduced for every type of phase transition. It is also not necessary to


attempt a microscopic derivation: the form is determined in most cases from knowledge of the symmetry of the order parameter. Thus, the Ginzburg–Landau theory was first applied to the case of superconductivity long before the advent of the microscopic BCS theory. The Ginzburg–Landau theory was also particularly successful in treating superconductivity, because here simple approximations (see Sect. 7.4.2) are valid even close to the transition (see also Sect. 7.4.4).

7.4.2 The Ginzburg–Landau Approximation

We start with the Ginzburg–Landau functional for an order parameter with $n$ components, $\mathbf m(x)$, $n = 1, 2, \ldots$:

$$F[\mathbf m(x)] = \int d^dx\,\Bigl\{a\,\mathbf m^2(x) + \frac12\,b\,\bigl(\mathbf m(x)^2\bigr)^2 + c\,(\nabla\mathbf m)^2 - \mathbf h(x)\,\mathbf m(x)\Bigr\}\,. \tag{7.4.10}$$

The integration extends over a volume $L^d$. The most probable configuration of $\mathbf m(x)$ is given by the stationary state, which is determined by

$$\frac{\delta F}{\delta\mathbf m(x)} = 2\bigl(a + b\,\mathbf m(x)^2 - c\nabla^2\bigr)\mathbf m(x) - \mathbf h(x) = 0\,. \tag{7.4.11}$$

Let $\mathbf h$ be independent of position and let us take $\mathbf h$ to lie in the $x_1$-direction without loss of generality, $\mathbf h = h\,\mathbf e_1$ ($h \gtrless 0$); then the uniform solution is found from

$$2\left(a + b\,\mathbf m^2\right)\mathbf m - h\,\mathbf e_1 = 0\,. \tag{7.4.12}$$

We discuss special cases:

(i) $h \to 0$: spontaneous magnetization and specific heat

When there is no applied field, (7.4.12) has the following solutions:

$$\begin{aligned}
&\mathbf m = 0 &&\text{for } a > 0\,,\\
&(\mathbf m = 0)\ \text{and}\ \mathbf m = \pm\mathbf e_1 m_0\,,\quad m_0 = \sqrt{\frac{-a}{b}} &&\text{for } a < 0\,.
\end{aligned} \tag{7.4.13}$$

The (Gibbs) free energy for the configurations (7.4.13) is$^{16}$

$$F(T, h=0) = F[0] = 0 \qquad\text{for } T > T_c^0 \tag{7.4.14a}$$

$$F(T, h=0) = F[m_0] = -\frac12\,\frac{a^2}{b}\,L^d \qquad\text{for } T < T_c^0\,. \tag{7.4.14b}$$

16 Instead of really computing the functional integral $\int D[m(x)]\,e^{-F[m(x)]/kT}$ as is required by (7.4.6) and (7.4.7) for the determination of the free energy, $m(x)$ was replaced everywhere by its most probable value.


We will always leave off the regular term $F_{\rm reg} = -kT\log Z_0$. The state $m = 0$ would have a higher free energy for $T < T_c^0$ than the state $m_0$; therefore, $m = 0$ was already put in parentheses in (7.4.13). For $T < T_c^0$, we thus find a finite spontaneous magnetization. The onset of this magnetization is characterized by the critical exponent $\beta$, which here takes on the value $\beta = \frac12$ (Fig. 7.15).

Fig. 7.15. The spontaneous magnetization in the Ginzburg–Landau approximation
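The value $\beta = 1/2$ can be recovered numerically by locating the minimum of the uniform free energy density by brute force. The sketch below is only an illustration: the grid search, the function names, and the parameter values are assumptions, and $h = 0$ throughout.

```python
import math

def landau_density(m, a, b, h=0.0):
    """Free energy density of a uniform state, the integrand of Eq. (7.4.5)."""
    return a * m**2 + 0.5 * b * m**4 - h * m

def most_probable_m(a, b, m_max=5.0, steps=100001):
    """Grid search for the minimum of the density over 0 <= m <= m_max (h = 0)."""
    grid = (m_max * i / (steps - 1) for i in range(steps))
    return min(grid, key=lambda m: landau_density(m, a, b))

def beta_estimate(a_prime, b, t1=-1e-2, t2=-4e-2):
    """Log-slope of m0 versus (Tc0 - T), using a = a'*(T - Tc0) at two temperatures."""
    m1 = most_probable_m(a_prime * t1, b)
    m2 = most_probable_m(a_prime * t2, b)
    return math.log(m2 / m1) / math.log(t2 / t1)
```

For example, `beta_estimate(1.0, 1.0)` should come out close to one half, and `landau_density(most_probable_m(a, b), a, b)` close to $-a^2/2b$ for $a < 0$, in line with (7.4.14b) per unit volume.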

Specific Heat

From (7.4.14a,b), we immediately find the specific heat

$$c_{h=0} = T\left(\frac{\partial S}{\partial T}\right)_{h=0} = -T\left(\frac{\partial^2 F}{\partial T^2}\right)_{h=0} = \begin{cases} 0 & T > T_c^0\\[1ex] T\,\dfrac{a'^2}{b}\,L^d & T < T_c^0 \end{cases} \tag{7.4.15}$$

with $a'$ from (7.4.8). The specific heat exhibits a jump

$$\Delta c_{h=0} = T_c^0\,\frac{a'^2}{b}\,, \tag{7.4.16}$$

and the critical exponent $\alpha$ is therefore zero (see Eq. (7.1.1)), $\alpha = 0$.

(ii) The equation of state for $h > 0$ and the susceptibility

We decompose $\mathbf m$ into a longitudinal part, $\mathbf e_1 m_1$, and a transverse part, $\mathbf m_\perp = (0, m_2, \ldots, m_n)$. Evidently, Eq. (7.4.12) gives

$$\mathbf m_\perp = 0 \tag{7.4.17}$$

and the magnetic equation of state

$$h = 2\,(a + b\,m_1^2)\,m_1\,. \tag{7.4.18}$$

We can simplify this in limiting cases:

α) $T = T_c^0$

$$h = 2b\,m_1^3\,,\quad\text{i.e. } \delta = 3\,. \tag{7.4.19}$$

β) $T > T_c^0$

$$m_1 = \frac{h}{2a} + O(h^3)\,. \tag{7.4.20}$$


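γ) $T < T_c^0$

Inserting $m_1 = m_0\,{\rm sgn}(h) + \Delta m$ yields

$$m_1 = m_0\,{\rm sgn}(h) + \frac{h}{4b\,m_0^2} + O\bigl(h^2\,{\rm sgn}(h)\bigr) = m_0\,{\rm sgn}(h) + \frac{h}{-4a} + O\bigl(h^2\,{\rm sgn}(h)\bigr)\,. \tag{7.4.21}$$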
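The three limiting forms can be checked against a direct numerical solution of the equation of state (7.4.18). The following sketch uses plain bisection on the branch with $h > 0$; the function name `solve_m1` and the numerical values are illustrative assumptions.

```python
import math

def solve_m1(h, a, b):
    """Solve h = 2*(a + b*m^2)*m, Eq. (7.4.18), for the stable root with h > 0."""
    def f(m):
        return 2.0 * (a + b * m * m) * m - h
    lo = math.sqrt(max(0.0, -a) / b)     # m0 below Tc0, zero otherwise; f(lo) = -h < 0
    hi = lo + 1.0
    while f(hi) < 0.0:                   # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):                 # bisection; f is monotonic for m > lo
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

h, b = 1e-4, 1.0
m_at_tc = solve_m1(h, 0.0, b)       # compare with (h / 2b)**(1/3), Eq. (7.4.19)
m_above = solve_m1(h, 1.0, b)       # compare with h / 2a, Eq. (7.4.20)
m_below = solve_m1(h, -1.0, b)      # compare with m0 + h / 4|a|, Eq. (7.4.21)
```

The deviations from the limiting forms are of the order of the neglected terms, $O(h^3)$ above and $O(h^2)$ below $T_c^0$.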

We can now also calculate the magnetic susceptibility for $h = 0$, either by differentiating the equation of state (7.4.18),

$$2\left(a + 3b\,m_1^2\right)\frac{\partial m_1}{\partial h} = 1\,,$$

or directly, by inspection of (7.4.20) and (7.4.21). It follows that the isothermal susceptibility is given by

$$\chi_T = \left(\frac{\partial m_1}{\partial h}\right)_T = \begin{cases} \dfrac{1}{2a} & T > T_c^0\\[1.5ex] \dfrac{1}{4|a|} & T < T_c^0 \end{cases}\,. \tag{7.4.22}$$

The critical exponent $\gamma$ has, as in molecular field theory, a value of $\gamma = 1$.

7.4.3 Fluctuations in the Gaussian Approximation

7.4.3.1 Gaussian Approximation

Next we want to investigate the influence of fluctuations of the magnetization. To this end, we first expand the Ginzburg–Landau functional in terms of the deviations from the most probable state up to second order,

$$\mathbf m(x) = m_1\mathbf e_1 + \mathbf m'(x)\,, \tag{7.4.23}$$

where

$$\mathbf m'(x) = L^{-d/2}\sum_{k\in B}\mathbf m_k\,e^{ikx} \tag{7.4.24}$$

characterizes the deviation from the most probable value. Because of the underlying cell structure, the summation over $k$ is restricted to the Brillouin zone $B: -\frac{\pi}{a_c} < k_i < \frac{\pi}{a_c}$. The condition that $\mathbf m(x)$ be real yields

$$\mathbf m_k^* = \mathbf m_{-k}\,. \tag{7.4.25}$$

A) $T > T_c^0$ and $h = 0$:

In this region, $m_1 = 0$, and the Fourier series (7.4.24) diagonalizes the harmonic part $F_h$ of the Ginzburg–Landau functional:

$$F_h = \int d^dx\,\bigl(a\,\mathbf m'^2 + c\,(\nabla\mathbf m')^2\bigr) = \sum_k (a + ck^2)\,\mathbf m_k\mathbf m_{-k}\,. \tag{7.4.26}$$
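The diagonalization (7.4.26) can be confirmed numerically for a one-component field in $d = 1$. The sketch below builds a real field from a few Fourier modes obeying (7.4.25) and compares the real-space integral with the mode sum; the mode amplitudes, the period `L`, and the coefficients are arbitrary test values.

```python
import cmath
import math

def harmonic_energy_real_space(modes, L, a, c, n_grid=256):
    """Integrate a*m'^2 + c*(dm'/dx)^2 over one period, with
    m'(x) = L**-0.5 * sum_k m_k exp(i k x) as in Eq. (7.4.24), d = 1."""
    dx = L / n_grid
    total = 0.0
    for j in range(n_grid):
        x = j * dx
        m = sum(mk * cmath.exp(1j * k * x) for k, mk in modes.items()) / math.sqrt(L)
        dm = sum(1j * k * mk * cmath.exp(1j * k * x) for k, mk in modes.items()) / math.sqrt(L)
        total += (a * m.real**2 + c * dm.real**2) * dx   # imaginary parts vanish
    return total

def harmonic_energy_fourier(modes, a, c):
    """Right-hand side of Eq. (7.4.26): sum_k (a + c k^2) m_k m_{-k}."""
    return sum((a + c * k * k) * (mk * modes[-k]).real for k, mk in modes.items())

L = 10.0
k1, k2 = 2 * math.pi / L, 2 * math.pi * 3 / L
# Amplitudes chosen so that m_{-k} is the complex conjugate of m_k, Eq. (7.4.25).
modes = {k1: 0.3 + 0.4j, -k1: 0.3 - 0.4j, k2: -0.2 + 0.1j, -k2: -0.2 - 0.1j}

F_real = harmonic_energy_real_space(modes, L, a=1.0, c=0.5)
F_modes = harmonic_energy_fourier(modes, a=1.0, c=0.5)
```

Because the integrand is a trigonometric polynomial, the equally spaced sum over one period is exact up to rounding, so the two numbers agree to machine precision.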

We can now readily calculate the partition function (7.4.6) in the Gaussian approximation above $T_c^0$:

$$Z_G = Z_0\int\prod_k d\mathbf m_k\,e^{-\beta F_h}\,. \tag{7.4.27}$$

We decompose $\mathbf m_k$ into real and imaginary parts, finding for each $k$ and each of the $n$ components of $\mathbf m_k$ a Gaussian integral, so that

$$Z_G = Z_0\prod_k\left(\sqrt{\frac{\pi}{\beta(a+ck^2)}}\,\right)^{n} \tag{7.4.28}$$

results, and thus the free energy (the stationary solution $m_1 = 0$ makes no contribution) is

$$F(T,0) = F_0 - kT\sum_k\frac n2\log\frac{\pi}{\beta(a+ck^2)}\,. \tag{7.4.29}$$

The specific heat, using $\sum_k\cdots = \frac{V}{(2\pi)^d}\int d^dk\ldots$ and Eq. (7.4.8), is then

$$c_{h=0} = -T\,\frac{\partial^2 F/L^d}{\partial T^2} = \frac n2\,k\,(Ta')^2\int\frac{d^dk}{(2\pi)^d}\,\frac{1}{(a+ck^2)^2} + \ldots\,. \tag{7.4.30}$$

The dots stand for less singular terms. We define the quantity

$$\xi = \left(\frac ca\right)^{1/2} = \left(\frac{c}{a'}\right)^{1/2}(T - T_c^0)^{-1/2}\,, \tag{7.4.31}$$

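which diverges in the limit $T \to T_c^0$ and will be found to represent the correlation length in the calculation of the correlation function (7.4.47). By introducing $q = \xi k$ into (7.4.30) as a new integration variable, we find the singular part of the specific heat

$$c^{\rm sing.}_{h=0} = \tilde A_+\,\xi^{4-d} \tag{7.4.32}$$

with the amplitude

$$\tilde A_+ = \frac n2\,k\left(\frac{Ta'}{c}\right)^2\int_{q<\Lambda\xi}\frac{d^dq}{(2\pi)^d}\,\frac{1}{(1+q^2)^2}\,,$$

where the upper limit $\Lambda\xi$ stems from the cutoff $\Lambda$ of the $k$-integration. The cases $d < 4$ and $d > 4$ must be distinguished. For $d > 4$, the radial integral is rewritten as

$$\int_0^{\Lambda\xi}dq\,\frac{q^{d-1}}{(1+q^2)^2} = \int_0^{\Lambda\xi}dq\left(\frac{q^{d-1}}{(1+q^2)^2} - q^{d-5}\right) + \int_0^{\Lambda\xi}dq\,q^{d-5} = -\int_0^{\Lambda\xi}dq\,\frac{q^{d-5}+2q^{d-3}}{(1+q^2)^2} + \frac{1}{d-4}\,(\Lambda\xi)^{d-4}\,.$$

The overall result is summarized in (7.4.35):

$$c^{\rm sing}_{h=0} = \begin{cases} A_+\,(T - T_c^0)^{-\frac{4-d}{2}} & d < 4\\[1ex] \sim\log(T - T_c^0) & d = 4\\[1ex] A - B\,(T - T_c^0)^{\frac{d-4}{2}} & d > 4\,. \end{cases} \tag{7.4.35}$$

For $d \le 4$, the specific heat diverges at $T_c$; for $d > 4$, it exhibits a cusp. The amplitude $A_+$ for $d < 4$ is given by

$$A_+ = \frac n2\,k\,T^2\left(\frac{a'}{c}\right)^{\frac d2}K_d\int_0^\infty dq\,\frac{q^{d-1}}{(1+q^2)^2}\,, \tag{7.4.36}$$

with the angular factor $K_d = \Omega_d/(2\pi)^d$, $\Omega_d$ being the surface area of the $d$-dimensional unit sphere. Below $d = 4$, the critical exponent of the specific heat ($c_{h=0} \sim (T - T_c)^{-\alpha}$) is

$$\alpha = \frac12\,(4-d)\,; \tag{7.4.37}$$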


in particular, for $d = 3$ in the Gaussian approximation, $\alpha = \frac12$. Comparison with exact results and experiments shows that the Gaussian approximation overestimates the fluctuations.

B) $T < T_c^0$

Now we turn to the region $T < T_c^0$ and distinguish between the longitudinal ($m_1$) and the transverse components ($m_i$),

$$m_1(x) = m_1 + m_1'(x)\,,\qquad m_i(x) = m_i'(x)\quad\text{for } i \ge 2\,, \tag{7.4.38}$$

with the Fourier components $m'_{1k}$ and $m'_{ik}$, where the latter are present only for $n \ge 2$. In the present context, including non-integer values of $d$, vectors will be denoted by just $x$, etc. From (7.4.10), we find for the Ginzburg–Landau functional in second order in the fluctuations:

$$F_h[m] = F[m_1] + \sum_k\Bigl\{\Bigl(-2a + \frac{3h}{2m_1} + O(h^2) + ck^2\Bigr)|m'_{1k}|^2 + \sum_{i\ge2}\Bigl(\frac{h}{2m_1} + ck^2\Bigr)|m'_{ik}|^2\Bigr\}\,. \tag{7.4.39}$$

To arrive at this expression, the following ancillary calculation was used:

$$\begin{aligned}
&a\bigl(m_1^2 + 2m_1m_1' + m_1'^2 + m_\perp'^2\bigr) + \frac b2\bigl(m_1^4 + 4m_1^3m_1' + 6m_1^2m_1'^2 + 2m_1^2m_\perp'^2\bigr) - h\,(m_1 + m_1')\\
&\qquad = a\,m_1^2 + \frac b2\,m_1^4 - h\,m_1 + \bigl(a + 3b\,m_1^2\bigr)m_1'^2 + \underbrace{\bigl(a + b\,m_1^2\bigr)}_{\frac{h}{2m_1}}\,m_\perp'^2\,.
\end{aligned}$$

Analogously to the computation leading from (7.4.26) to (7.4.29), we find for the free energy of the low-temperature phase at $h = 0$

$$F(T,0) = F_0(T,h) + F_{\rm G.L.}(T,0) - kT\,\frac12\sum_k\Bigl\{\log\frac{\pi}{\beta(2|a| + ck^2)} + (n-1)\log\frac{\pi}{\beta ck^2}\Bigr\}\,. \tag{7.4.40}$$

The first term results from $Z_0$; the second from $F[m_1]$, the stationary solution considered in the Ginzburg–Landau approximation; the third term from the longitudinal fluctuations; and the fourth from the transverse fluctuations. The latter do not contribute to the specific heat, since their energy is temperature independent for $h = 0$:

$$c_{h=0} = T\,\frac{a'^2}{b} + \tilde A_-\,\xi^{4-d} = T\,\frac{a'^2}{b} + A_-\,(T_c - T)^{-\frac{4-d}{2}}\,, \tag{7.4.41}$$

where the low-temperature correlation length

$$\xi = \left(\frac{2|a|}{c}\right)^{-1/2} = \left(\frac{c}{2a'}\right)^{1/2}(T_c^0 - T)^{-1/2}\,,\qquad T < T_c^0 \tag{7.4.42}$$

is to be inserted. The amplitudes in (7.4.32) and (7.4.41) obey the relations

$$\tilde A_- = \frac4n\,\tilde A_+\,,\qquad A_- = \frac{2^{d/2}}{n}\,A_+\,. \tag{7.4.43}$$
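The amplitude ratio in (7.4.43) can be reproduced numerically in $d = 3$ by evaluating the leading term of (7.4.30) on both sides of the transition at equal $|T - T_c^0|$. The sketch assumes units with Boltzmann constant equal to one, sends the cutoff $\Lambda$ to infinity (allowed for $d < 4$), and treats the prefactor $T$ as a constant; all parameter values are illustrative.

```python
import math

def shell_integral(mass, c, n_theta=20000):
    """int d^3k/(2 pi)^3 (mass + c k^2)^-2, via k = tan(theta) and the midpoint rule."""
    total = 0.0
    d_theta = 0.5 * math.pi / n_theta
    for j in range(n_theta):
        theta = (j + 0.5) * d_theta
        k = math.tan(theta)
        jac = 1.0 / math.cos(theta) ** 2              # dk/dtheta
        total += k * k / (mass + c * k * k) ** 2 * jac * d_theta
    return total * 4.0 * math.pi / (2.0 * math.pi) ** 3

def c_sing_above(t, n, a_prime, c, T=1.0):
    """n components, each with a = a'*t in the Gaussian weight (T > Tc0)."""
    return 0.5 * n * (T * a_prime) ** 2 * shell_integral(a_prime * t, c)

def c_sing_below(t, a_prime, c, T=1.0):
    """One longitudinal component with 2|a| = 2*a'*t in place of a (T < Tc0)."""
    return 0.5 * (T * 2.0 * a_prime) ** 2 * shell_integral(2.0 * a_prime * t, c)

n = 3
ratio = c_sing_above(1e-2, n, 1.0, 1.0) / c_sing_below(1e-2, 1.0, 1.0)
expected = n / 2 ** 1.5                               # A_+ / A_- from Eq. (7.4.43), d = 3
```

The substitution $k = \tan\theta$ maps the infinite $k$-range onto a finite interval, so no explicit cutoff is needed in the numerical integration.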

The ratio of the amplitudes of the singular contribution to the specific heat depends only on the number of components $n$ and the spatial dimension $d$, and is in this sense universal. The transverse fluctuations do not contribute to the specific heat below $T_c$; therefore, the factor $\frac1n$ enters the amplitude ratio.

7.4.3.2 Correlation Functions

We now calculate the correlation functions in the Gaussian approximation. We start by considering $T > T_c^0$. In order to calculate this type of quantity, with which we shall meet up repeatedly later, we introduce the generating functional

$$Z[h] = \frac{1}{Z_G}\int\prod_k dm_k\,e^{-\beta F_h + \sum_k h_k m_{-k}} = \frac{1}{Z_G}\int\prod_k dm_k\,e^{-\beta\sum_k (a+ck^2)|m_k|^2 + \sum_k h_k m_{-k}}\,. \tag{7.4.44}$$

To evaluate the Gaussian integrals in (7.4.44), we introduce the substitution

$$\tilde m_k = m_k - \frac{1}{2\beta}\,(a + ck^2)^{-1}\,h_k\,, \tag{7.4.45}$$

obtaining

$$Z[h] = \exp\Bigl\{\frac{1}{4\beta}\sum_k\frac{1}{a+ck^2}\,h_k h_{-k}\Bigr\}\,. \tag{7.4.46}$$

Evidently,

$$\langle m_k m_{-k'}\rangle = \frac{\partial^2}{\partial h_{-k}\,\partial h_{k'}}\,Z[h]\Big|_{h=0}\,,$$

from which we find the correlation function by making use of (7.4.46):

$$\langle m_k m_{-k'}\rangle = \delta_{kk'}\,\frac{1}{2\beta(a+ck^2)} \equiv \delta_{k,k'}\,G(k)\,. \tag{7.4.47}$$
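The result (7.4.47) can also be checked by sampling. In the sketch below a single pair $(k, -k)$ of one component is drawn from the Gaussian weight of (7.4.44); since the pair appears twice in the sum over $k$, the real and imaginary parts of $m_k$ each have variance $1/[4\beta(a + ck^2)]$. Sample size, seed, and parameters are arbitrary choices.

```python
import math
import random

def sample_mode_variance(beta, a, c, k, n_samples=100000, seed=1):
    """Monte Carlo estimate of <|m_k|^2> for one pair (k, -k).

    The pair contributes 2*beta*(a + c k^2)*|m_k|^2 to the exponent, so Re m_k and
    Im m_k are independent Gaussians with variance 1/(4*beta*(a + c k^2)).
    """
    rng = random.Random(seed)
    sigma = math.sqrt(1.0 / (4.0 * beta * (a + c * k * k)))
    acc = 0.0
    for _ in range(n_samples):
        re, im = rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)
        acc += re * re + im * im
    return acc / n_samples

def G(beta, a, c, k):
    """Gaussian correlation function, Eq. (7.4.47)."""
    return 1.0 / (2.0 * beta * (a + c * k * k))

estimate = sample_mode_variance(beta=2.0, a=0.5, c=1.0, k=0.8)
exact = G(beta=2.0, a=0.5, c=1.0, k=0.8)
```

The estimate agrees with the closed form within the statistical error, which here is a fraction of a percent.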


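Here, we have taken into account the fact that in the sum over $k$ in (7.4.46), each term $h_k h_{-k} = h_{-k}h_k$ occurs twice. From the last equation, the meaning of the correlation length (7.4.31) becomes clear, since in real space, Eq. (7.4.47) gives

$$\langle m(x)\,m(x')\rangle = \frac{1}{L^d}\sum_k\frac{e^{ik(x-x')}}{2\beta(a+ck^2)} = \int\frac{d^dk}{(2\pi)^d}\,\frac{e^{ik(x-x')}}{2\beta c\,(\xi^{-2}+k^2)} = \frac{\xi^{2-d}}{2\beta c}\int\frac{d^dq}{(2\pi)^d}\,\frac{e^{iq(x-x')/\xi}}{1+q^2}\,.$$

[...]

For $T < T_c^0$ and $n > 1$, one must distinguish between the longitudinal correlation function and the transverse ($i \ge 2$) correlation function:

$$G_\parallel(k) = \langle m'_{1k}\,m'_{1-k}\rangle \quad\text{and}\quad G_\perp(k) = \langle m'_{ik}\,m'_{i-k}\rangle\,. \tag{7.4.52}$$

For $n = 1$, only $G_\parallel(k)$ is relevant. From (7.4.39), it follows in analogy to (7.4.47) that

$$G_\parallel(k) = \frac{1}{2\beta\bigl[-2a + \frac{3h}{2m_1} + ck^2\bigr]} \;\xrightarrow{\,h\to0\,}\; \frac{1}{2\beta\bigl[2a'(T_c^0 - T) + ck^2\bigr]} \tag{7.4.53}$$

and

$$G_\perp(k) = \frac{1}{2\beta\bigl[\frac{h}{2m_1} + ck^2\bigr]} \;\xrightarrow{\,h\to0\,}\; \frac{1}{2\beta ck^2}\,, \tag{7.4.54a}$$

$$G_\perp(0) = \frac{T\,m_1}{h}\,. \tag{7.4.54b}$$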


The divergence of the transverse susceptibility (correlation function) (7.4.54a) at $h = 0$ is a result of rotational invariance, owing to which it costs no energy to rotate the magnetization.

We first want to summarize the results of the Gaussian approximation, then treat the limits of its validity, and finally, in Sect. 7.4.4.1, discuss the form of the correlation functions below $T_c^0$ in a more general way. In summary for the critical exponents, we have:

$$\alpha_{\rm Fluct} = 2 - \frac d2\,,\quad \beta = \frac12\,,\quad \gamma = 1\,,\quad \delta = 3\,,\quad \nu = \frac12\,,\quad \eta = 0 \tag{7.4.55}$$

and for the amplitude ratios of the specific heat, the longitudinal correlation function and the isothermal susceptibility:

$$\frac{\tilde A_+}{\tilde A_-} = \frac n4\,,\qquad \frac{\tilde C_+}{\tilde C_-} = 1\,,\qquad\text{and}\qquad \frac{C_+}{C_-} = 2\,. \tag{7.4.56}$$

The amplitudes are defined in (7.4.32), (7.4.41), (7.4.57), and (7.4.58):

$$G(k) = \tilde C_\pm\,\frac{\xi^2}{1 + (\xi k)^2}\,,\qquad \tilde C_\pm = \frac{1}{2\beta c}\,, \tag{7.4.57}$$

$$\chi = C_\pm\,|T - T_c|^{-1}\,,\qquad T \gtrless T_c\,. \tag{7.4.58}$$

7.4.3.3 Range of Validity of the Gaussian Approximation

The range of validity of the Gaussian approximation and of more elaborate perturbation-theoretical calculations can be estimated by comparing the higher orders with lower orders. For example, the fourth order must be much smaller than the second, or the Gaussian contribution to the specific heat must be smaller than the stationary value. The Ginzburg–Landau approximation is permissible if the fluctuations are small compared to the stationary value, i.e. from Eqns. (7.4.16) and (7.4.41),

$$\Delta c \gg \xi^{4-d}\left(\frac{Ta'}{c}\right)^2 N\,, \tag{7.4.59}$$

where $N$ is a numerical factor. Then we require that

$$\tau^{(4-d)/2} \gg \frac{N}{\xi_0^d\,\Delta c} \tag{7.4.60}$$

with $\tau = \dfrac{T - T_c^0}{T_c^0}$ and $\xi_0 = \sqrt{\dfrac{c}{a'\,T_c^0}}$.

For dimensions $d < 4$, the Ginzburg–Landau approximation fails near $T_c^0$. From (7.4.60), we find a characteristic temperature $\tau_{\rm GL} = \Bigl(\dfrac{N}{\xi_0^d\,\Delta c}\Bigr)^{2/(4-d)}$,


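Table 7.3. The correlation length and the critical region

System               Correlation length        Critical region
Superconductors$^{17}$   $\xi_0 \sim 10^3$ Å      $\tau_{\rm GL} = 10^{-10} - 10^{-14}$
Magnets              $\xi_0 \sim$ Å            $\tau_{\rm GL} \sim 10^{-2}$
λ-Transition         $\xi_0 \sim 4$ Å          $\tau_{\rm GL} \sim 0.3$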

the so-called Ginzburg–Levanyuk temperature; it depends on the Ginzburg–Landau parameters (see Table 7.3). In this connection, $d_c = 4$ appears as a limiting dimension (upper critical dimension). For $d < 4$, the Ginzburg–Landau approximation fails when $\tau < \tau_{\rm GL}$. It is then no longer sufficient to add the fluctuation contribution; instead, one has to take interactions between the fluctuations into account. Above four dimensions, the corrections to the Gaussian approximation on approaching $T_c^0$ become smaller, so that there, the Gaussian approximation applies. For $d > 4$, the exponent of the fluctuation contribution is negative, from Eq. (7.4.35): $\alpha_{\rm Fluct} < 0$. Then the ratio $\dfrac{c_{h=0}(T_c^0)}{\Delta c}$ can be $\gtrless 1$.

7.4.4 Continuous Symmetry and Phase Transitions of First Order

7.4.4.1 Susceptibilities for $T < T_c$

A) Transverse Susceptibility

We found for the transverse correlation function (7.4.54a) that $G_\perp(k) = \dfrac{1}{2\beta\bigl[\frac{h}{2m_1} + ck^2\bigr]}$, and we now want to show that the relation $G_\perp(0) = \dfrac{T\,m_1}{h}$ is a general result of rotational invariance. To this end, we imagine that an external field $\mathbf h$ acts on a ferromagnet. Now we investigate the influence of an additional infinitesimal, transverse field $\delta\mathbf h$ which is perpendicular to $\mathbf h$:

$$\sqrt{(\mathbf h + \delta\mathbf h)^2} = \sqrt{h^2 + \delta h^2} = h + \frac{\delta h^2}{2h} + \ldots\,.$$

Thus, the magnitude of the field is changed by only $O(\delta h^2)$; for a small $\delta h$, this is equivalent to a rotation of the field through the angle $\frac{\delta h}{h}$ (Fig. 7.16).

Fig. 7.16. The field $\mathbf h$ and the additional, infinitesimal transverse field $\delta\mathbf h$

17 According to BCS theory, $\xi_0 \sim 0.18\,\frac{\hbar v_F}{kT_c}$. In pure metals, $m = m_e$, $v_F = 10^8\,\frac{\rm cm}{\rm s}$, $T_c$ is low, $\xi_0 = 1000 - 16\,000$ Å. The A-15 compounds Nb$_3$Sn and V$_3$Ga have flat bands, so that $m$ is large, $v_F = 10^6\,\frac{\rm cm}{\rm s}$, $T_c$ is higher, and $\xi_0 = 50$ Å. The situation is different in high-$T_c$ superconductors; there, $\xi_0 \sim$ Å.


The magnetization rotates through the same angle; this means that $\dfrac{\delta m}{m} = \dfrac{\delta h}{h}$, and we obtain for the transverse susceptibility,

$$\chi_\perp \equiv \frac{\delta m}{\delta h} = \frac mh\,. \tag{7.4.61}$$
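The rotation argument can be tested numerically within the Ginzburg–Landau approximation for $n = 2$. The sketch assumes the uniform stationary state of (7.4.12), for which the magnetization is parallel to the field and its magnitude follows from (7.4.18); the parameter values and function names are illustrative.

```python
import math

def magnitude(h, a, b):
    """|m| from h = 2*(a + b*m^2)*m, Eq. (7.4.18), for h > 0 and a < 0 (bisection)."""
    def f(m):
        return 2.0 * (a + b * m * m) * m - h
    lo = math.sqrt(-a / b)                 # f(m0) = -h < 0
    hi = lo + 1.0
    while f(hi) < 0.0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def magnetization(h1, h2, a, b):
    """Uniform stationary state for n = 2: m is parallel to the field (h1, h2)."""
    h = math.hypot(h1, h2)
    m = magnitude(h, a, b)
    return m * h1 / h, m * h2 / h

a, b, h, dh = -1.0, 1.0, 0.05, 1e-6
m1, _ = magnetization(h, 0.0, a, b)        # longitudinal magnetization in the field h
_, dm_perp = magnetization(h, dh, a, b)    # response to a small transverse field dh
chi_perp = dm_perp / dh
```

The transverse response equals $m_1/h$ up to corrections of relative order $(\delta h/h)^2$, independently of the values of $a$ and $b$.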

The transverse correlation function in the Gaussian approximation (7.4.54a) is in agreement with this general result.

Remarks concerning the spatial dependence of the transverse correlation function $G_\perp(r)$:

(i)

$$G_\perp(r, h=0) = \frac{1}{2\beta c}\int\frac{d^dk}{(2\pi)^d}\,\frac{e^{ikx}}{k^2} = A\left(\frac{\xi_\perp}{r}\right)^{d-2},\qquad \xi_\perp = (2\beta c)^{-\frac{1}{d-2}}\,. \tag{7.4.62}$$

Employing the volume element $d^dk = dk\,k^{d-1}(\sin\theta)^{d-2}\,d\theta\,d\Omega_{d-1}$, the integral in (7.4.62) becomes

$$\frac{\Omega_{d-1}}{(2\pi)^d}\int_0^\infty dk\,k^{d-1}\int_0^\pi d\theta\,e^{ikr\cos\theta}(\sin\theta)^{d-2}\,\frac{1}{2\beta ck^2} = \frac{K_{d-1}}{2\beta c\,2\pi}\int_0^\infty dk\,k^{d-3}\,\Gamma\Bigl(\frac d2 - \frac12\Bigr)\Gamma\Bigl(\frac12\Bigr)\frac{2^{\frac d2-1}\,J_{\frac d2-1}(kr)}{(kr)^{\frac d2-1}} \sim r^{-(d-2)}\,.^{18}$$

For dimensional reasons, $G_\perp(r)$ must be of the form

$$G_\perp(r) \sim M^2\left(\frac{\xi}{r}\right)^{d-2},$$

i.e. the transverse correlation length from Eq. (7.4.62) is

$$\xi_\perp = \xi\,M^{\frac{2}{d-2}} \propto \tau^{-\nu}\,\tau^{\frac{2\beta}{d-2}} = \tau^{\eta\nu/(d-2)}\,, \tag{7.4.63}$$

where the exponent was rearranged using the scaling relations.

(ii) We also compute the local transverse fluctuations of the magnetization from

$$G_\perp(r=0) \sim \int_0^\Lambda dk\,\frac{k^{d-1}}{\frac{h}{2m_1} + ck^2} \sim \left(\frac{2m_1c}{h}\right)^{\frac{-d+2}{2}}\int_0^{\Lambda\sqrt{2m_1c/h}}dq\,\frac{q^{d-1}}{1+q^2}$$

18 I.S. Gradshteyn and I.M. Ryzhik, Table of Integrals, Series, and Products (Academic Press, New York 1980), Eq. 8.411.7


and consider the limit $h \to 0$: the result is

$$\begin{cases} \text{finite} & \text{for } d > 2\\[0.5ex] \log h & \text{for } d = 2\\[0.5ex] \Bigl(\dfrac{m_1}{h}\Bigr)^{\frac{2-d}{2}} \xrightarrow{\,h\to0\,} \infty \ \text{ if } m_1 \neq 0 & \text{for } d < 2\,. \end{cases}$$

For $d \le 2$, the transverse fluctuations diverge in the limit $h \to 0$. As a result, for $d \le 2$, we must have $m_1 = 0$.

B) Longitudinal Correlation Function

In the Gaussian approximation, we found for $T < T_c$ in Eq. (7.4.53) that

$$\lim_{k\to0}\,\lim_{h\to0}\,G_\parallel(k) = \frac{1}{-4\beta a}\,,$$

as for $n = 1$. In fact, one would expect that the strong transverse fluctuations would modify the behavior of $G_\parallel(k)$. Going beyond the Gaussian approximation, we now calculate the contribution of orientation fluctuations to the longitudinal fluctuations. We consider a rotation of the magnetization at the point $x$ and decompose the change $\delta\mathbf m$ into a component $\delta m_1$ parallel and a vector $\delta\mathbf m_\perp$ perpendicular to $\mathbf m_0$ (Fig. 7.17). The invariance of the length yields the condition

$$m_0^2 = m_0^2 + 2m_0\,\delta m_1 + \delta m_1^2 + (\delta\mathbf m_\perp)^2\,;$$

and it follows from this, owing to $|\delta m_1| \ll m_0$, that

$$\delta m_1 = -\frac{1}{2m_0}\,(\delta\mathbf m_\perp)^2\,. \tag{7.4.64}$$

Fig. 7.17. The rotation of the spontaneous magnetization in isotropic systems

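For the correlation of the longitudinal fluctuations, one obtains from this the following relation to the transverse fluctuations:

$$\langle\delta m_1(x)\,\delta m_1(0)\rangle = \frac{1}{4m_0^2}\,\bigl\langle\delta\mathbf m_\perp^2(x)\,\delta\mathbf m_\perp^2(0)\bigr\rangle\,. \tag{7.4.65}$$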

We now factor this correlation function into the product of two transverse correlation functions, Eq. (7.4.54a), and obtain from it the Fourier-transformed longitudinal correlation function


$$G_\parallel(k=0) \sim \int d^dx\,\frac{e^{-2r/\sqrt{m_1/h}}}{r^{2(d-2)}} \sim \left(\frac{h}{m_1}\right)^{\frac d2-2}. \tag{7.4.66}$$

In three dimensions, we find from this for the longitudinal susceptibility

$$kT\,\frac{\partial m_1}{\partial h} = G_\parallel(k=0) \sim h^{-\frac12}\,. \tag{7.4.66'}$$

In the vicinity of the critical point $T_c$, we found $m \sim h^{\frac1\delta}$ (see just after Eq. (7.2.4b)); in contrast to (7.4.66), this yields

$$\frac{\partial m_1}{\partial h} \sim h^{-\frac{\delta-1}{\delta}}\,.$$

In isotropic systems, the longitudinal susceptibility is singular not only in the critical region, but in the whole coexistence region for $h \to 0$ (cf. Fig. 7.18). This is a result of rotational invariance.

Fig. 7.18. Singularities in the longitudinal susceptibility in systems with internal rotational symmetry, n ≥ 2
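The $h^{-1/2}$ law in three dimensions can be illustrated by integrating the square of the transverse correlation function numerically. The sketch assumes $d = 3$, where the Fourier transform of (7.4.54a) is the Yukawa form $e^{-r/\ell}/(8\pi\beta c\,r)$ with $\ell = \sqrt{2m_1c/h}$, and it ignores the constant prefactor of the factorization of (7.4.65); the numerical parameters are arbitrary.

```python
import math

def g_perp(r, h, m1, c=1.0, beta=1.0):
    """Real-space transform of Eq. (7.4.54a) in d = 3 (Yukawa form)."""
    ell = math.sqrt(2.0 * m1 * c / h)
    return math.exp(-r / ell) / (8.0 * math.pi * beta * c * r)

def g_par_k0(h, m1, c=1.0, beta=1.0, n_r=20000, r_max_in_ell=40.0):
    """int d^3x g_perp(r)^2 by the midpoint rule, cut off at r_max_in_ell * ell."""
    ell = math.sqrt(2.0 * m1 * c / h)
    dr = r_max_in_ell * ell / n_r
    total = 0.0
    for j in range(n_r):
        r = (j + 0.5) * dr
        total += 4.0 * math.pi * r * r * g_perp(r, h, m1, c, beta) ** 2 * dr
    return total

h1, h2, m1 = 1e-3, 4e-3, 1.0
slope = math.log(g_par_k0(h2, m1) / g_par_k0(h1, m1)) / math.log(h2 / h1)
```

The logarithmic slope is $-1/2$: the integral is proportional to the range $\ell \sim \sqrt{m_1/h}$ of the transverse correlations, which is the origin of the coexistence singularity.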

C) Coexistence Singularities

The term coexistence region denotes the region of the phase diagram with a finite magnetization in the limiting case of $h \to 0$. The coexistence singularities found in (7.4.54a), (7.4.62), and (7.4.66) for isotropic systems are exactly valid. This can be shown as follows: for $T < T_c^0$, the Ginzburg–Landau functional can be written in the form

$$\begin{aligned}
F[\mathbf m] &= \int d^dx\,\Bigl\{\frac12\,b\Bigl(\mathbf m^2 - \frac{|a|}{b}\Bigr)^2 + c\,(\nabla\mathbf m)^2 - \mathbf h\mathbf m - \frac{|a|^2}{2b}\Bigr\}\\
&= \int d^dx\,\Bigl\{\frac12\,b\Bigl(m_1^2 + 2m_1m_1'(x) + m_1'(x)^2 + \mathbf m_\perp'(x)^2 - \frac{|a|}{b}\Bigr)^2\\
&\qquad\qquad + c\,\bigl(\nabla m_1'(x)\bigr)^2 + c\,\bigl(\nabla\mathbf m_\perp'(x)\bigr)^2 - h\bigl(m_1 + m_1'(x)\bigr) - \frac{|a|^2}{2b}\Bigr\}\,.
\end{aligned} \tag{7.4.67}$$

In this expression, we have inserted (7.4.38) and have combined the components $m_i'(x)$, $i \ge 2$, into a vector of the transverse fluctuations $\mathbf m_\perp'(x) = (0, m_2'(x), \ldots, m_n'(x))$. Using (7.4.18) and $m_1'(x) \ll m_1$, one obtains

$$F[\mathbf m] = \int d^dx\,\Bigl\{\frac12\,b\Bigl(2m_1m_1' + \mathbf m_\perp'^2 + \frac{h}{2b\,m_1}\Bigr)^2 + c\,(\nabla m_1')^2 + c\,(\nabla\mathbf m_\perp')^2 - h\,(m_1 + m_1') - \frac{|a|^2}{2b}\Bigr\}\,. \tag{7.4.68}$$

The terms which are nonlinear in the transverse fluctuations are absorbed into the longitudinal terms by making the substitution

$$m_1' = m_1'' - \frac{\mathbf m_\perp'^2}{2m_1}\,: \tag{7.4.69}$$

$$F[\mathbf m] = \int d^dx\,\Bigl\{2b\,m_1^2\,m_1''^2 + c\,(\nabla m_1'')^2 + \frac{h}{2m_1}\,\mathbf m_\perp'^2 + c\,(\nabla\mathbf m_\perp')^2 - h\,m_1 + \frac{h^2}{8b\,m_1^2} - \frac{|a|^2}{2b}\Bigr\}\,. \tag{7.4.70}$$

The final result for the free energy is harmonic in the variables $m_1''$ and $\mathbf m_\perp'$. As a result, the transverse propagator in the coexistence region is given exactly by (7.4.54a). The longitudinal correlation function is

$$\langle m_1'(x)\,m_1'(0)\rangle_C = \langle m_1''(x)\,m_1''(0)\rangle + \frac{1}{4m_1^2}\,\bigl\langle\mathbf m_\perp'(x)^2\,\mathbf m_\perp'(0)^2\bigr\rangle_C\,. \tag{7.4.71}$$

In equation (7.4.70), terms of the form $\bigl(\nabla\frac{\mathbf m_\perp'^2}{m_1}\bigr)^2$ and $\nabla m_1''\,\nabla\frac{\mathbf m_\perp'^2}{m_1}$ have been neglected. The second term in (7.4.69) leads to a reduction of the order parameter, $-\frac{\langle\mathbf m_\perp'^2\rangle}{2m_1}$. Eq. (7.4.71) gives the cumulant, i.e. the correlation function of the deviations from the mean value. Since (7.4.70) now contains only harmonic terms, the factorization of the second term in the sum in (7.4.71) is exact, as used in Eq. (7.4.65). One could still raise the objection to the derivation of (7.4.71) that a number of terms were neglected. However, using renormalization group theory$^{19}$, it can be shown that the anomalies of the coexistence region are described by a low-temperature fixed point at which $m_0 = \infty$. This means that the result is asymptotically exact.

19 I. D. Lawrie, J. Phys. A14, 2489 (1981); ibid., A18, 1141 (1985); U. C. Täuber and F. Schwabl, Phys. Rev. B46, 3337 (1992).

7.4.4.2 First-Order Phase Transitions

There are systems in which not only the transition from one orientation of the order parameter to the opposite direction is of first order, but also the transition at $T_c$. This means that the order parameter jumps at $T_c$ from zero to a finite value (an example is the ferroelectric transition in BaTiO$_3$). This situation can be described in the Ginzburg–Landau theory, if $b < 0$,


and if a term of the form $\frac12 v\,m^6$ with $v > 0$ is added for stability. Then the Ginzburg–Landau functional takes the form

$$F = \int d^dx\,\Bigl\{a\,m^2 + c\,(\nabla m)^2 + \frac12\,b\,m^4 + \frac12\,v\,m^6\Bigr\}\,, \tag{7.4.72}$$

where $a = a'\,(T - T_c^0)$. The free energy density is shown in Fig. 7.19 for a uniform order parameter.

Fig. 7.19. The free energy density in the vicinity of a ﬁrst-order phase transition at temperatures T < Tc0 , T ≈ Tc0 , T = Tc , T < T1 , T > T1

For $T > T_1$, there is only the minimum at $m = 0$, that is the non-ordered state. At $T_1$, a second relative minimum appears, which for $T \le T_c$ finally becomes deeper than that at $m = 0$. For $T < T_c^0$, the $m = 0$ state is unstable. The stationarity condition is

$$\Bigl(a + b\,m^2 + 3\,\frac v2\,m^4\Bigr)m = 0\,, \tag{7.4.73}$$

and the condition that a minimum is present is

$$\frac12\,\frac{\partial^2 f}{\partial m^2} = a + 3b\,m^2 + 15\,\frac v2\,m^4 > 0\,. \tag{7.4.74}$$

The solutions of the stationarity condition are

$$m_0 = 0 \tag{7.4.75a}$$

and

$$m_0^2 = -\frac{b}{3v}\ \overset{+}{(-)}\ \Bigl(\frac{b^2}{9v^2} - \frac{2a}{3v}\Bigr)^{1/2}\,. \tag{7.4.75b}$$

We recall that b < 0. The nonzero solution with the minus sign corresponds to a maximum in the free energy and will be left out of further consideration.


The minimum (7.4.75b) exists for all temperatures for which the discriminant is positive, i.e. below the temperature $T_1$,

$$T_1 = T_c^0 + \frac{b^2}{6va'}\,. \tag{7.4.76}$$

$T_1$ is the superheating temperature (see Fig. 7.19 and below). The transition temperature $T_c$ is found from the condition that the free energy for (7.4.75b) is zero. At this temperature (see Fig. 7.19), the free energy has a double zero at $m^2 = m_0^2$ and thus has the form

$$\frac v2\bigl(m^2 - m_0^2\bigr)^2 m^2 = \Bigl(a + \frac b2\,m^2 + \frac v2\,m^4\Bigr)m^2 = \Bigl[\frac v2\Bigl(m^2 + \frac{b}{2v}\Bigr)^2 - \frac{b^2}{8v} + a\Bigr]m^2 = 0\,.$$

It follows from this that $a = \frac{b^2}{8v}$ and $m_0^2 = -\frac{b}{2v}$, which both lead to

$$T_c = T_c^0 + \frac{b^2}{8va'}\,. \tag{7.4.77}$$

For $T < T_c^0$, there is a local maximum at $m = 0$. $T_c^0$ plays the role of a supercooling temperature. In the range $T_c^0 \le T \le T_1$, both phases can thus coexist, i.e. the supercooling or superheating of a phase is possible. For $T_c^0 \le T < T_c$, the non-ordered phase ($m_0 = 0$) is metastable; for $T_1 \ge T > T_c$, in contrast, the ordered phase ($m_0 \neq 0$) is metastable. On slow cooling, so that the system attains the state of lowest free energy, $m_0$ jumps at $T_c$ from 0 to

$$m_0^2(T_c) = -\frac{b}{3v} + \Bigl(\frac{b^2}{9v^2} - \frac{b^2}{12v^2}\Bigr)^{1/2} = -\frac{b}{2v}\,, \tag{7.4.78}$$

and, below $T_c$, it has the temperature dependence (Fig. 7.20)

$$m_0^2(T) = \frac23\,m_0^2(T_c)\left[1 + \sqrt{1 - \frac34\,\frac{(T - T_c^0)}{(T_c - T_c^0)}}\,\right].$$

Fig. 7.20. The temperature dependence of the magnetization in a first-order phase transition
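The characteristic temperatures and the jump of the order parameter can be checked with a few lines of code. The sketch below uses hypothetical parameters with $b < 0 < v$ and the uniform free energy density of (7.4.72); the helper names are illustrative.

```python
import math

def f_density(m, T, a_prime, b, v, Tc0):
    """Uniform free energy density from Eq. (7.4.72)."""
    a = a_prime * (T - Tc0)
    return a * m**2 + 0.5 * b * m**4 + 0.5 * v * m**6

def ordered_m0_sq(T, a_prime, b, v, Tc0):
    """Plus-sign root of Eq. (7.4.75b); None where the discriminant is negative."""
    a = a_prime * (T - Tc0)
    disc = b * b / (9.0 * v * v) - 2.0 * a / (3.0 * v)
    return None if disc < 0.0 else -b / (3.0 * v) + math.sqrt(disc)

def m0_sq_reduced(T, Tc, Tc0, m0_sq_at_Tc):
    """Closed form for m0^2(T) below Tc, written in terms of m0^2(Tc)."""
    return (2.0 / 3.0) * m0_sq_at_Tc * (1.0 + math.sqrt(1.0 - 0.75 * (T - Tc0) / (Tc - Tc0)))

a_prime, b, v, Tc0 = 1.0, -1.0, 1.0, 1.0          # hypothetical parameters, b < 0 < v
T1 = Tc0 + b * b / (6.0 * v * a_prime)            # superheating temperature, Eq. (7.4.76)
Tc = Tc0 + b * b / (8.0 * v * a_prime)            # transition temperature, Eq. (7.4.77)
jump = ordered_m0_sq(Tc, a_prime, b, v, Tc0)      # m0^2 just below Tc, Eq. (7.4.78)
```

At $T_c$ the ordered minimum is degenerate with the state $m = 0$, and slightly above $T_1$ the ordered solution no longer exists, in line with the ordering $T_c^0 < T_c < T_1$.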

∗7.4.5 The Momentum-Shell Renormalization Group

The RG theory can also be carried out in the framework of the G–L functional, with the following advantages compared to discrete spin models: the method is also practicable in higher dimensions, and various interactions and symmetries can be treated. One employs an expansion of the critical exponents in $\epsilon = 4 - d$. Here, we cannot go into the details of the necessary perturbation-theoretical techniques, but rather just show the essential structure of the renormalization group recursion relations and their consequences. For the detailed calculation, the reader is referred to more extensive descriptions$^{20,21}$ and to the literature at the end of this chapter.

7.4.5.1 Wilson's RG Scheme

We now turn to the renormalization group transformation for the Ginzburg–Landau functional (7.4.10). In order to introduce the notation which is usual in this context, we carry out the substitutions

$$m = \frac{1}{\sqrt{2c}}\,\phi\,,\qquad a = rc\,,\qquad b = 2uc^2\qquad\text{and}\qquad h \to \sqrt{2c}\,h\,, \tag{7.4.79}$$

and obtain the so-called Landau–Ginzburg–Wilson functional:

$$F[\phi] = \int d^dx\,\Bigl\{\frac r2\,\phi^2 + \frac u4\,(\phi^2)^2 + \frac12\,(\nabla\phi)^2 - h\phi\Bigr\}\,. \tag{7.4.80}$$

An intuitively appealing method of proceeding was proposed by K. G. Wilson²⁰,²¹. Essentially, the trace over the degrees of freedom with large k in momentum space is evaluated, and one thereby obtains recursion relations for the Ginzburg–Landau coefficients. Since it is to be expected that the detailed form of the short-wavelength fluctuations is not of great importance, the Brillouin zone can be approximated as simply a d-dimensional sphere of radius (cutoff) Λ, Fig. 7.21.

Fig. 7.21. The momentum-space RG: the partial trace is performed over the Fourier components $\phi_k$ with momenta within the shell $\Lambda/b < |k| < \Lambda$

20 K. G. Wilson and J. Kogut, Phys. Rep. 12 C, 76 (1974).
21 S. Ma, Modern Theory of Critical Phenomena, Benjamin, Reading, 1976.

7.4 The Ginzburg–Landau Theory

The momentum-shell RG transformation then consists of the following steps:

(i) Evaluating the trace over all the Fourier components $\phi_k$ with $\Lambda/b < |k| < \Lambda$ (Fig. 7.21) eliminates these short-wavelength modes.

(ii) By means of a scale transformation²²

$$k' = bk , \tag{7.4.81}$$

$$\phi' = b^{\zeta}\phi , \tag{7.4.82}$$

and therefore

$$\phi'_{k'} = b^{\zeta - d}\,\phi_k , \tag{7.4.83}$$

the resulting effective Hamiltonian functional can be brought into a form resembling the original model, whereby effective scale-dependent coupling parameters are defined. Repeated application of this RG transformation (which represents a semigroup, since it has no inverse element) discloses the presumably universal properties of the long-wavelength regime. As in the real-space renormalization group transformation of Sect. 7.3.3, the fixed points of the transformation correspond to the various thermodynamic phases and the phase transitions between them. The eigenvalues of the linearized flow equations in the vicinity of the critical fixed point finally yield the critical exponents (see (7.3.41a,b,c)). Although a perturbational expansion (in terms of u) is in no way justifiable in the critical region, it is completely legitimate at some distance from the critical point, where the fluctuations are negligible. The important observation is now that the RG flow connects these quite different regions, so that the results of the perturbation expansion in the non-critical region can be transported to the vicinity of $T_c$, whereby the non-analytic singularities are consistently, controllably, and reliably taken into account by this mapping. Perturbation-theoretical methods can likewise be applied in the elimination of the short-wavelength degrees of freedom (step (i)).

22 If one considers (7.4.83) together with the field term in the Ginzburg–Landau functional (7.4.10), then it can be seen that the exponent ζ determines the transformation of the external field and is related to $y_h$ from Sect. 7.3.4 via $\zeta = d - y_h$.

7.4.5.2 Gaussian Model

We will now apply the concept described in the preceding section first to the Gaussian model, where u = 0 (see Sect. 7.4.3),


$F_0[\phi_k] =$ […] $|k| < \Lambda$, the result to first order in u includes terms of the following (symbolically written) form (from now on, we set kT equal to 1): (i) $u\,\phi_<^4\,e^{-F_0}$ must merely be re-exponentiated, since these degrees of freedom are not eliminated; (ii) all terms with an odd number of $\phi_<$ or $\phi_>$, such as for example $u\,\phi_<^3\phi_>\,e^{-F_0}$, vanish; (iii) $u\,\phi_>^4\,e^{-F_0}$ makes a constant contribution to the free energy; and finally terms of the form $u\,\phi_<^2\phi_>^2\,e^{-F_0}$, for which the Gaussian integral over the $\phi_>$ can be carried out with the aid of Eq. (7.4.47) for the propagator $\langle \phi^{\alpha}_{k>}\,\phi^{\beta}_{-k'>} \rangle_0 = \frac{\delta_{kk'}\,\delta^{\alpha\beta}}{2(r+k^2)}$, an average value which is calculated with the statistical weight $e^{-F_0}$.


Quite generally, Wick’s theorem²⁰,²¹ states that expressions of the form

$$\Big\langle \prod_{i}^{m} \phi_{k_i>} \Big\rangle_0 \equiv \langle \phi_{k_1>}\,\phi_{k_2>} \cdots \phi_{k_m>} \rangle_0$$

factorize into a sum of products of all possible pairs $\langle \phi_{k>}\,\phi_{-k>} \rangle_0$ if m is even, and otherwise they yield zero. Especially in the treatment of higher orders of perturbation theory, the Feynman diagrams offer a very helpful representation of the large number of contributions which have to be summed in the perturbation expansion. In these diagrams, lines symbolize the propagators and interaction vertices stand for the nonlinear coupling u. With these means at our disposal, we can compute the two-point function $\langle \phi_{k<}\,\phi_{-k<} \rangle$ and the similarly defined four-point function. Using Eq. (7.4.47), one then obtains in the first non-trivial order (“1-loop”, a notation which derives from the graphical representation) the following recursion relation between the initial coefficients $r$, $u$ and the transformed coefficients $r'$, $u'$ of the Ginzburg–Landau–Wilson functional²⁰,²¹:

$$r' = b^2\left[r + (n+2)\,A(r)\,u\right] , \tag{7.4.88}$$

$$u' = b^{4-d}\,u\left[1 - (n+8)\,C(r)\,u\right] , \tag{7.4.89}$$

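where A(r) and C(r) refer to the integrals

$$A(r) = K_d \int_{\Lambda/b}^{\Lambda} \frac{k^{d-1}}{r + k^2}\,dk = K_d\left[\Lambda^{d-2}\,\frac{1 - b^{2-d}}{d-2} - r\,\Lambda^{d-4}\,\frac{1 - b^{4-d}}{d-4}\right] + O(r^2) ,$$

$$C(r) = K_d \int_{\Lambda/b}^{\Lambda} \frac{k^{d-1}}{(r + k^2)^2}\,dk = K_d\,\Lambda^{d-4}\,\frac{1 - b^{4-d}}{d-4} + O(r) ,$$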

with $K_d = 1/\left(2^{d-1}\pi^{d/2}\,\Gamma(d/2)\right)$, and the factors depending on the number n of components of the order parameter field result from the combinatorial analysis in counting the equivalent possibilities for “contracting” the fields $\phi_{k>}$, i.e. for evaluating the integrals over the large momenta. We note that here again, Eq. (7.4.85) applies. Linearizing equations (7.4.88) and (7.4.89) at the Gaussian fixed point $r^* = 0$, $u^* = 0$, one immediately finds the eigenvalues $y_\tau = 2$ and $y_u = 4 - d$. Then for $d > d_c = 4$, the nonlinearity ∝ u is seen to be irrelevant, and the mean field exponents are valid, as already surmised in Sect. 7.4.4. For $d < 4$ ($d_c = 4$ is the upper critical dimension), however, the fluctuations become relevant and each initial value $u \neq 0$ increases under the renormalization group transformation. In order to obtain the scaling behavior in this case, we must therefore search for a finite, non-trivial fixed point. This can be most easily done by introducing a differential flow, with $b = e^{\delta}$ and $\delta \to 0$,


making the number of RG steps effectively into a continuous variable, and studying the resulting differential recursion relations:

$$\frac{dr(\ell)}{d\ell} = 2r(\ell) + (n+2)\,u(\ell)\,K_d\Lambda^{d-2} - (n+2)\,r(\ell)\,u(\ell)\,K_d\Lambda^{d-4} , \tag{7.4.90}$$

$$\frac{du(\ell)}{d\ell} = (4-d)\,u(\ell) - (n+8)\,u(\ell)^2\,K_d\Lambda^{d-4} . \tag{7.4.91}$$

Now, a fixed point is defined by the condition $dr/d\ell = 0 = du/d\ell$.

Fig. 7.22. Flow of the effective coupling $u(\ell)$, determined by the right-hand side of Eq. (7.4.91), which is plotted here as the ordinate. Both for initial values $u_0 > u_c^*$ and $0 < u_0 < u_c^*$, one finds $u(\ell) \to u_c^*$ for $\ell \to \infty$

Figure 7.22 shows the flow of $u(\ell)$ corresponding to Eq. (7.4.91); for any initial value $u_0 \neq 0$, one finds that asymptotically, i.e. for $\ell \to \infty$, the non-trivial fixed point

$$u_c^* K_d = \frac{\epsilon\,\Lambda^{\epsilon}}{n+8} , \qquad \epsilon = 4 - d \tag{7.4.92}$$

is approached; this should determine the universal critical properties of the model. As in the real-space renormalization in Sect. 7.3, the RG transformation via momentum-shell elimination generates new interactions; for example, terms $\propto \phi^6$ and $\nabla^2\phi^4$, etc., which again influence the recursion relations for r and u in the succeeding steps. It turns out, however, that up to order $\epsilon^3$, these terms do not have to be taken into account.²⁰,²¹ The original assumption that u should be small, which justified the perturbation expansion, now means in light of Eq. (7.4.92) that the effective expansion parameter here is the deviation from the upper critical dimension, $\epsilon$. If one inserts (7.4.92) into Eq. (7.4.90) and includes terms up to $O(\epsilon)$, the result is

$$r_c^* = -\frac{n+2}{2}\,u_c^* K_d\,\Lambda^{d-2} = -\frac{(n+2)\,\epsilon}{2(n+8)}\,\Lambda^2 . \tag{7.4.93}$$
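The approach to the non-trivial fixed point can be illustrated by integrating Eq. (7.4.91) numerically. The sketch below is a minimal example under stated assumptions: the cutoff is set to Λ = 1, the explicit Euler step and the step size are arbitrary illustrative choices, and d = 3 (ε = 1) is used only for demonstration, although the one-loop equations are strictly justified for small ε. All function names are hypothetical.

```python
import math

def K_d(d):
    """Geometric factor K_d = 1 / (2^(d-1) * pi^(d/2) * Gamma(d/2))."""
    return 1.0 / (2.0 ** (d - 1) * math.pi ** (d / 2.0) * math.gamma(d / 2.0))

def fixed_point(n, d, cutoff=1.0):
    """Non-trivial fixed point (u*, r*) from Eqs. (7.4.92) and (7.4.93)."""
    eps = 4.0 - d
    u_star = eps * cutoff ** eps / ((n + 8) * K_d(d))
    r_star = -(n + 2) * eps * cutoff ** 2 / (2.0 * (n + 8))
    return u_star, r_star

def flow_u(u0, n, d, cutoff=1.0, dl=1e-3, steps=20000):
    """Integrate du/dl of Eq. (7.4.91) with explicit Euler steps (illustrative choice)."""
    eps = 4.0 - d
    coeff = (n + 8) * K_d(d) * cutoff ** (d - 4)
    u = u0
    for _ in range(steps):
        u += dl * (eps * u - coeff * u * u)
    return u

n, d = 1, 3.0
u_star, r_star = fixed_point(n, d)
# Initial couplings below and above u* both flow towards u*.
print(u_star, flow_u(0.1, n, d), flow_u(5.0, n, d))
```

Starting values on either side of $u_c^*$ end up at the same coupling, which is the behavior sketched in Fig. 7.22.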

The physical interpretation of this result is that fluctuations lead to a lowering of the transition temperature. With $\tau = r - r_c^*$, the differential form of the flow equation

$$\frac{d\tau(\ell)}{d\ell} = \tau(\ell)\left[2 - (n+2)\,u\,K_d\Lambda^{d-4}\right] \tag{7.4.94}$$


finally yields the eigenvalue $y_\tau = 2 - (n+2)\,\epsilon/(n+8)$ in the vicinity of the critical point (7.4.92). In the one-loop order which we have described here, $O(\epsilon)$, one therefore finds for the critical exponent ν from Eq. (7.3.41e)

$$\nu = \frac{1}{2} + \frac{n+2}{4(n+8)}\,\epsilon + O(\epsilon^2) . \tag{7.4.95}$$

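Using the result $\eta = O(\epsilon^2)$ and the scaling relations (7.3.41a–d), one obtains the following expressions (the difference from the result (7.4.35) of the Gaussian approximation is remarkable)

$$\alpha = \frac{4-n}{2(n+8)}\,\epsilon + O(\epsilon^2) , \tag{7.4.96}$$

$$\beta = \frac{1}{2} - \frac{3}{2(n+8)}\,\epsilon + O(\epsilon^2) , \tag{7.4.97}$$

$$\gamma = 1 + \frac{n+2}{2(n+8)}\,\epsilon + O(\epsilon^2) , \tag{7.4.98}$$

$$\delta = 3 + \epsilon + O(\epsilon^2) \tag{7.4.99}$$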

to first order in the expansion parameter $\epsilon = 4 - d$. The first non-trivial contribution to the exponent η appears in the two-loop order,

$$\eta = \frac{n+2}{2(n+8)^2}\,\epsilon^2 + O(\epsilon^3) . \tag{7.4.100}$$
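The truncated series (7.4.95) to (7.4.100) are simple enough to evaluate directly. The following sketch, with a hypothetical helper name, tabulates them with exact rational arithmetic; setting ε = 1 (d = 3) is an extrapolation used only for illustration.

```python
from fractions import Fraction

def epsilon_expansion_exponents(n, eps):
    """Critical exponents truncated as in Eqs. (7.4.95)-(7.4.100)."""
    n, eps = Fraction(n), Fraction(eps)
    return {
        "nu": Fraction(1, 2) + (n + 2) / (4 * (n + 8)) * eps,
        "alpha": (4 - n) / (2 * (n + 8)) * eps,
        "beta": Fraction(1, 2) - 3 / (2 * (n + 8)) * eps,
        "gamma": 1 + (n + 2) / (2 * (n + 8)) * eps,
        "delta": 3 + eps,
        "eta": (n + 2) / (2 * (n + 8) ** 2) * eps ** 2,
    }

# Ising universality class (n = 1), extrapolated to d = 3 (eps = 1).
ising = epsilon_expansion_exponents(1, 1)
for name, value in ising.items():
    print(name, value, float(value))
```

The first-order expressions satisfy α + 2β + γ = 2 identically in ε, whereas a relation such as γ = β(δ − 1) holds only up to terms of order ε², as expected for a truncated expansion.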

The universality of these results manifests itself in the fact that they depend only on the spatial dimension d and the number of components n of the order parameter, and not on the original “microscopic” Ginzburg–Landau parameters.

Remarks:

(i) At the upper critical dimension, $d_c = 4$, an inverse power law is obtained as the solution of Eq. (7.4.91) instead of an exponential behavior, leading to logarithmic corrections to the mean-field exponents.

(ii) We also mention that for long-range interactions which exhibit power-law behavior $\propto |x|^{-(d+\sigma)}$, the critical exponents contain an additional dependence on the parameter σ.

(iii) In addition to the ε-expansion, an expansion in terms of powers of 1/n is also possible. Here, the limit n → ∞ corresponds to the exactly solvable spherical model.²³ This 1/n-expansion indeed helps to clarify some general aspects, but its numerical accuracy is not very great, since precisely the small values of n are of practical interest.

23 Shang-Keng Ma, Modern Theory of Critical Phenomena, Benjamin, Reading, 1976.


The differential recursion relations of the form (7.4.90) and (7.4.91) also serve as a basis for the treatment of more subtle issues such as the calculation of the scaling functions or the treatment of crossover phenomena within the framework of the RG theory. Thus, for example, an anisotropic perturbation in the n-component Heisenberg model favoring m directions leads to a crossover from the O(n)-Heisenberg fixed point²⁴ to the O(m) fixed point.²⁵ The instability of the former is described by the crossover exponent. For small anisotropic disturbances, to be sure, the flow of the RG trajectory passes very close to the unstable fixed point. This means that one finds the behavior of an n-component system far from the transition temperature $T_c$, before the system is finally dominated by the anisotropic critical behavior. The crossover from one RG fixed point to another can be represented (and measured) by the introduction of effective exponents. These are defined as logarithmic derivatives of suitable physical quantities. Other important perturbations which were treated within the RG theory are, on the one hand, cubic terms. They reflect the underlying crystal structure and contribute terms of fourth order in the cartesian components of φ to the Ginzburg–Landau–Wilson functional. On the other hand, dipolar interactions lead to a perturbation which alters the harmonic part of the theory.

7.4.5.4 More Advanced Field-Theoretical Methods

If one wishes to discuss perturbation theory in orders higher than the first or second, Wilson’s momentum-shell renormalization scheme is not the best choice for practical calculations, in spite of its intuitively appealing properties. The technical reason for this is that the integrals in Fourier space involve nested momenta, which owing to the finite cutoff Λ are difficult to evaluate. It is then preferable to use a field-theoretical renormalization scheme with Λ → ∞. However, this leads to additional ultraviolet (UV) divergences of the integrals for $d \ge d_c$. At the critical dimension $d_c$, both ultraviolet and infrared (IR) singularities occur in combination in logarithmic form, $[\propto \log(\Lambda^2/r)]$. The idea is now to treat the UV divergences with the methods originally developed in quantum field theory and thus to arrive at the correct scaling behavior for the IR limit. In the formal implementation, one takes advantage of the fact that the original unrenormalized theory does not depend on the arbitrarily chosen renormalization point; as a consequence, one obtains the Callan–Symanzik or RG equations. These are partial differential equations which correspond to the differential flow equations in the Wilson scheme.

24 O(n) indicates invariance with respect to rotations in n-dimensional space, i.e. with respect to the group O(n).
25 See D. J. Amit, Field Theory, the Renormalization Group and Critical Phenomena, 2nd ed., World Scientific, Singapore, 1984, Chap. 5–3.


ε-expansions have been carried out up to the seventh order;²⁶ the series obtained is however only asymptotically convergent (the convergence radius of the perturbation expansion in u clearly must be zero, since u < 0 corresponds to an unstable theory). The combination of the results from expansions to such a high order with the divergent asymptotic behavior and Borel resummation techniques yields critical exponents with an impressive precision; cf. Table 7.4.

Table 7.4. The best estimates for the static critical exponents γ, ν, β, and η, for the O(n)-symmetric φ⁴ model in d = 2 and d = 3 dimensions, from ε-expansions up to high order in connection with Borel summation techniques.²⁶ For comparison, the exact Onsager results for the 2d-Ising model are also shown. The limiting case n = 0 describes the statistical mechanics of polymers.

                           γ                ν                  β                  η
d = 2   n = 0              1.39 ± 0.04      0.76 ± 0.03        0.065 ± 0.015      0.21 ± 0.05
        n = 1              1.73 ± 0.06      0.99 ± 0.04        0.120 ± 0.015      0.26 ± 0.05
        2d Ising (exact)   1.75             1                  0.125              0.25
d = 3   n = 0              1.160 ± 0.004    0.5885 ± 0.0025    0.3025 ± 0.0025    0.031 ± 0.003
        n = 1              1.239 ± 0.004    0.6305 ± 0.0025    0.3265 ± 0.0025    0.037 ± 0.003
        n = 2              1.315 ± 0.007    0.671 ± 0.005      0.3485 ± 0.0035    0.040 ± 0.003
        n = 3              1.390 ± 0.010    0.710 ± 0.007      0.368 ± 0.004      0.040 ± 0.003

7.5 Percolation

Scaling theories and renormalization group theories also play an important role in other branches of physics, whenever the characteristic length tends to infinity and structures occur on every length scale. Examples are percolation in the vicinity of the percolation threshold, polymers in the limit of a large number of monomers, the self-avoiding random walk, growth processes, and driven dissipative systems in the limit of slow growth rates (self-organized criticality). As an example of such a system which can be described in the language of critical phenomena, we will consider percolation.

7.5.1 The Phenomenon of Percolation

The phenomenon of percolation refers to problems of the following type:

(i) Consider a landscape with hills and valleys, which gradually fills up with water. When the water level is low, lakes are formed; as the level rises, some of the lakes join together until finally at a certain critical level (or critical area) of the water, a sea is formed which stretches from one end of the landscape to the other, with islands.

(ii) Consider a surface made of an electrical conductor in which circular holes are punched in a completely random arrangement (Fig. 7.23a). Denoting the fraction of remaining conductor area by p, we find for $p > p_c$ that there is still an electrical connection from one end of the surface to the other, while for $p < p_c$, the pieces of conducting area are reduced to islands and no longer form continuous bridges, so that the conductivity of this disordered medium is zero. One refers to $p_c$ as the percolation threshold. Above $p_c$, there is an infinite “cluster”; below this limit, there are only finite clusters, whose average radius however diverges on approaching $p_c$.

Examples (i) and (ii) represent continuum percolation. Theoretically, one can model such systems on a discrete d-dimensional lattice. In fact, such discrete models also occur in Nature, e.g. in alloys.

26 J. C. Le Guillou and J. Zinn-Justin, J. Phys. Lett. 46 L, 137 (1985)

Fig. 7.23. Examples of percolation (a) A perforated conductor (Swiss cheese model): continuum percolation; (b) site percolation; (c) bond percolation

(iii) Let us imagine a square lattice in which each site is occupied with a probability p and is unoccupied with the probability (1 − p). ‘Occupied’ can mean in this case that an electrical conductor is placed there and ‘unoccupied’ implies an insulator, or that a magnetic ion or a nonmagnetic ion is present, cf. Fig. 7.23b. Staying with the first interpretation, we find the following situation: for small p, the conductors form only small islands (electric current can flow only between neighboring sites) and the overall system is an insulator. As p increases, the islands (clusters) of conducting sites get larger. Two lattice sites belong to the same cluster when there is a connection between them via occupied nearest neighbors. For large p ($p \lesssim 1$) there are many conducting paths between the opposite edges and the system is a good conductor. At an intermediate concentration $p_c$, the percolation threshold or critical concentration, a connection is just formed, i.e. current can percolate


from one edge of the lattice to the other. The critical concentration separates the insulating phase below $p_c$ from the conducting phase above $p_c$. In the case of the magnetic example, at $p_c$ a ferromagnet is formed from a paramagnet, presuming that the temperature is sufficiently low. A further example is the occupation of the lattice sites by superconductors or normal conductors, in which case a transition from the normal conducting to the superconducting state takes place.

We have considered here some examples of site percolation, in which the lattice sites are stochastically occupied, Fig. 7.23b. Another possibility is that bonds between the lattice sites are stochastically present or are broken. One then refers to bond percolation (cf. Fig. 7.23c). Here, clusters made up of existing bonds occur; two bonds belong to the same cluster if there is a connection between them via existing bonds. Two examples of bond percolation are: (i) a macroscopic system with percolation properties can be produced from a stochastic network of resistors and connecting wires; (ii) a lattice of branched monomers can form bonds between individual monomers with a probability p. For $p < p_c$, finite macromolecules are formed, and for $p > p_c$, a network of chemical bonds extends over the entire lattice. This gelation process from a solution to a gel state is called the sol-gel transition (example: cooking or “denaturing” of an egg or a pudding); see Fig. 7.23.

Remarks:

(i) Questions related to percolation are also of importance outside physics, e.g. in biology. An example is the spread of an epidemic or a forest fire. An affected individual can infect a still-healthy neighbor within a given time step, with a probability p. The individual dies after one time step, but the infected neighbors could transmit the disease to other still living, healthy neighbors. Below the critical probability $p_c$, the epidemic dies out after a certain number of time steps; above this probability, it spreads further and further. In the case of a forest fire, one can think of a lattice which is occupied by trees with a probability p. When a tree burns, it ignites the neighboring trees within one time step and is itself reduced to ashes. For small values of p, the fire dies out after several time steps. For $p > p_c$, the fire spreads over the entire forest region, assuming that all the trees along one boundary were ignited. The remains consist of burned-out trees, empty lattice sites, and trees which were separated from their surroundings by a ring of empty sites so that they were never ignited. For $p > p_c$, the burned-out trees form an infinite cluster.

(ii) In Nature, disordered systems often occur. Percolation is a simple example of this, in which the occupation of the individual lattice sites is uncorrelated among the sites.

As emphasized above, these models for percolation can also be introduced on a d-dimensional lattice. The higher the spatial dimension, the more possible connected paths there are between sites; therefore, the percolation threshold pc decreases with increasing spatial dimension. The percolation threshold is also smaller for bond percolation than for site percolation, since a bond has more neighboring bonds than a lattice site has neighboring lattice sites (in a square lattice, 6 instead of 4). See Table 7.5.
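Site percolation on a finite square lattice is easy to explore numerically. The following is a minimal sketch, not a method taken from the text: it fills an L × L lattice at random and uses a breadth-first search to test whether occupied nearest neighbors connect the top row to the bottom row. The function names, the lattice size and the number of trials are illustrative assumptions.

```python
import random
from collections import deque

def spans(grid):
    """True if occupied sites connect the top row to the bottom row via nearest neighbors."""
    size = len(grid)
    seen = [[False] * size for _ in range(size)]
    queue = deque()
    for j in range(size):
        if grid[0][j]:
            seen[0][j] = True
            queue.append((0, j))
    while queue:
        i, j = queue.popleft()
        if i == size - 1:
            return True
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            a, c = i + di, j + dj
            if 0 <= a < size and 0 <= c < size and grid[a][c] and not seen[a][c]:
                seen[a][c] = True
                queue.append((a, c))
    return False

def spanning_fraction(size, p, trials, seed=0):
    """Fraction of random lattices (site occupation probability p) with a spanning cluster."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        grid = [[rng.random() < p for _ in range(size)] for _ in range(size)]
        hits += spans(grid)
    return hits / trials

for p in (0.4, 0.5, 0.6, 0.7, 0.8):
    print(p, spanning_fraction(size=32, p=p, trials=200))
```

On a finite lattice the spanning fraction rises smoothly rather than jumping at the threshold; the rise is expected to sharpen with growing lattice size around the site value of $p_c$ quoted for the square lattice in Table 7.5.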


Table 7.5. Percolation thresholds and critical exponents for some lattices

Lattice             p_c (site)   p_c (bond)   β       ν       γ
one-dimensional     1            1            –       1       1
square              0.592        1/2          5/36    4/3     43/18
simple cubic        0.311        0.248        0.417   0.875   1.795
Bethe lattice       1/(z−1)      1/(z−1)      1       1       1
d = 6 hypercubic    0.107        0.0942       1       1/2     1
d = 7 hypercubic    0.089        0.0787       1       1/2     1

The percolation transition, in contrast to thermal phase transitions, has a geometric nature. When p increases towards $p_c$, the clusters become larger and larger; at $p_c$, an infinite cluster is formed. Although this cluster already extends over the entire area, the fraction of sites which it contains is still zero at $p_c$. For $p > p_c$, more and more sites join the infinite cluster at the expense of the finite clusters, whose average radii decrease. For p = 1, all sites naturally belong to the infinite cluster. The behavior in the vicinity of $p_c$ exhibits many similarities to critical behavior in second-order phase transitions in the neighborhood of the critical temperature $T_c$. As discussed in Sect. 7.1, the magnetization increases below $T_c$ as $M \sim (T_c - T)^{\beta}$. In the case of percolation, the quantity corresponding to the order parameter is the probability $P_\infty$ that an occupied site (or an existing bond) belongs to the infinite cluster, Fig. 7.24. Accordingly,

$$P_\infty \propto \begin{cases} 0 & \text{for } p < p_c \\ (p - p_c)^{\beta} & \text{for } p > p_c . \end{cases} \tag{7.5.1}$$

Fig. 7.24. P∞ : order parameter (the strength of the inﬁnite clusters); S: average number of sites in a ﬁnite cluster

The correlation length ξ characterizes the linear dimension of the ﬁnite clusters (above and below pc ). More precisely, it is deﬁned as the average distance between two occupied lattice sites in the same ﬁnite cluster. In the vicinity


of $p_c$, ξ behaves as

$$\xi \sim |p - p_c|^{-\nu} . \tag{7.5.2}$$

A further variable is the average number of sites (bonds) in a finite cluster. It diverges as

$$S \sim |p - p_c|^{-\gamma} \tag{7.5.3}$$

and corresponds to the magnetic susceptibility χ; cf. Fig. 7.24. Just as in a thermal phase transition, one expects that the critical properties (e.g. the values of β, ν, γ) are universal, i.e. that they do not depend on the lattice structure or the kind of percolation (site, bond, continuum percolation). These critical properties do, however, depend on the spatial dimension of the system. The values of the exponents are collected in Table 7.5 for several different lattices. One can map the percolation problem onto an s-state-Potts model, whereby the limit s → 1 is to be taken.²⁷,²⁸ From this relation, it is understandable that the upper critical dimension for percolation is $d_c = 6$. The Potts model in its field-theoretical Ginzburg–Landau formulation contains a term of the form $\phi^3$; from it, following considerations analogous to the $\phi^4$ theory, the characteristic dimension $d_c = 6$ is derived. The critical exponents β, ν, γ describe the geometric properties of the percolation transition. Furthermore, there are also dynamic exponents, which describe the transport properties such as the electrical conductivity of the perforated circuit board or of the disordered resistance network. Also the magnetic thermodynamic transitions in the vicinity of the percolation threshold can be investigated.

7.5.2 Theoretical Description of Percolation

We consider clusters of size s, i.e. clusters containing s sites. We denote the number of such s-clusters divided by the number of all lattice sites by $n_s$, and call this the (normalized) cluster number. Then $s\,n_s$ is the probability that an arbitrarily chosen site will belong to a cluster of size s. Below the percolation threshold ($p < p_c$), we have

$$\sum_{s=1}^{\infty} s\,n_s = \frac{\text{number of all the occupied sites}}{\text{total number of lattice sites}} = p . \tag{7.5.4}$$

The number of clusters per lattice site, irrespective of their size, is

$$N_c = \sum_s n_s . \tag{7.5.5}$$

27 C. M. Fortuin and P. W. Kasteleyn, Physica 57, 536 (1972).
28 The s-state-Potts model is defined as a generalization of the Ising model, which corresponds to the 2-state-Potts model: at each lattice site there are s states Z. The energy contribution of a pair is $-J\delta_{Z,Z'}$, i.e. −J if both lattice sites are in the same state, and otherwise zero.


The average size (and also the average mass) of all finite clusters is

$$S = \sum_{s=1}^{\infty} s\,\frac{s\,n_s}{\sum_{s=1}^{\infty} s\,n_s} = \frac{1}{p}\sum_{s=1}^{\infty} s^2 n_s . \tag{7.5.6}$$

The following relation holds between the quantity $P_\infty$ defined before (7.5.1) and $n_s$: we consider an arbitrary lattice site. It is either empty, or occupied and belongs to a cluster of finite size, or it is occupied and belongs to the infinite cluster, that is $1 = 1 - p + \sum_{s=1}^{\infty} s\,n_s + p\,P_\infty$, and therefore

$$P_\infty = 1 - \frac{1}{p}\sum_s s\,n_s . \tag{7.5.7}$$

7.5.3 Percolation in One Dimension

We consider a one-dimensional chain in which every lattice site is occupied with the probability p. Since a single unoccupied site will interrupt the connection to the other end, i.e. an infinite cluster can be present only when all sites are occupied, we have $p_c = 1$. In this model we can thus study only the phase $p < p_c$. We can immediately compute the normalized number of clusters $n_s$ for this model. The probability that an arbitrarily chosen site belongs to a cluster of size s has the value $s\,p^s(1-p)^2$, since a series of s sites must be occupied (factor $p^s$) and the sites at the left and right boundaries must be unoccupied (factor $(1-p)^2$). Since the chosen site could be at any of the s locations within the cluster, the factor s occurs. From this and from the general considerations at the beginning of Sect. 7.5.2, it follows that:

$$n_s = p^s (1-p)^2 . \tag{7.5.8}$$

With this expression and starting from (7.5.6), we can calculate the average cluster size:

$$S = \frac{1}{p}\sum_{s=1}^{\infty} s^2 n_s = \frac{1}{p}\sum_{s=1}^{\infty} s^2 p^s (1-p)^2 = \frac{(1-p)^2}{p}\left(p\frac{d}{dp}\right)^2 \sum_{s=1}^{\infty} p^s = \frac{(1-p)^2}{p}\left(p\frac{d}{dp}\right)^2 \frac{p}{1-p} = \frac{1+p}{1-p} \quad \text{for } p < p_c . \tag{7.5.9}$$

The average cluster size diverges on approaching the percolation threshold $p_c = 1$ as $1/(1-p)$, i.e. in one dimension, the exponent introduced in (7.5.3) is γ = 1. We now define the radial correlation function g(r). Let the zero point be an occupied site; then g(r) gives the average number of occupied sites at a distance r which belong to the same cluster as the zero point. This is also equal to the probability that a particular site at the distance r is occupied and


belongs to the same cluster, multiplied by the number of sites at a distance r. Clearly, g(0) = 1. For a point to belong to the cluster requires that this point itself and all points lying between 0 and r be occupied, that is, the probability that the point r is occupied and belongs to the same cluster as 0 is $p^r$, and therefore we find

$$g(r) = 2\,p^r \quad \text{for } r \ge 1 . \tag{7.5.10}$$

The factor of 2 is required because in a one-dimensional lattice there are two points at a distance r. The correlation length is defined by

$$\xi^2 = \frac{\sum_{r=1}^{\infty} r^2 g(r)}{\sum_{r=1}^{\infty} g(r)} = \frac{\sum_{r=1}^{\infty} r^2 p^r}{\sum_{r=1}^{\infty} p^r} . \tag{7.5.11}$$

Analogously to the calculation in Eq. (7.5.9), one obtains

$$\xi^2 = \frac{1+p}{(1-p)^2} = \frac{1+p}{(p-p_c)^2} , \tag{7.5.11'}$$

i.e. here, the critical exponent of the correlation length is ν = 1. We can also write g(r) in the form

$$g(r) = 2\,e^{r\log p} = 2\,e^{-\sqrt{2}\,r/\xi} , \tag{7.5.10'}$$

where after the last equals sign, we have taken $p \approx p_c$, so that $\log p = \log(1 - (1-p)) \approx -(1-p)$. The correlation length characterizes the (exponential) decay of the correlation function. The average cluster size previously introduced can also be represented in terms of the radial correlation function

$$S = 1 + \sum_{r=1}^{\infty} g(r) . \tag{7.5.12}$$
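The one-dimensional results can be cross-checked by summing the series directly. The sketch below is a minimal numerical check with a hypothetical function name; the truncation `smax` is an illustrative choice that is adequate as long as $p^{s_{max}}$ is negligible.

```python
def one_dimensional_percolation(p, smax=5000):
    """Truncated sums for the 1d chain; smax is an illustrative cutoff."""
    n = [0.0] + [p ** s * (1.0 - p) ** 2 for s in range(1, smax + 1)]   # Eq. (7.5.8)
    g = [1.0] + [2.0 * p ** r for r in range(1, smax + 1)]              # Eq. (7.5.10)
    occupied = sum(s * n[s] for s in range(1, smax + 1))                # Eq. (7.5.4)
    S = sum(s * s * n[s] for s in range(1, smax + 1)) / occupied        # Eq. (7.5.6)
    S_from_g = 1.0 + sum(g[1:])                                         # Eq. (7.5.12)
    xi2 = sum(r * r * g[r] for r in range(1, smax + 1)) / sum(g[1:])    # Eq. (7.5.11)
    return occupied, S, S_from_g, xi2

p = 0.8
occupied, S, S_from_g, xi2 = one_dimensional_percolation(p)
print(occupied, S, S_from_g, xi2)
print((1 + p) / (1 - p), (1 + p) / (1 - p) ** 2)   # closed forms (7.5.9) and (7.5.11')
```

Both routes to the average cluster size, via the cluster numbers and via the correlation function, agree with the closed form $(1+p)/(1-p)$.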

We recall the analogous relation between the static susceptibility and the correlation function, which was derived in the chapter on ferromagnetism, Eq. (6.5.42). One can readily convince oneself that (7.5.12) together with (7.5.10) again leads to (7.5.9).

7.5.4 The Bethe Lattice (Cayley Tree)

A further exactly solvable model, which has the advantage over the one-dimensional model that it is defined also in the phase region $p > p_c$, is percolation on a Bethe lattice. The Bethe lattice is constructed as follows: from the lattice site at the origin, z (coordination number) branches spread out, at whose ends again lattice sites are located, from each of which again z − 1 new branches emerge, etc. (see Fig. 7.25 for z = 3).


Fig. 7.25. A Bethe lattice with the coordination number z = 3

The first shell of lattice sites contains z sites, the second shell contains $z(z-1)$ sites, and the lth shell contains $z(z-1)^{l-1}$ sites. The number of lattice sites increases exponentially with the distance from the center point, $\sim e^{l\log(z-1)}$, while in a d-dimensional Euclidean lattice, this number increases as $l^{d-1}$. This suggests that the critical exponents of the Bethe lattice would be the same as those of a usual Euclidean lattice for d → ∞. Another particular difference between the Bethe lattice and Euclidean lattices is the property that it contains only branches but no closed loops. This is the reason for its exact solvability. To start with, we calculate the radial correlation function g(l), which as before is defined as the average number of occupied lattice sites within the same cluster at a distance l from an arbitrary occupied lattice site. The probability that a particular lattice site at the distance l is occupied as well as all those between it and the origin has the value $p^l$. The number of all the sites in the shell l is $z(z-1)^{l-1}$; from this it follows that:

$$g(l) = z(z-1)^{l-1}\,p^l = \frac{z}{z-1}\,\big(p(z-1)\big)^l = \frac{z}{z-1}\,e^{l\log(p(z-1))} . \tag{7.5.13}$$

From the behavior of the correlation function for large l, one can read off the percolation threshold for the Bethe lattice. For $p(z-1) < 1$, there is an exponential decrease, and for $p(z-1) > 1$, g(l) diverges for l → ∞ and there is an infinite cluster, which must not be included in calculating the correlation function of the finite clusters. It follows from (7.5.13) for $p_c$ that

$$p_c = \frac{1}{z-1} . \tag{7.5.14}$$

For z = 2, the Bethe lattice becomes a one-dimensional chain, and thus $p_c = 1$. From (7.5.13) it is evident that the correlation length is

$$\xi \propto \frac{-1}{\log[p(z-1)]} = \frac{-1}{\log(p/p_c)} \sim \frac{1}{p_c - p} \tag{7.5.15}$$


for p in the vicinity of pc , i.e. ν = 1, as in one dimension29 . The same result is found if one deﬁnes ξ by means of (7.5.11). For the average cluster size one ﬁnds for p < pc S =1+

∞

g(l) =

l=1

pc (1 + p) pc − p

for p < pc ;

(7.5.16)

i.e. γ = 1. The strength of the infinite cluster $P_\infty$, i.e. the probability that an arbitrary occupied lattice site belongs to the infinite cluster, can be calculated in the following manner: the product $pP_\infty$ is the probability that the origin or some other point is occupied and that a connection between occupied sites up to infinity exists. We first compute the probability Q that an arbitrary site is not connected to infinity via a particular branch originating from it. This is equal to the probability that the site at the end of the branch is not occupied, that is (1 − p), plus the probability that this site is occupied but that none of the z − 1 branches which lead out from it connects to ∞, i.e.

\[ Q = 1 - p + p\,Q^{z-1} . \]

This is a determining equation for Q, which we shall solve for simplicity for a coordination number z = 3. The two solutions of the quadratic equation are Q = 1 and $Q = \frac{1-p}{p}$. The probability that the origin is occupied, but that no path leads to infinity, is on the one hand $p(1 - P_\infty)$ and on the other $p\,Q^z$, i.e. for z = 3: $P_\infty = 1 - Q^3$. For the first solution, Q = 1, we obtain $P_\infty = 0$, obviously relevant for $p < p_c$; and for the second solution

\[ P_\infty = 1 - \left(\frac{1-p}{p}\right)^3 , \tag{7.5.17} \]

for $p > p_c$. In the vicinity of $p_c = \frac{1}{2}$, the strength of the infinite cluster varies as

\[ P_\infty \propto (p - p_c) , \tag{7.5.18} \]

that is, β = 1. We will also obtain this result with Eq. (7.5.30) in a different manner.

$^{29}$ Earlier, it was speculated that hypercubic lattices of high spatial dimension have the same critical exponents as the Bethe lattice. The visible difference in ν seen in Table 7.5 is due to the fact that in the Bethe lattice, the topological (chemical) and in the hypercubic lattice the Euclidean distance was used. If one uses the chemical distance for the hypercubic lattice also, above d = 6, ν = 1 is likewise obtained. See Literature: A. Bunde and S. Havlin, p. 71.
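The self-consistency equation for Q can also be solved numerically for any z. The following is a minimal sketch, assuming a plain fixed-point iteration started from Q = 0 (a choice made here for illustration, not taken from the text); it reproduces (7.5.17) for z = 3.

```python
def q_fixed_point(p, z, iterations=10_000):
    """Iterate Q -> 1 - p + p * Q**(z-1), starting from Q = 0.

    Starting at 0, the iteration increases monotonically towards the
    smallest non-negative solution of the determining equation.
    """
    q = 0.0
    for _ in range(iterations):
        q = 1.0 - p + p * q ** (z - 1)
    return q

def p_infinity(p, z):
    """Strength of the infinite cluster, P_inf = 1 - Q**z."""
    return 1.0 - q_fixed_point(p, z) ** z

for p in (0.3, 0.6, 0.7, 0.9):
    closed = max(0.0, 1.0 - ((1.0 - p) / p) ** 3)   # (7.5.17) for z = 3
    print(p, p_infinity(p, 3), closed)
```

Below $p_c = 1/2$ the iteration converges to Q = 1 and hence $P_\infty = 0$; above it, the solution Q = (1 − p)/p is selected.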


7. Phase Transitions, Renormalization Group Theory, and Percolation

Now we will investigate the normalized cluster number $n_s$, which is also equal to the probability that a particular site belongs to a cluster of size s, divided by s. In one dimension, $n_s$ could readily be determined. In general, the probability for a cluster with s sites and t (empty) boundary points is $p^s(1-p)^t$. The perimeter t includes external and internal boundary points of the cluster. For general lattices, such as the square lattice, there are various values of t belonging to one and the same value of s, depending on the shape of the cluster; the more stretched out the cluster, the larger is t, and the more nearly spherical the cluster, the smaller is t. In a square lattice, there are two clusters having the size 3, a linear and a bent cluster. The associated values of t are 8 and 7, and the numbers of orientations on the lattice are 2 and 4. For general lattices, the quantity $g_{st}$ must therefore be introduced; it gives the number of clusters of size s and boundary t. Then the general expression for $n_s$ is

\[ n_s = \sum_t g_{st}\, p^s (1-p)^t . \tag{7.5.19} \]

For arbitrary lattices, a determination of $g_{st}$ is in general not possible. For the Bethe lattice, however, there is a unique connection between the size s of the cluster and the number of its boundary points t. A cluster of size 1 has t = z, and a cluster of size s = 2 has t = 2z − 2. In general, a cluster of size s has z − 2 more boundary points than a cluster of size s − 1, i.e.

\[ t(s) = z + (s-1)(z-2) = 2 + s(z-2) . \]

Thus, for the Bethe lattice,

\[ n_s = g_s\, p^s (1-p)^{2+(z-2)s} , \tag{7.5.20} \]

where $g_s$ is the number of configurations of clusters of size s. In order to avoid the calculation of $g_s$, we will refer $n_s(p)$ to the distribution $n_s(p_c)$ at $p_c$. We now wish to investigate the behavior of $n_s$ in the vicinity of $p_c = (z-1)^{-1}$ as a function of the cluster size, and separate off the distribution at $p_c$,

\[ n_s(p) = n_s(p_c) \left[\frac{1-p}{1-p_c}\right]^2 \left[\frac{p(1-p)^{z-2}}{p_c(1-p_c)^{z-2}}\right]^s ; \tag{7.5.21} \]

we then expand around $p = p_c$:

\[ n_s(p) = n_s(p_c) \left[\frac{1-p}{1-p_c}\right]^2 \left[1 - \frac{(p-p_c)^2}{2p_c^2(1-p_c)} + O\big((p-p_c)^3\big)\right]^s = n_s(p_c)\, e^{-cs} , \tag{7.5.22} \]

with $c = -\log\left[1 - \frac{(p-p_c)^2}{2p_c^2(1-p_c)}\right] \propto (p-p_c)^2$. This means that the number of clusters of size s decreases exponentially. The second factor in (7.5.22) depends only on the combination $(p-p_c)^{1/\sigma} s$,


with σ = 1/2. The exponent σ determines how rapidly the number of clusters decreases with increasing size s. At $p_c$, the s-dependence of $n_s$ arises only from the prefactor $n_s(p_c)$. In analogy to critical points, we assume that $n_s(p_c)$ is a pure power law: if ξ is the only length scale, and it is infinite at $p_c$, then at $p_c$ there can be no characteristic lengths, cluster sizes, etc. That is, $n_s(p_c)$ can have only the form

\[ n_s(p_c) \sim s^{-\tau} . \tag{7.5.23} \]
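The quadratic dependence of the decay constant c on p − $p_c$ in (7.5.22) can be checked numerically. The sketch below compares the exact logarithm of the s-th-power factor in (7.5.21) with the leading quadratic term; the helper names are chosen here for illustration.

```python
import math

def minus_log_ratio(p, z):
    """-log of [p (1-p)^(z-2)] / [p_c (1-p_c)^(z-2)], the factor raised to the power s in (7.5.21)."""
    p_c = 1.0 / (z - 1)
    return math.log(p_c * (1 - p_c) ** (z - 2)) - math.log(p * (1 - p) ** (z - 2))

def c_quadratic(p, z):
    """Leading term of c from (7.5.22): (p - p_c)^2 / (2 p_c^2 (1 - p_c))."""
    p_c = 1.0 / (z - 1)
    return (p - p_c) ** 2 / (2 * p_c ** 2 * (1 - p_c))

z = 3
for dp in (0.05, 0.01, 0.002):
    p = 0.5 + dp
    print(dp, minus_log_ratio(p, z), c_quadratic(p, z))
```

The two columns approach each other as p → $p_c$, consistent with c ∝ (p − p_c)² and hence σ = 1/2.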

The complete function (7.5.22) is then of the form

\[ n_s(p) = s^{-\tau} f\big((p-p_c)^{1/\sigma} s\big) , \tag{7.5.24} \]

and it is a homogeneous function of s and $(p - p_c)$. We can relate the exponent τ to already known exponents: the average cluster size is, from Eq. (7.5.6),

\[ S = \frac{1}{p}\sum_s s^2 n_s(p) \propto \sum_s s^{2-\tau} e^{-cs} \propto \int_1^\infty ds\, s^{2-\tau} e^{-cs} = c^{\tau-3}\int_c^\infty z^{2-\tau} e^{-z}\, dz . \tag{7.5.25} \]

For τ < 3, the integral exists, even when its lower limit goes to zero; then

\[ S \sim c^{\tau-3} = (p-p_c)^{\frac{\tau-3}{\sigma}} , \tag{7.5.26} \]

from which, according to (7.5.3), it follows that

\[ \gamma = \frac{3-\tau}{\sigma} . \tag{7.5.27} \]
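The scaling S ∼ c^{τ−3} in (7.5.26) can be illustrated by evaluating the sum in (7.5.25) for two small values of c and estimating the exponent from their ratio. This is a rough sketch with τ = 5/2 (the Bethe-lattice value) and a truncation of the sum chosen here for convenience.

```python
import math

def cluster_sum(c, tau=2.5):
    """Sum of s^(2-tau) * exp(-c s), truncated where exp(-c s) has dropped to about e^-50."""
    s_max = int(50 / c)
    return sum(s ** (2 - tau) * math.exp(-c * s) for s in range(1, s_max + 1))

c1, c2 = 1e-3, 1e-4
slope = math.log(cluster_sum(c2) / cluster_sum(c1)) / math.log(c2 / c1)
print(slope)   # expected to lie close to tau - 3 = -0.5
```

The small deviation from −1/2 comes from the lower end of the sum, where replacing the sum by an integral is only approximate; it shrinks as c → 0.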

Since for the Bethe lattice, γ = 1 and σ = 1/2, we find τ = 5/2. From (7.5.24), using the general relation (7.5.7), one can also determine $P_\infty$. While the factor $s^2$ in (7.5.25) was sufficient to make the integral converge at its lower limit, this is not the case in (7.5.7). Therefore, we first write (7.5.7) in the form

\[ P_\infty = 1 - \frac{1}{p}\sum_s s\,\big[n_s(p) - n_s(p_c)\big] - \frac{1}{p}\sum_s s\, n_s(p_c) = \frac{1}{p}\sum_s s\,\big[n_s(p_c) - n_s(p)\big] + 1 - \frac{p_c}{p} , \tag{7.5.28} \]

where

\[ P_\infty(p_c) = 0 = 1 - \frac{1}{p_c}\sum_s s\, n_s(p_c) \]

has been used. Now the first term in (7.5.28) can be replaced by an integral:

\[ P_\infty = \text{const.}\; c^{\tau-2}\int_c^\infty z^{1-\tau}\big[1 - e^{-z}\big]\, dz + \frac{p-p_c}{p} = \ldots\, c^{\tau-2} + \frac{p-p_c}{p} . \tag{7.5.29} \]

From this, we find for the exponent defined in Eq. (7.5.1)

\[ \beta = \frac{\tau-2}{\sigma} . \tag{7.5.30} \]

For the Bethe lattice, one finds once again β = 1, in agreement with (7.5.18). In the Bethe lattice, the first term in (7.5.29) also has the form $p - p_c$, while in other lattices, the first term, $(p - p_c)^\beta$, predominates relative to the second due to β < 1. In (7.5.5), we also introduced the average number of clusters per lattice site, whose critical percolation behavior is characterized by an exponent α via

\[ N_c \equiv \sum_s n_s \sim |p - p_c|^{2-\alpha} . \tag{7.5.31} \]

That is, this quantity plays a role analogous to that of the free energy in thermal phase transitions. We note that in the case of percolation there are no interactions, and the free energy is determined merely by the entropy. Again inserting (7.5.24) for the cluster number into (7.5.31), we find

\[ 2 - \alpha = \frac{\tau-1}{\sigma} , \tag{7.5.32} \]

which leads to α = −1 for the Bethe lattice. In summary, the critical exponents for the Bethe lattice are

\[ \beta = 1 , \quad \gamma = 1 , \quad \alpha = -1 , \quad \nu = 1 , \quad \tau = 5/2 , \quad \sigma = 1/2 . \tag{7.5.33} \]
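The relations (7.5.27), (7.5.30), and (7.5.32) express α, β, and γ through τ and σ alone. As a small illustration, the sketch below evaluates them in exact rational arithmetic for the Bethe-lattice values; the function name is hypothetical.

```python
from fractions import Fraction

def exponents_from_tau_sigma(tau, sigma):
    """Apply gamma = (3-tau)/sigma, beta = (tau-2)/sigma, 2-alpha = (tau-1)/sigma."""
    gamma = (3 - tau) / sigma
    beta = (tau - 2) / sigma
    alpha = 2 - (tau - 1) / sigma
    return {"alpha": alpha, "beta": beta, "gamma": gamma}

bethe = exponents_from_tau_sigma(Fraction(5, 2), Fraction(1, 2))
print(bethe)
# The three relations together imply alpha + 2*beta + gamma = 2 for any tau and sigma.
print(bethe["alpha"] + 2 * bethe["beta"] + bethe["gamma"])
```

Adding the three relations shows that α + 2β + γ = 2 holds identically, independent of the values of τ and σ.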

7.5.5 General Scaling Theory

In the preceding section, the exponents for the Bethe lattice (Cayley tree) were calculated. In the process, we made some use of a scaling assumption (7.5.24). We will now generalize that assumption and derive the consequences which follow from it. We start with the general scaling hypothesis

\[ n_s(p) = s^{-\tau} f_\pm\big(|p-p_c|^{1/\sigma} s\big) , \tag{7.5.34} \]


where ± refers to $p \gtrless p_c$.$^{30}$ The relations (7.5.27), (7.5.30), and (7.5.32), which contain only the exponents α, β, γ, σ, τ, also hold for the general scaling hypothesis. The scaling relation for the correlation length and other characteristics of the extension of the finite clusters must be derived once more. The correlation length is the root mean square distance between all the occupied sites within the same finite cluster. For a cluster with s occupied sites, the mean square distance between all pairs is

\[ R_s^2 = \frac{1}{s^2}\sum_{i=1}^{s}\sum_{j=1}^{i} (x_i - x_j)^2 . \]

The correlation length ξ is obtained by averaging over all clusters:

\[ \xi^2 = \frac{\sum_{s=1}^{\infty} R_s^2\, s^2 n_s}{\sum_{s=1}^{\infty} s^2 n_s} . \tag{7.5.35} \]

The quantity $\frac{1}{2} s^2 n_s$ is equal to the number of pairs in clusters of size s, i.e. proportional to the probability that a pair (in the same cluster) belongs to a cluster of size s. The mean square cluster radius is given by

\[ R^2 = \frac{\sum_{s=1}^{\infty} R_s^2\, s\, n_s}{\sum_{s=1}^{\infty} s\, n_s} , \tag{7.5.36} \]

since $s\,n_s$ is the probability that an occupied site belongs to an s-cluster. The cluster radius increases with cluster size according to

\[ R_s \sim s^{1/d_f} , \tag{7.5.37} \]

where $d_f$ is the fractal dimension. Then it follows from (7.5.35) that

\[ \xi^2 \sim \frac{\sum_{s=1}^{\infty} s^{\frac{2}{d_f}+2-\tau} f_\pm\big(|p-p_c|^{1/\sigma} s\big)}{\sum_{s=1}^{\infty} s^{2-\tau} f_\pm\big(|p-p_c|^{1/\sigma} s\big)} \sim |p-p_c|^{-\frac{2}{d_f \sigma}} , \qquad 2 < \tau < 2.5 , \]

\[ \nu = \frac{1}{d_f\, \sigma} = \frac{\tau-1}{d\,\sigma} , \]

and from (7.5.36),

\[ R^2 \sim \sum_{s=1}^{\infty} s^{\frac{2}{d_f}+1-\tau} f_\pm\big(|p-p_c|^{1/\sigma} s\big) \sim |p-p_c|^{-2\nu+\beta} . \]

$^{30}$ At the percolation threshold $p = p_c$, the distribution of clusters is a power law $n_s(p_c) = s^{-\tau} f_\pm(0)$. The cutoff function $f_\pm(x)$ goes to zero for $x \gg 1$, for example exponentially as in (7.5.22). The quantity $s_{\max} = |p-p_c|^{-1/\sigma}$ refers to the largest cluster. Clusters of size $s \ll s_{\max}$ are also distributed according to $s^{-\tau}$ for $p \neq p_c$, and for $s \gg s_{\max}$, $n_s(p)$ vanishes.
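The two expressions for ν above fix the fractal dimension once τ, σ, and d are known. As an illustration, the sketch below uses the commonly quoted two-dimensional values τ = 187/91 and σ = 36/91; these numbers are not given in this section and are inserted here as an assumption for the example.

```python
from fractions import Fraction

def nu_and_fractal_dimension(tau, sigma, d):
    """nu = (tau - 1)/(d sigma) and d_f = 1/(sigma nu), from the scaling relations above."""
    nu = (tau - 1) / (d * sigma)
    d_f = 1 / (sigma * nu)
    return nu, d_f

# Assumed literature values for two-dimensional percolation (not from this section).
nu_2d, df_2d = nu_and_fractal_dimension(Fraction(187, 91), Fraction(36, 91), 2)
print(nu_2d, df_2d)
```

With these inputs the relations give ν = 4/3, the exact two-dimensional value quoted in Sect. 7.5.6, and a fractal dimension below the Euclidean dimension d = 2.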


7.5.5.1 The Duality Transformation and the Percolation Threshold

The computation of $p_c$ for bond percolation on a square lattice can be carried out by making use of a duality transformation. The definition of the dual lattice is illustrated in Fig. 7.26. The lattice points of the dual lattice are defined by the centers of the unit cells of the lattice. A bond in the dual lattice is placed wherever it does not cross a bond of the lattice; i.e. the probability for a bond in the dual lattice is

\[ q = 1 - p . \]

In the dual lattice, there is likewise a bond percolation problem. For $p < p_c$, there is no infinite cluster on the lattice; however, there is an infinite cluster on the dual lattice. There is a path from one end of the dual lattice to the other which cuts no bonds on the lattice; thus $q > p_c$. For $p \to p_c^-$ from below, $q \to p_c^+$ arrives at the percolation threshold from above, i.e.

\[ p_c = 1 - p_c . \]

Thus, $p_c = \frac{1}{2}$. This result is exact for bond percolation.

Fig. 7.26. A lattice and its dual lattice. Left side: A lattice with bonds and the dual lattice. Right side: Showing also the bonds in the dual lattice

Remarks: (i) By means of similar considerations, one finds also that the percolation threshold for site percolation on a triangular lattice is given by $p_c = \frac{1}{2}$. (ii) For the two-dimensional Ising model, also, the transition temperatures for a series of lattice structures were already known from duality transformations before its exact solution had been achieved.

7.5.6 Real-Space Renormalization Group Theory

We now discuss a real-space renormalization-group transformation, which allows the approximate determination of $p_c$ and the critical exponents. In the decimation transformation shown in Fig. 7.27 for a square lattice, every other lattice site is eliminated; this leads again to a square lattice. In the new lattice, a bond is placed between two remaining sites if at least one connection via two bonds existed on the original lattice (see Fig. 7.27). The bond configurations which lead to formation of a bond (shown as dashed lines) in the decimated lattice are indicated in Fig. 7.28. Below, the probability for these configurations is given.

Fig. 7.27. A lattice and a decimated lattice

Fig. 7.28. Bond configurations which lead to a bond (dashed) on the decimated lattice

From the rules shown in Fig. 7.28, we find for the probability for the existence of a bond on the decimated lattice

\[ p' = p^4 + 4p^3(1-p) + 2p^2(1-p)^2 = 2p^2 - p^4 . \tag{7.5.38} \]

From this transformation law$^{31}$, one obtains the fixed-point equation $p^* = 2p^{*2} - p^{*4}$. It has the solutions $p^* = 0$ and $p^* = 1$, which correspond to the high- and low-temperature fixed points for phase transitions; and in addition, the two fixed points $p^* = \frac{-1 \pm \sqrt{5}}{2}$, of which only $p^* = \frac{\sqrt{5}-1}{2} = 0.618\ldots$ is acceptable. This value of the percolation threshold differs from the exact value found in the preceding section, $\frac{1}{2}$. The reasons for this are: (i) sites which were connected on the original lattice may not be connected on the decimated lattice; (ii) different bonds on the decimated lattice are no longer uncorrelated, since the existence of a bond on the original lattice can be responsible for the occurrence of several bonds on the decimated lattice. The linearization of the recursion relation around the fixed point yields ν = 0.817 for the exponent of the correlation length.

The treatment of site percolation on a triangular lattice in two dimensions is simplest. The lattice points of a triangle are combined into a cell. This cell is counted as occupied if all three sites are occupied, or if two sites are occupied and one is empty, since in both cases there is a path through the cell. For all other configurations (only one site occupied or none occupied), the cell is unoccupied. For the triangular lattice$^{32}$, one thus obtains as the recursion relation

\[ p' = p^3 + 3p^2(1-p) , \tag{7.5.39} \]

with the fixed points $p^* = 0, 1, \frac{1}{2}$. This RG transformation thus yields $p_c = \frac{1}{2}$ for the percolation threshold, which is identical with the exact value (see remark (i) above). The linearization of the RG transformation around the fixed point yields the following result for the exponent ν of the correlation length:

\[ \nu = \frac{\log\sqrt{3}}{\log\frac{3}{2}} = 1.3547 . \]

This is nearer to the result obtained by series expansion, ν = 1.34, as well as to the exact result, 4/3, than the result for the square lattice (see the remark on universality following Eq. (7.5.3)).

$^{31}$ A. P. Young and R. B. Stinchcombe, J. Phys. C: Solid State Phys. 8, L535 (1975).
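Both recursion relations, (7.5.38) and (7.5.39), can be analyzed numerically in the same way: locate the nontrivial fixed point and linearize around it. The sketch below assumes the standard relation ν = log b / log λ, with λ = dp′/dp at the fixed point and b the length rescaling factor; b = √2 for the square-lattice decimation is an assumption made here (b = √3 for the triangular cell matches the formula quoted above), and all function names are illustrative.

```python
import math

def square_bond_map(p):
    """Recursion relation (7.5.38) for the decimated square lattice."""
    return 2 * p**2 - p**4

def triangular_site_map(p):
    """Recursion relation (7.5.39) for the triangular cell."""
    return p**3 + 3 * p**2 * (1 - p)

def fixed_point(rg_map, lo=0.2, hi=0.9, tol=1e-14):
    """Bisection on rg_map(p) - p; assumes exactly one sign change in [lo, hi]."""
    f_lo = rg_map(lo) - lo
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = rg_map(mid) - mid
        if (f_mid < 0) == (f_lo < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def nu_from_map(rg_map, p_star, b, h=1e-6):
    """nu = log(b) / log(lambda), with lambda the slope of the map at the fixed point."""
    slope = (rg_map(p_star + h) - rg_map(p_star - h)) / (2 * h)
    return math.log(b) / math.log(slope)

p_sq = fixed_point(square_bond_map)
p_tri = fixed_point(triangular_site_map)
nu_sq = nu_from_map(square_bond_map, p_sq, math.sqrt(2))
nu_tri = nu_from_map(triangular_site_map, p_tri, math.sqrt(3))
print(p_sq, nu_sq)
print(p_tri, nu_tri)
```

The bracket [0.2, 0.9] deliberately excludes the trivial fixed points at 0 and 1, so the bisection picks out the unstable fixed point that plays the role of $p_c$.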

7.5.6.1 Definition of the Fractal Dimension

In a fractal object, the mass behaves as a function of the length L of a d-dimensional Euclidean section as

\[ M(L) \sim L^{d_f} , \]

and thus the density is

\[ \rho(L) = \frac{M(L)}{L^d} \sim L^{d_f - d} . \]

An alternative definition of $d_f$ is obtained from the number of hypercubes $N(L_m, \delta)$ which one requires to cover the fractal structure. We take the side length of the hypercubes to be δ, and the hypercube which contains the whole cluster to have the side length $L_m$:

\[ N(L_m, \delta) = \left(\frac{L_m}{\delta}\right)^{d_f} , \]

i.e.

\[ d_f = -\lim_{\delta \to 0} \frac{\log N(L_m, \delta)}{\log \delta} . \]
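The covering definition can be tried out on a simple deterministic fractal. The sketch below uses a Sierpinski carpet on a finite grid, an example chosen here and not taken from the text, and estimates $d_f$ from the box counts at the smallest and largest box size instead of taking the limit δ → 0.

```python
import math

def carpet_cells(level):
    """Occupied cells of a Sierpinski carpet on a 3^level x 3^level grid.

    A cell is removed if, at any scale, both base-3 digits of its
    coordinates equal 1 (the centre square of a 3x3 block).
    """
    n = 3 ** level
    cells = []
    for x in range(n):
        for y in range(n):
            i, j, filled = x, y, True
            while i > 0 or j > 0:
                if i % 3 == 1 and j % 3 == 1:
                    filled = False
                    break
                i //= 3
                j //= 3
            if filled:
                cells.append((x, y))
    return cells

def box_count(cells, box):
    """Number of boxes of side `box` that contain at least one occupied cell."""
    return len({(x // box, y // box) for x, y in cells})

level = 4
cells = carpet_cells(level)
boxes = [3 ** k for k in range(level)]            # delta = 1, 3, 9, 27
counts = [box_count(cells, b) for b in boxes]
# N ~ delta^(-d_f): slope of log N against log(1/delta) between the extreme scales
d_f = math.log(counts[0] / counts[-1]) / math.log(boxes[-1] / boxes[0])
print(counts, d_f)
```

For this construction each coarsening step by a factor of 3 reduces the count by a factor of 8, so the estimate equals log 8 / log 3 ≈ 1.89, between the dimensions of a line and a plane.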

$^{32}$ P. J. Reynolds, W. Klein, and H. E. Stanley, J. Phys. C: Solid State Phys. 10, L167 (1977).

Literature

D. J. Amit, Field Theory, the Renormalization Group, and Critical Phenomena, 2nd ed., World Scientific, Singapore 1984
P. Bak, C. Tang, and K. Wiesenfeld, Phys. Rev. Lett. 59, 381 (1987)
K. Binder, Rep. Progr. Phys. 60, 487 (1997)
J. J. Binney, N. J. Dowrick, A. J. Fisher, and M. E. J. Newman, The Theory of Critical Phenomena, 2nd ed., Oxford University Press, New York 1993
M. J. Buckingham and W. M. Fairbank, in: C. J. Gorter (Ed.), Progress in Low Temperature Physics, Vol. III, 80–112, North Holland Publishing Company, Amsterdam 1961
A. Bunde and S. Havlin, in: A. Bunde, S. Havlin (Eds.), Fractals and Disordered Systems, 51, Springer, Berlin 1991
Critical Phenomena, Lecture Notes in Physics 54, Ed. J. Brey and R. B. Jones, Springer, Sitges, Barcelona 1976
M. C. Cross and P. C. Hohenberg, Rev. Mod. Phys. 65, 851–1112 (1994)
P. G. De Gennes, Scaling Concepts in Polymer Physics, Cornell University Press, Ithaca, NY 1979
C. Domb and M. S. Green, Phase Transitions and Critical Phenomena, Academic Press, London 1972–1976
C. Domb and J. L. Lebowitz (Eds.), Phase Transitions and Critical Phenomena, Vols. 7–15, Academic Press, London 1983–1992
B. Drossel and F. Schwabl, Phys. Rev. Lett. 69, 1629 (1992)
J. W. Essam, Rep. Prog. Phys. 43, 843 (1980)
R. A. Ferrell, N. Menyhárd, H. Schmidt, F. Schwabl, and P. Szépfalusy, Ann. Phys. (New York) 47, 565 (1968)
M. E. Fisher, Rep. Prog. Phys. 30, 615–730 (1967)
M. E. Fisher, Rev. Mod. Phys. 46, 597 (1974)
E. Frey and F. Schwabl, Adv. Phys. 43, 577–683 (1994)
B. I. Halperin and P. C. Hohenberg, Phys. Rev. 177, 952 (1969)
H. J. Jensen, Self-Organized Criticality, Cambridge University Press, Cambridge 1998
Shang-Keng Ma, Modern Theory of Critical Phenomena, Benjamin, Reading, Mass. 1976
S. Ma, in: C. Domb and M. S. Green (Eds.), Phase Transitions and Critical Phenomena, Vol. 6, 249–292, Academic Press, London 1976
T. Niemeijer and J. M. J. van Leeuwen, in: C. Domb and M. S. Green (Eds.), Phase Transitions and Critical Phenomena, Vol. 6, 425–505, Academic Press, London 1976
G. Parisi, Statistical Field Theory, Addison–Wesley, Redwood 1988
A. Z. Patashinskii and V. L. Pokrovskii, Fluctuation Theory of Phase Transitions, Pergamon Press, Oxford 1979
P. Pfeuty and G. Toulouse, Introduction to the Renormalization Group and to Critical Phenomena, John Wiley, London 1977
C. N. R. Rao and K. J. Rao, Phase Transitions in Solids, McGraw Hill, New York 1978
F. Schwabl and U. C. Täuber, Phase Transitions: Renormalization and Scaling, in Encyclopedia of Applied Physics, Vol. 13, 343, VCH (1995)
H. E. Stanley, Introduction to Phase Transitions and Critical Phenomena, Clarendon Press, Oxford 1971
D. Stauffer and A. Aharony, Introduction to Percolation Theory, Taylor and Francis, London and Philadelphia 1985
J. M. J. van Leeuwen in Fundamental Problems in Statistical Mechanics III, Ed. E. G. D. Cohen, North Holland Publishing Company, Amsterdam 1975
K. G. Wilson and J. Kogut, Phys. Rept. 12C, 76 (1974)
K. G. Wilson, Rev. Mod. Phys. 47, 773 (1975)
J. Zinn-Justin, Quantum Field Theory and Critical Phenomena, 3rd edition, Clarendon Press, Oxford 1996

Problems for Chapter 7

7.1 A generalized homogeneous function fulfills the relation

\[ f(\lambda^{a_1} x_1, \lambda^{a_2} x_2) = \lambda^{a_f} f(x_1, x_2) . \]

Show that (a) the partial derivatives $\frac{\partial^j}{\partial x_1^j}\frac{\partial^k}{\partial x_2^k} f(x_1, x_2)$ and (b) the Fourier transform $g(k_1, x_2) = \int d^d x_1\, e^{i k_1 x_1} f(x_1, x_2)$ of a generalized homogeneous function are likewise homogeneous functions.

7.2 Derive the relations (7.3.13′) for A′, K′, L′, and M′. Include in the starting model in addition an interaction L between the second-nearest neighbors. Compute the recursion relation to leading order in K and L, i.e. up to $K^2$ and L. Show that (7.3.15a,b) results.

7.3 What is the value of δ for the two-dimensional decimation transformation from Sect. 7.3.3?

7.4 Show, by Fourier transformation of the susceptibility $\chi(q) = \frac{1}{q^{2-\eta}}\hat\chi(q\xi)$, that the correlation function assumes the form

\[ G(x) = \frac{1}{|x|^{d-2+\eta}}\,\hat G(|x|/\xi) . \]

7.5 Confirm Eq. (7.4.35).

7.6 Show that

\[ m(x) = m_0 \tanh\frac{x - x_0}{2\xi_-} \]

is a solution of the Ginzburg–Landau equation (7.4.11). Calculate the free energy of the domain walls which it describes.

7.7 Tricritical phase transition point. A tricritical phase transition point is described by the following Ginzburg–Landau functional:

\[ F[\phi] = \int d^d x\, \big\{ c(\nabla\phi)^2 + a\phi^2 + v\phi^6 - h\phi \big\} \quad \text{with} \quad a = a'\tau , \quad \tau = \frac{T - T_c}{T_c} , \quad v \ge 0 . \]

Determine the uniform stationary solution $\phi_{st}$ with the aid of the variational derivative ($\frac{\delta F}{\delta\phi} = 0$) for h = 0 and the associated tricritical exponents $\alpha_t$, $\beta_t$, $\gamma_t$ and $\delta_t$.

7.8 Consider the extended Ginzburg–Landau functional

\[ F[\phi] = \int d^d x\, \big\{ c(\nabla\phi)^2 + a\phi^2 + u\phi^4 + v\phi^6 - h\phi \big\} . \]

(a) Determine the critical exponents α, β, γ and δ for u > 0 in analogy to problem 7.7. They take on the same values as in the $\phi^4$ model (see Sect. 4.6); the term ∼ $\phi^6$ is irrelevant, i.e. it yields only corrections to the scaling behavior of the $\phi^4$ model. Investigate the “crossover” of the tricritical behavior for h = 0 at small u. Consider the crossover function $\tilde m(x)$, which is defined as follows:

\[ \phi_{eq}(u, \tau) = \phi_t(\tau)\cdot\tilde m(x) \quad \text{with} \quad \phi_t(\tau) = \phi_{eq}(u = 0, \tau) \sim \tau^{\beta_t} , \quad x = \frac{u}{\sqrt{3|a|v}} . \]

(b) Now investigate the case u < 0, h = 0. Here, a first-order phase transition occurs; at $T_c$, the absolute minimum of F changes from φ = 0 to φ = $\phi_0$. Calculate the shift of the transition temperature $T_c - T_0$ and the height of the jump in the order parameter $\phi_0$. Critical exponents can also be defined for the approach to the tricritical point by variation of u:

\[ \phi_0 \sim |u|^{\beta_u} , \qquad T_c - T_0 \sim |u|^{1/\psi} . \]

Give expressions for $\beta_u$ and the “shift exponent” ψ.

(c) Calculate the second-order phase transition lines for u < 0 and h ≠ 0 by deriving a parameter representation from the conditions

\[ \frac{\partial^2 F}{\partial\phi^2} = 0 = \frac{\partial^3 F}{\partial\phi^3} . \]

(d) Show that the free energy in the vicinity of the tricritical point obeys a generalized scaling law

\[ F[\phi_{eq}] = |\tau|^{2-\alpha_t}\, \hat f\!\left( \frac{u}{|\tau|^{\phi_t}} , \frac{h}{|\tau|^{\delta_t}} \right) \]

by inserting the crossover function found in (a) into F ($\phi_t$ is called the “crossover exponent”). Show that the scaling relations

\[ \delta = 1 + \frac{\gamma}{\beta} , \qquad \alpha + 2\beta + \gamma = 2 \]

are obeyed in (a) and at the tricritical point (problem 7.7).

(e) Discuss the hysteresis behavior for a first-order phase transition (u < 0).

7.9 In the Ginzburg–Landau approximation, the spin-spin correlation function is given by

\[ \langle m(x)\, m(x') \rangle = \frac{1}{L^d}\sum_{|k| \le \Lambda} e^{ik(x-x')}\, \frac{1}{2\beta c\,(\xi^{-2} + k^2)} ; \qquad \xi \propto (T - T_c)^{-\frac{1}{2}} . \]

(a) Replace the sum by an integral.

(b) Show that in the limit ξ → ∞, the following relation holds:

\[ \langle m(x)\, m(x') \rangle \propto \frac{1}{|x - x'|^{d-2}} . \]

(c) Show that for d = 3 and large ξ,

\[ \langle m(x)\, m(x') \rangle = \frac{1}{8\pi c\beta}\, \frac{e^{-|x-x'|/\xi}}{|x - x'|} \]

holds.

7.10 Investigate the behavior of the following integral in the limit ξ → ∞:

\[ I = \int_0^{\Lambda\xi} \frac{d^d q}{(2\pi)^d}\, \frac{\xi^{4-d}}{(1 + q^2)^2} , \]

by demonstrating that:
(a) $I \propto \xi^{4-d}$, d < 4;
(b) $I \propto \ln\xi$, d = 4;
(c) $I \propto A - B\xi^{4-d}$, d > 4.

7.11 The phase transition of a molecular zipper, from C. Kittel, American Journal of Physics 37, 917 (1969). A greatly simplified model of the helix-coil transition in polypeptides or DNA, which describes the transition between hydrogen-bond stabilized helices and a molecular coil, is the “molecular zipper”. A molecular zipper consists of N bonds which can be broken from only one direction. It requires an energy ε to break bond p + 1 if all the bonds 1, . . . , p are broken, but an infinite energy if the preceding bonds are not all broken. A broken bond is taken to have G orientations, i.e. its state is G-fold degenerate. The zipper is open when all N − 1 bonds are broken.

(a) Determine the partition function

\[ Z = \frac{1 - x^N}{1 - x} ; \qquad x \equiv G \exp(-\beta\varepsilon) . \]

(b) Determine the average number $\langle s \rangle$ of broken bonds. Investigate $\langle s \rangle$ in the vicinity of $x_c = 1$. Which value does $\langle s \rangle$ assume at $x_c$, and what is the slope there? How does $\langle s \rangle$ behave at $x \gg 1$ and $x \ll 1$?

(c) What would be the partition function if the zipper could be opened from both ends?

7.12 Fluctuations in the Gaussian approximation below $T_c$. Expand the Ginzburg–Landau functional

\[ F[\mathbf{m}] = \int d^d x \left[ a\,\mathbf{m}(x)^2 + \frac{b}{2}\,\mathbf{m}(x)^4 + c\,(\nabla\mathbf{m}(x))^2 - \mathbf{h}\,\mathbf{m}(x) \right] , \]

which is O(n)-symmetrical for h = 0, up to second order in terms of the fluctuations of the order parameter $\mathbf{m}'(x)$. Below $T_c$,

\[ \mathbf{m}(x) = m_1 \mathbf{e}_1 + \mathbf{m}'(x) , \qquad h = 2\left(a + b m_1^2\right) m_1 \]

holds.

(a) Show that for h → 0, the long-wavelength (k → 0) transverse fluctuations $m'_i$ (i = 2, . . . , n) require no “excitation energy” (Goldstone modes), and determine the Gibbs free energy. In which cases do singularities occur?

(b) What is the expression for the specific heat $c_{h=0}$ below $T_c$ in the harmonic approximation? Compare it with the result for the disordered phase.

(c) Calculate the longitudinal and transverse correlation functions relative to the spontaneous magnetization $m_1$,

\[ G_\parallel(x - x') = \langle m'_1(x)\, m'_1(x') \rangle \quad \text{and} \quad G_{\perp ij}(x - x') = \langle m'_i(x)\, m'_j(x') \rangle , \quad i, j = 2, \ldots, n , \]

for d = 3 from their Fourier transforms in the harmonic approximation. Discuss in particular the limiting case h → 0.

7.13 The longitudinal correlation function below $T_c$. The results from problem 7.12 lead us to expect that taking into account the transverse fluctuations just in a harmonic approximation will in general be insufficient. Anharmonic contributions can be incorporated if we fix the length of the vector $\mathbf{m}(x)$ (h = 0), as in the underlying Heisenberg model:

\[ m_1(x)^2 + \sum_{i=2}^{n} m_i(x)^2 = m_0^2 = \text{const.} \]

Compute the Fourier transform $G_\parallel(k)$, by factorizing the four-spin correlation function in a suitable manner into two-spin correlation functions

\[ G_\parallel(x - x') = \frac{1}{4m_0^2}\sum_{i,j=2}^{n} \langle m_i(x)^2\, m_j(x')^2 \rangle \]

and inserting

\[ G_\perp(x - x') = \int \frac{d^d k}{(2\pi)^d}\, \frac{e^{ik(x-x')}}{2\beta c k^2} . \]

Remark: for n ≥ 2 and 2 < d ≤ 4, the relations $G_\perp(k) \propto \frac{1}{k^2}$ and $G_\parallel \propto \frac{1}{k^{4-d}}$ are fulfilled exactly in the limit k → 0.

7.14 Verify the second line in Eq. (7.5.22).

7.15 The Hubbard–Stratonovich transformation: using the identity

\[ \exp\Big\{ -\sum_{i,j} J_{ij} S_i S_j \Big\} = \text{const.} \int_{-\infty}^{\infty} \Big( \prod_i dm_i \Big) \exp\Big\{ -\frac{1}{4}\sum_{i,j} m_i J^{-1}_{ij} m_j \Big\} , \]

show that the partition function of the Ising Hamiltonian $H = \sum_{i,j} J_{ij} S_i S_j$ can be written in the form

\[ Z = \text{const.} \int_{-\infty}^{\infty} \Big( \prod_i dm_i \Big) \exp\big\{ H'(\{m_i\}) \big\} . \]

Give the expansion of H′ in terms of $m_i$ up to the order $O(m^4)$. Caveat: the Ising Hamiltonian must be extended by terms with $J_{ii}$ so that the matrix $J_{ij}$ is positive definite.


7.16 Lattice-gas model. The partition function of a classical gas is to be mapped onto that of an Ising magnet. Method: the d-dimensional configuration space is divided up into N cells. In each cell, there is at most one atom (hard core volume). One can imagine a lattice in which a cell is represented by a lattice site which is either empty or occupied ($n_i$ = 0 or 1). The attractive interaction $U(x_i - x_j)$ between two atoms is to be taken into account in the energy by the term $\frac{1}{2} U_2(i, j)\, n_i n_j$.

(a) The grand partition function for this problem, after integrating out the kinetic energy, is given by

\[ Z_G = \Big( \prod_{i=1}^{N} \sum_{n_i = 0,1} \Big) \exp\Big[ -\beta\Big( -\bar\mu \sum_i n_i + \frac{1}{2}\sum_{ij} U_2(i, j)\, n_i n_j \Big) \Big] , \]

\[ \bar\mu = kT \log z v_0 = \mu - kT \log\Big( \frac{\lambda^d}{v_0} \Big) , \qquad z = \frac{e^{\beta\mu}}{\lambda^d} , \qquad \lambda = \frac{2\pi\hbar}{\sqrt{2\pi m kT}} , \]

where $v_0$ is the volume of a cell.

(b) By introducing spin variables $S_i$ ($n_i = \frac{1}{2}(1 + S_i)$, $S_i = \pm 1$), bring the grand partition function into the form

\[ Z_G = \Big( \prod_{i=1}^{N} \sum_{S_i = -1,1} \Big) \exp\Big[ -\beta\Big( E_0 - \sum_i h S_i - \sum_{ij} J_{ij} S_i S_j \Big) \Big] . \]

Calculate the relations between $E_0$, h, J and μ, $U_2$, $v_0$.

(c) Determine the vapor-pressure curve of the gas from the phase-boundary curve h = 0 of the ferromagnet.

(d) Compute the particle-density correlation function for a lattice gas.

7.17 Demonstrate Eq. (7.4.63) using scaling relations.

7.18 Show that from (7.4.68) in the limit of small k and for h = 0, the longitudinal correlation function

\[ G_\parallel(k) \propto \frac{1}{k^{d-2}} \]

follows.

7.19 Shift of $T_c$ in the Ginzburg–Landau Theory. Start from Eq. (7.4.1) and use the so-called quasi-harmonic approximation in the paramagnetic phase. There the third (nonlinear) term in (7.4.1) is replaced by $6b\,\langle m(x)^2 \rangle\, m(x)$. (a) Justify this approximation. (b) Compute the transition temperature $T_c$ and show that $T_c < T_c^0$.

7.20 Determine the fixed points of the transformation equation (7.5.38).

8. Brownian Motion, Equations of Motion, and the Fokker–Planck Equations

The chapters which follow deal with nonequilibrium processes. First, in chapter 8, we treat the topic of the Langevin equations and the related Fokker–Planck equations. In the next chapter, the Boltzmann equation is discussed; it is fundamental for dealing with the dynamics of dilute gases and also for transport phenomena in condensed matter. In the final chapter, we take up general problems of irreversibility and the transition to equilibrium.

8.1 Langevin Equations

8.1.1 The Free Langevin Equation

8.1.1.1 Brownian Motion

A variety of situations occur in Nature in which one is not interested in the complete dynamics of a many-body system, but instead only in a subset of particular variables. The remaining variables lead through their equations of motion to relatively rapidly varying stochastic forces and to damping effects. Examples are the Brownian motion of a massive particle in a liquid, the equations of motion of conserved densities, and the dynamics of the order parameter in the vicinity of a critical point. We begin by discussing Brownian motion as a basic example of a stochastic process. A heavy particle of mass m and velocity v is supposed to be moving in a liquid consisting of light particles. This “Brownian particle” is subject to random collisions with the molecules of the liquid (Fig. 8.1). These collisions give rise to an average frictional force on the massive particle and to a stochastic force f(t), which fluctuates around its average value as shown in Fig. 8.2. The first contribution, −mζv, is characterized by a coefficient of friction ζ. Under these physical conditions, the Newtonian equation of motion thus becomes the so-called Langevin equation:

\[ m\dot v = -m\zeta v + f(t) . \tag{8.1.1} \]


Fig. 8.1. The Brownian motion

Fig. 8.2. Stochastic forces in Brownian motion

Such equations are referred to as stochastic equations of motion and the processes they describe as stochastic processes.$^1$ The correlation time $\tau_c$ denotes the time during which the fluctuations of the stochastic force remain correlated.$^2$ From this, we assume that the average force and its autocorrelation function at differing times have the following form:$^3$

\[ \langle f(t) \rangle = 0 , \qquad \langle f(t) f(t') \rangle = \phi(t - t') . \tag{8.1.2} \]

Here, φ(τ ) diﬀers noticeably from zero only for τ < τc (Fig. 8.3). Since we are interested in the motion of our Brownian particle over times of order t which are considerably longer than τc , we can approximate φ(τ ) by a delta function φ(τ ) = λδ(τ ) .

(8.1.3)

The coeﬃcient λ is a measure of the strength of the mean square deviation of the stochastic force. Since friction also increases proportionally to the strength of the collisions, there must be a connection between λ and the coeﬃcient of friction ζ. In order to ﬁnd this connection, we ﬁrst solve the Langevin equation (8.1.1).

$^1$ Due to the stochastic force in Eq. (8.1.1), the velocity is also a stochastic quantity, i.e. a random variable.

$^2$ Under the precondition that the collisions of the liquid molecules with the Brownian particle are completely uncorrelated, the correlation time is roughly equal to the duration of a collision. For this time, we obtain $\tau_c \approx \frac{a}{\bar v} = \frac{10^{-6}\,\text{cm}}{10^{5}\,\text{cm/sec}} = 10^{-11}$ sec, where a is the radius of the massive particle and $\bar v$ the average velocity of the molecules of the medium.

$^3$ The mean value can be understood either as an average over independent Brownian particles or as an average over time for a single Brownian particle. In order to fix the higher moments of f(t), we will later assume that f(t) follows a Gaussian distribution, Eq. (8.1.26).


Fig. 8.3. The correlation of the stochastic forces

8.1.1.2 The Einstein Relation

The equation of motion (8.1.1) can be solved with the help of the retarded Green's function G(t), which is defined by

\[ \dot G + \zeta G = \delta(t) , \qquad G(t) = \Theta(t)\, e^{-\zeta t} . \tag{8.1.4} \]

Letting $v_0$ be the initial value of the velocity, one obtains for v(t)

\[ v(t) = v_0 e^{-\zeta t} + \int_0^\infty d\tau\, G(t - \tau) f(\tau)/m = v_0 e^{-\zeta t} + e^{-\zeta t}\int_0^t d\tau\, e^{\zeta\tau} f(\tau)/m . \tag{8.1.5} \]

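Since the dependence of f(τ) is known only statistically, we do not consider the average value of v(t), but instead that of its square, $\langle v(t)^2 \rangle$:

\[ \langle v(t)^2 \rangle = e^{-2\zeta t}\int_0^t d\tau \int_0^t d\tau'\, e^{\zeta(\tau+\tau')}\, \phi(\tau - \tau')\, \frac{1}{m^2} + v_0^2 e^{-2\zeta t} ; \]

here, the cross term vanishes. With Eq. (8.1.3), we obtain

\[ \langle v(t)^2 \rangle = \frac{\lambda}{2\zeta m^2}\left(1 - e^{-2\zeta t}\right) + v_0^2 e^{-2\zeta t} \;\xrightarrow{\; t \gg \zeta^{-1} \;}\; \frac{\lambda}{2\zeta m^2} . \tag{8.1.6} \]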

For $t \gg \zeta^{-1}$, the contribution of $v_0$ becomes negligible and the memory of the initial value is lost. Hence $\zeta^{-1}$ plays the role of a relaxation time. We require that our particle attain thermal equilibrium after long times, $t \gg \zeta^{-1}$, i.e. that the average value of the kinetic energy obey the equipartition theorem

\[ \frac{1}{2} m \langle v(t)^2 \rangle = \frac{1}{2} kT . \tag{8.1.7} \]

From this, we find the so-called Einstein relation

\[ \lambda = 2\zeta m kT . \tag{8.1.8} \]
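The relaxation towards equipartition can be illustrated with a direct simulation of (8.1.1). The following is a minimal sketch, assuming a simple Euler–Maruyama time step with Gaussian white noise of strength λ fixed by (8.1.8); the parameter values, the seed, and the function name are arbitrary choices made for this example.

```python
import math
import random

def simulate_mean_square_velocity(n_particles=2000, n_steps=400, dt=0.02,
                                  zeta=1.0, m=1.0, kT=1.0, v0=3.0, seed=1):
    """Euler-Maruyama integration of m dv/dt = -m*zeta*v + f(t) for many particles."""
    rng = random.Random(seed)
    lam = 2.0 * zeta * m * kT            # Einstein relation (8.1.8)
    kick = math.sqrt(lam * dt) / m       # std. dev. of the random velocity change per step
    total = 0.0
    for _ in range(n_particles):
        v = v0
        for _ in range(n_steps):
            v += -zeta * v * dt + kick * rng.gauss(0.0, 1.0)
        total += v * v
    return total / n_particles           # estimate of <v(t)^2> at t = n_steps * dt

v2 = simulate_mean_square_velocity()
print(v2)   # expected to be close to kT/m = 1, up to statistical and time-step errors
```

Although every particle starts with $v_0^2 = 9$, after a time of several $\zeta^{-1}$ the ensemble average has forgotten the initial value and fluctuates around kT/m, as stated by (8.1.6) and (8.1.7).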

The coefficient of friction ζ is proportional to the mean square deviation λ of the stochastic force.

8.1.1.3 The Velocity Correlation Function

Next, we compute the velocity correlation function:

\[ \langle v(t)\, v(t') \rangle = e^{-\zeta(t+t')}\int_0^t d\tau \int_0^{t'} d\tau'\, e^{\zeta(\tau+\tau')}\, \frac{\lambda}{m^2}\, \delta(\tau - \tau') + v_0^2 e^{-\zeta(t+t')} . \tag{8.1.9} \]

Since the roles of $t$ and $t'$ are interchangeable, we can assume without loss of generality that $t < t'$ and immediately evaluate the two integrals in the order given in this equation, with the result $\frac{\lambda}{2\zeta m^2}\left(e^{2\zeta \min(t,t')} - 1\right)$, thus obtaining finally

$$\langle v(t) v(t') \rangle = \frac{\lambda}{2\zeta m^2}\, e^{-\zeta|t-t'|} + \left(v_0^2 - \frac{\lambda}{2\zeta m^2}\right) e^{-\zeta(t+t')} . \tag{8.1.10}$$

For $t, t' \gg \zeta^{-1}$, the second term in (8.1.10) can be neglected.

8.1.1.4 The Mean Square Deviation

In order to obtain the mean square displacement for $t \gg \zeta^{-1}$, we need only integrate (8.1.10) twice,

$$\langle x(t)^2 \rangle = \int_0^t d\tau \int_0^t d\tau'\, \frac{\lambda}{2\zeta m^2}\, e^{-\zeta|\tau-\tau'|} . \tag{8.1.11}$$

Intermediate calculation for integrals of the type

$$I = \int_0^t d\tau \int_0^t d\tau'\, f(\tau-\tau') .$$

We denote the antiderivative of $f(\tau)$ by $F(\tau)$ and evaluate the integral over $\tau$, $I = \int_0^t d\tau'\, \left(F(t-\tau') - F(-\tau')\right)$. Now we substitute $u = t - \tau'$ into the first term and obtain after integrating by parts

$$I = \int_0^t du\, \left(F(u) - F(-u)\right) = t\left(F(t) - F(-t)\right) - \int_0^t du\, u \left(f(u) + f(-u)\right)$$

and from this the final result

$$\int_0^t d\tau \int_0^t d\tau'\, f(\tau-\tau') = \int_0^t du\, (t-u)\left(f(u) + f(-u)\right) . \tag{8.1.12}$$
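The identity (8.1.12) can be checked numerically. The sketch below, a plain midpoint-rule quadrature written for this purpose, evaluates both sides for $f(s) = e^{-\zeta|s|}$, the function that appears in (8.1.11) up to its prefactor, and compares them with the elementary closed form $2\left[t/\zeta - (1 - e^{-\zeta t})/\zeta^2\right]$. The function names, grid sizes and the values of $\zeta$ and $t$ are illustrative assumptions.

```python
import math

def double_integral(f, t, n=400):
    """Midpoint-rule estimate of int_0^t dtau int_0^t dtau' f(tau - tau')."""
    h = t / n
    total = 0.0
    for i in range(n):
        tau = (i + 0.5) * h
        for j in range(n):
            total += f(tau - (j + 0.5) * h)
    return total * h * h

def single_integral(f, t, n=4000):
    """Midpoint-rule estimate of int_0^t du (t - u) (f(u) + f(-u))."""
    h = t / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        total += (t - u) * (f(u) + f(-u))
    return total * h

zeta, t = 1.5, 3.0
f = lambda s: math.exp(-zeta * abs(s))
lhs = double_integral(f, t)                 # left-hand side of (8.1.12)
rhs = single_integral(f, t)                 # right-hand side of (8.1.12)
closed_form = 2.0 * (t / zeta - (1.0 - math.exp(-zeta * t)) / zeta**2)
print(lhs, rhs, closed_form)
```

For $t \gg \zeta^{-1}$ the closed form is dominated by $2t/\zeta$, which is the approximation used in the next step of the text.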

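With Eq. (8.1.12), it follows for (8.1.11) that

$$\langle x^2(t) \rangle = \frac{\lambda}{2\zeta m^2}\, 2 \int_0^t du\, (t-u)\, e^{-\zeta u} \approx \frac{\lambda}{\zeta^2 m^2}\, t$$

or

$$\langle x^2(t) \rangle = 2Dt \tag{8.1.13}$$

with the diffusion constant

$$D = \frac{\lambda}{2\zeta^2 m^2} = \frac{kT}{\zeta m} . \tag{8.1.14}$$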

It can be seen that $D$ plays the role of a diffusion constant by starting from the equation of continuity for the particle density

$$\dot{n}(x) + \nabla j(x) = 0 \tag{8.1.15a}$$

and the current density

$$j(x) = -D\nabla n(x) . \tag{8.1.15b}$$

The resulting diffusion equation

$$\dot{n}(x) = D\nabla^2 n(x) \tag{8.1.16}$$

has the one-dimensional solution

$$n(x, t) = \frac{N}{\sqrt{4\pi D t}}\, e^{-\frac{x^2}{4Dt}} . \tag{8.1.17}$$

The particle number density $n(x, t)$ from Eq. (8.1.17) describes the spreading out of $N$ particles which were concentrated at $x = 0$ at the time $t = 0$ ($n(x, 0) = N\delta(x)$). That is, the mean square displacement increases with time as $2Dt$. (More general solutions of (8.1.16) can be found from (8.1.17) by superposition.) We can cast the Einstein relation in a more familiar form by introducing the mobility $\mu$ into (8.1.1) in place of the coefficient of friction. The Langevin equation then reads

$$m\ddot{x} = -\mu^{-1}\dot{x} + f \quad \text{with} \quad \mu = \frac{1}{\zeta m} , \tag{8.1.18}$$

and the Einstein relation takes on the form

$$D = \mu kT . \tag{8.1.19}$$

The diffusion constant is thus proportional to the mobility of the particle and to the temperature.
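The statement that the mean square displacement of the solution (8.1.17) grows as $2Dt$ can be verified by direct quadrature. The following sketch computes the zeroth and second moments of $n(x, t)$ with a midpoint rule; the function names, the integration window and the values $D = 0.5$, $N = 1$ are assumptions chosen only for illustration.

```python
import math

def density(x, t, D=0.5, N=1.0):
    """One-dimensional solution (8.1.17) of the diffusion equation."""
    return N / math.sqrt(4.0 * math.pi * D * t) * math.exp(-x * x / (4.0 * D * t))

def moment(order, t, D=0.5, N=1.0, half_width=40.0, n=20000):
    """Midpoint-rule estimate of int dx x^order n(x, t) over [-half_width, half_width]."""
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        total += x**order * density(x, t, D, N)
    return total * h

for t in (0.5, 2.0, 8.0):
    # columns: t, particle number, mean square displacement, 2 D t
    print(t, moment(0, t), moment(2, t) / moment(0, t), 2.0 * 0.5 * t)
```

The second column stays equal to $N$ (particle number conservation), while the third column follows $2Dt$.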

Remarks:

(i) In a simplified version of Einstein's⁴ historical derivation of (8.1.19), we treat (instead of the osmotic pressure in a force field) the dynamic origin of the barometric pressure formula. The essential consideration is that in a gravitational field there are two currents which must compensate each other in equilibrium. They are the diffusion current $-D \frac{\partial}{\partial z} n(z)$ and the current of particles falling in the gravitational field, $\bar{v}\, n(z)$. Here, $n(z)$ is the particle number density and $\bar{v}$ is the mean velocity of falling, which, due to friction, is found from $\mu^{-1} \bar{v} = -mg$. Since the sum of these two currents must vanish, we find the condition

$$-D \frac{\partial}{\partial z} n(z) - mg\mu\, n(z) = 0 . \tag{8.1.20}$$

From this, the barometric pressure formula $n(z) \propto e^{-\frac{mgz}{kT}}$ is obtained if the Einstein relation (8.1.19) is fulfilled.

(ii) In the Brownian motion of a sphere in a liquid with the viscosity constant $\eta$, the frictional force is given by Stokes' law, $F_{\mathrm{fr}} = 6\pi a \eta \dot{x}$, where $a$ is the radius and $\dot{x}$ the velocity of the sphere. Then the diffusion constant is $D = kT/6\pi a\eta$ and the mean square displacement of the sphere is given by

$$\langle x^2(t) \rangle = \frac{kT\, t}{3\pi a \eta} . \tag{8.1.21}$$

Using this relation, an observation of $\langle x^2(t) \rangle$ allows the experimental determination of the Boltzmann constant $k$.

8.1.2 The Langevin Equation in a Force Field

As a generalization of the preceding treatment, we now consider the Brownian motion in an external force field

$$F(x) = -\frac{\partial V}{\partial x} . \tag{8.1.22a}$$

Then the Langevin equation is given by

$$m\ddot{x} = -m\zeta\dot{x} + F(x) + f(t) , \tag{8.1.22b}$$

where we assume that the collisions and frictional effects of the molecules are not modified by the external force and therefore the stochastic force $f(t)$ again obeys (8.1.2), (8.1.3), and (8.1.8).⁵ An important special case of (8.1.22b) is the limiting case of strong damping $\zeta$. When the inequality $m\zeta\dot{x} \gg m\ddot{x}$ is fulfilled (as is the case e.g. for periodic motion at low frequencies), it follows from (8.1.22b) that

$$\dot{x} = -\Gamma \frac{\partial V}{\partial x} + r(t) , \tag{8.1.23}$$

where the damping constant $\Gamma$ and the fluctuating force $r(t)$ are given by

$$\Gamma \equiv \frac{1}{m\zeta} \quad \text{and} \quad r(t) \equiv \frac{1}{m\zeta} f(t) . \tag{8.1.24}$$

The stochastic force $r(t)$, according to Eqns. (8.1.2) and (8.1.3), obeys the relation

$$\langle r(t) \rangle = 0 , \qquad \langle r(t) r(t') \rangle = 2\Gamma kT\, \delta(t-t') . \tag{8.1.25}$$

⁴ See the reference at the end of this chapter.
⁵ We will later see that the Einstein relation (8.1.8) ensures that the function $\exp\left(-\left(\frac{p^2}{2m} + V(x)\right)/kT\right)$ be an equilibrium distribution for this stochastic process.

For the characterization of the higher moments (correlation functions) of $r(t)$, we will further assume in the following that $r(t)$ follows a Gaussian distribution

$$\mathcal{P}[r(t)] = e^{-\int_{t_0}^{t_f} dt\, \frac{r^2(t)}{4\Gamma kT}} . \tag{8.1.26}$$

$\mathcal{P}[r(t)]$ gives the probability density for the values of $r(t)$ in the interval $[t_0, t_f]$, where $t_0$ and $t_f$ are the initial and final times. To define the functional integration, we subdivide the interval into

$$N = \frac{t_f - t_0}{\Delta}$$

small subintervals of width $\Delta$ and introduce the discrete times

$$t_i = t_0 + i\Delta , \qquad i = 0, \ldots, N-1 .$$

The element of the functional integration $\mathcal{D}[r]$ is defined by

$$\mathcal{D}[r] \equiv \lim_{\Delta \to 0} \prod_{i=0}^{N-1} \left( dr(t_i) \sqrt{\frac{\Delta}{4\Gamma kT\pi}} \right) . \tag{8.1.27}$$

The normalization of the probability density is

$$\int \mathcal{D}[r]\, \mathcal{P}[r(t)] \equiv \lim_{\Delta \to 0} \prod_{i=0}^{N-1} \left( \int dr(t_i) \sqrt{\frac{\Delta}{4\Gamma kT\pi}} \right) e^{-\sum_i \Delta \frac{r^2(t_i)}{4\Gamma kT}} = 1 . \tag{8.1.28}$$

As a check, we calculate

$$\langle r(t_i) r(t_j) \rangle = \frac{4\Gamma kT}{2\Delta}\, \delta_{ij} = 2\Gamma kT\, \frac{\delta_{ij}}{\Delta} \to 2\Gamma kT\, \delta(t_i - t_j) ,$$

which is in agreement with Eqns. (8.1.2), (8.1.3) and (8.1.8).
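The discretization above translates directly into a recipe for generating the noise on a time grid: each $r(t_i)$ is an independent Gaussian random number with variance $2\Gamma kT/\Delta$. The sketch below draws such samples and estimates the correlations; the helper names `sample_noise` and `estimate_correlations`, the sample size and the parameter values are assumptions of this example.

```python
import math
import random

def sample_noise(n_bins, delta, gamma=1.0, kT=1.0, rng=random):
    """One realization r(t_0), ..., r(t_{N-1}) drawn with the Gaussian weight (8.1.26)."""
    sigma = math.sqrt(2.0 * gamma * kT / delta)   # variance 2 Gamma kT / Delta per bin
    return [rng.gauss(0.0, sigma) for _ in range(n_bins)]

def estimate_correlations(n_samples=20000, delta=0.1, seed=1):
    """Estimate <r(t_0) r(t_0)> and <r(t_0) r(t_1)> from n_samples realizations."""
    rng = random.Random(seed)
    same = cross = 0.0
    for _ in range(n_samples):
        r = sample_noise(2, delta, rng=rng)
        same += r[0] * r[0]
        cross += r[0] * r[1]
    return same / n_samples, cross / n_samples

same, cross = estimate_correlations()
print("<r_i r_i> =", same, " expected 2 Gamma kT / Delta =", 2.0 / 0.1)
print("<r_i r_j> =", cross, " expected 0 for i != j")
```

The factor $1/\Delta$ in the variance is the discrete counterpart of the $\delta$-function: integrated over one time step, the noise contributes a displacement of variance $2\Gamma kT \Delta$.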

Since Langevin equations of the type (8.1.23) occur in a variety of different physical situations, we want to add some elementary explanations. We first consider (8.1.23) without the stochastic force, i.e. $\dot{x} = -\Gamma \frac{\partial V}{\partial x}$. In regions of positive (negative) slope of $V(x)$, $x$ will be shifted in the negative (positive) $x$ direction. The coordinate $x$ moves in the direction of one of the minima of $V(x)$ (see Fig. 8.4). At the extrema of $V(x)$, $\dot{x}$ vanishes. The effect of the stochastic force $r(t)$ is that the motion towards the minima becomes fluctuating, and even at its extreme positions the particle is not at rest, but instead is continually pushed away, so that the possibility exists of a transition from one minimum into another. The calculation of such transition rates is of interest for, among other applications, thermally activated hopping of impurities in solids and for chemical reactions (see Sect. 8.3.2).

Fig. 8.4. The motion resulting from the equation of motion x˙ = −Γ ∂V /∂x.
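The deterministic part of this motion is easy to reproduce. The following sketch integrates $\dot{x} = -\Gamma\, \partial V/\partial x$ with explicit Euler steps for a double-well potential $V(x) = (x^2 - 1)^2$; this potential, the step size and the function names are assumptions chosen for illustration and do not come from the text.

```python
def dV(x):
    """Slope of the illustrative double-well potential V(x) = (x^2 - 1)^2."""
    return 4.0 * x * (x * x - 1.0)

def relax(x0, gamma=1.0, dt=1e-3, n_steps=20000):
    """Integrate dx/dt = -gamma * dV/dx (no stochastic force) with Euler steps."""
    x = x0
    for _ in range(n_steps):
        x -= gamma * dV(x) * dt
    return x

for x0 in (-1.8, -0.2, 0.0, 0.3, 2.0):
    print(x0, "->", round(relax(x0), 6))
```

Starting points on either side of the maximum at $x = 0$ end up in the nearer minimum at $x = \pm 1$, while a particle placed exactly on the maximum stays there; only the stochastic force $r(t)$ can push it away.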

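8.2 The Derivation of the Fokker–Planck Equation from the Langevin Equation

Next, we wish to derive equations of motion for the probability densities in the Langevin equations (8.1.1), (8.1.22b), and (8.1.23).

8.2.1 The Fokker–Planck Equation for the Langevin Equation (8.1.1)

We define

$$P(\xi, t) = \left\langle \delta\big(\xi - v(t)\big) \right\rangle , \tag{8.2.1}$$

the probability density for the event that the Brownian particle has the velocity $\xi$ at the time $t$. This means that $P(\xi, t)\,d\xi$ is the probability that the velocity lies within the interval $[\xi, \xi + d\xi]$.

We now derive an equation of motion for $P(\xi, t)$:

$$\begin{aligned}
\frac{\partial}{\partial t} P(\xi, t) &= -\frac{\partial}{\partial \xi} \left\langle \delta\big(\xi - v(t)\big)\, \dot{v}(t) \right\rangle \\
&= -\frac{\partial}{\partial \xi} \left\langle \delta\big(\xi - v(t)\big) \left( -\zeta v(t) + \frac{1}{m} f(t) \right) \right\rangle \\
&= -\frac{\partial}{\partial \xi} \left\langle \delta\big(\xi - v(t)\big) \left( -\zeta \xi + \frac{1}{m} f(t) \right) \right\rangle \\
&= \frac{\partial}{\partial \xi} \big( \zeta P(\xi, t)\, \xi \big) - \frac{1}{m} \frac{\partial}{\partial \xi} \left\langle \delta\big(\xi - v(t)\big)\, f(t) \right\rangle ,
\end{aligned} \tag{8.2.2}$$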

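where the Langevin equation (8.1.1) has been inserted in the second line. To compute the last term, we require the probability density for the stochastic force, assumed to follow a Gaussian distribution:

$$\mathcal{P}[f(t)] = e^{-\int_{t_0}^{t_f} dt\, \frac{f^2(t)}{4\zeta m kT}} . \tag{8.2.3}$$

The averages $\langle \ldots \rangle$ are given by the functional integral with the weight (8.2.3) (see Eq. (8.1.26)). In particular, for the last term in (8.2.2), we obtain

$$\begin{aligned}
\left\langle \delta\big(\xi - v(t)\big) f(t) \right\rangle &= \int \mathcal{D}[f(t')]\, \delta\big(\xi - v(t)\big)\, f(t)\, e^{-\int \frac{f(t')^2\, dt'}{4\zeta m kT}} \\
&= -2\zeta m kT \int \mathcal{D}[f(t')]\, \delta\big(\xi - v(t)\big)\, \frac{\delta}{\delta f(t)}\, e^{-\int \frac{f(t')^2\, dt'}{4\zeta m kT}} \\
&= 2\zeta m kT \int \mathcal{D}[f(t')]\, e^{-\int \frac{f(t')^2\, dt'}{4\zeta m kT}}\, \frac{\delta}{\delta f(t)}\, \delta\big(\xi - v(t)\big) \\
&= 2\zeta m kT \left\langle \frac{\delta}{\delta f(t)}\, \delta\big(\xi - v(t)\big) \right\rangle = -2\zeta m kT\, \frac{\partial}{\partial \xi} \left\langle \delta\big(\xi - v(t)\big)\, \frac{\delta v(t)}{\delta f(t)} \right\rangle .
\end{aligned} \tag{8.2.4}$$

Here, we have to use the solution (8.1.5)

$$v(t) = v_0\, e^{-\zeta t} + \int_0^\infty d\tau\, G(t-\tau)\, \frac{f(\tau)}{m} \tag{8.1.5}$$

and take the derivative with respect to $f(t)$. With $\frac{\delta f(\tau)}{\delta f(t)} = \delta(\tau - t)$ and (8.1.4), we obtain

$$\frac{\delta v(t)}{\delta f(t)} = \frac{1}{m} \int_0^t d\tau\, e^{-\zeta(t-\tau)}\, \delta(t-\tau) = \frac{1}{2m} . \tag{8.2.5}$$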

The factor $\frac{1}{2}$ results from the fact that the integration interval includes only half of the $\delta$-function. Inserting (8.2.5) into (8.2.4) and (8.2.4) into (8.2.2), we obtain the equation of motion for the probability density, the Fokker–Planck equation:

$$\frac{\partial}{\partial t} P(v, t) = \zeta \frac{\partial}{\partial v} \big( v P(v, t) \big) + \zeta \frac{kT}{m} \frac{\partial^2}{\partial v^2} P(v, t) . \tag{8.2.6}$$

Here, we have replaced the velocity $\xi$ by $v$; it is not to be confused with the stochastic variable $v(t)$. This relation can also be written in the form of an equation of continuity

$$\frac{\partial}{\partial t} P(v, t) = -\zeta \frac{\partial}{\partial v} \left( -v P(v, t) - \frac{kT}{m} \frac{\partial}{\partial v} P(v, t) \right) . \tag{8.2.7}$$

Remarks:

(i) The current density, the expression in large parentheses, is composed of a drift term and a diffusion current.

(ii) The current density vanishes if the probability density has the form $P(v, t) \propto e^{-\frac{mv^2}{2kT}}$. The Maxwell distribution is thus (at least one) equilibrium distribution. Here, the Einstein relation (8.1.8) plays a decisive role. Conversely, we could have obtained the Einstein relation by requiring that the Maxwell distribution be a solution of the Fokker–Planck equation.

(iii) We shall see in Sect. 8.3.1 that $P(v, t)$ becomes the Maxwell distribution in the course of time, and that the latter is therefore the only equilibrium distribution of the Fokker–Planck equation (8.2.6).
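Remark (ii) can be made concrete with a few lines of Python. The sketch below evaluates the expression in large parentheses in (8.2.7) for the Maxwell distribution by a central difference, once with the diffusion coefficient fixed by the Einstein relation and once with a deliberately mismatched value. The names `maxwell`, `current` and `kT_noise`, as well as the grid and parameter values, are assumptions made for this example.

```python
import math

def maxwell(v, m=1.0, kT=1.0):
    """Unnormalized Maxwell distribution exp(-m v^2 / 2kT)."""
    return math.exp(-m * v * v / (2.0 * kT))

def current(P, v, kT_noise, m=1.0, h=1e-4):
    """Expression in large parentheses in (8.2.7): -v P - (kT/m) dP/dv.

    kT_noise is the temperature entering the diffusion term, i.e. the noise
    strength lambda / (2 zeta m); dP/dv is approximated by a central difference.
    """
    dP = (P(v + h) - P(v - h)) / (2.0 * h)
    return -v * P(v) - (kT_noise / m) * dP

grid = [0.25 * i for i in range(-12, 13)]
matched = max(abs(current(maxwell, v, kT_noise=1.0)) for v in grid)
mismatched = max(abs(current(maxwell, v, kT_noise=1.5)) for v in grid)
print("max |j|, noise obeys the Einstein relation:", matched)
print("max |j|, noise 50% too strong:             ", mismatched)
```

Only when the noise strength and the temperature of the Maxwell distribution are tied together by (8.1.8) do drift and diffusion currents cancel everywhere.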

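8.2.2 Derivation of the Smoluchowski Equation for the Overdamped Langevin Equation, (8.1.23)

For the stochastic equation of motion (8.1.23),

$$\dot{x} = -\Gamma \frac{\partial V}{\partial x} + r(t) , \tag{8.1.23}$$

we can also define a probability density

$$P(\xi, t) = \left\langle \delta\big(\xi - x(t)\big) \right\rangle , \tag{8.2.8}$$

where $P(\xi, t)\,d\xi$ is the probability of finding the particle at time $t$ at the position $\xi$ in the interval $d\xi$. We now derive an equation of motion for $P(\xi, t)$, performing the operation ($F(x) \equiv -\frac{\partial V}{\partial x}$)

$$\begin{aligned}
\frac{\partial}{\partial t} P(\xi, t) &= -\frac{\partial}{\partial \xi} \left\langle \delta\big(\xi - x(t)\big)\, \dot{x}(t) \right\rangle \\
&= -\frac{\partial}{\partial \xi} \left\langle \delta\big(\xi - x(t)\big) \big( \Gamma F(x) + r(t) \big) \right\rangle \\
&= -\frac{\partial}{\partial \xi} \big( \Gamma P(\xi, t) F(\xi) \big) - \frac{\partial}{\partial \xi} \left\langle \delta\big(\xi - x(t)\big)\, r(t) \right\rangle .
\end{aligned} \tag{8.2.9}$$

The overdamped Langevin equation was inserted in the second line. For the last term, we find in analogy to Eq. (8.2.4)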

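$$\begin{aligned}
\left\langle \delta\big(\xi - x(t)\big)\, r(t) \right\rangle &= 2\Gamma kT \left\langle \frac{\delta}{\delta r(t)}\, \delta\big(\xi - x(t)\big) \right\rangle \\
&= -2\Gamma kT\, \frac{\partial}{\partial \xi} \left\langle \delta\big(\xi - x(t)\big)\, \frac{\delta x(t)}{\delta r(t)} \right\rangle = -\Gamma kT\, \frac{\partial}{\partial \xi} P(\xi, t) .
\end{aligned} \tag{8.2.10}$$

Here, we have integrated (8.1.23) between $0$ and $t$,

$$x(t) = x(0) + \int_0^t d\tau\, \Big( \Gamma F\big(x(\tau)\big) + r(\tau) \Big) , \tag{8.2.11}$$

from which it follows that

$$\frac{\delta x(t)}{\delta r(t')} = \int_0^t d\tau \left( \frac{\partial\, \Gamma F(x(\tau))}{\partial x(\tau)}\, \frac{\delta x(\tau)}{\delta r(t')} + \delta(t' - \tau) \right) . \tag{8.2.12}$$

The derivative is $\frac{\delta x(\tau)}{\delta r(t')} = 0$ for $\tau < t'$ due to causality and is nonzero only for $\tau \geq t'$, with a finite value at $\tau = t'$. We thus obtain

$$\frac{\delta x(t)}{\delta r(t')} = \int_0^t d\tau\, \frac{\partial\, \Gamma F(x(\tau))}{\partial x(\tau)}\, \frac{\delta x(\tau)}{\delta r(t')} + 1 \quad \text{for } t' < t \tag{8.2.13a}$$

and

$$\frac{\delta x(t)}{\delta r(t')} = \frac{1}{2} \quad \text{for } t' = t . \tag{8.2.13b}$$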

This demonstrates the last step in (8.2.10). From (8.2.10) and (8.2.9), we obtain the equation of motion for $P(\xi, t)$, the so-called Smoluchowski equation

$$\frac{\partial}{\partial t} P(\xi, t) = -\frac{\partial}{\partial \xi} \big( \Gamma P(\xi, t) F(\xi) \big) + \Gamma kT\, \frac{\partial^2}{\partial \xi^2} P(\xi, t) . \tag{8.2.14}$$

Remarks:

(i) One can cast the Smoluchowski equation (8.2.14) in the form of an equation of continuity

$$\frac{\partial}{\partial t} P(x, t) = -\frac{\partial}{\partial x} j(x, t) , \tag{8.2.15a}$$

with the current density

$$j(x, t) = -\Gamma \left( kT \frac{\partial}{\partial x} - F(x) \right) P(x, t) . \tag{8.2.15b}$$

The current density $j(x, t)$ is composed of a diffusion term and a drift term, in that order.

(ii) Clearly,

$$P(x, t) \propto e^{-V(x)/kT} \tag{8.2.16}$$

is a stationary solution of the Smoluchowski equation. For this solution, $j(x, t)$ is zero.
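The stationary solution (8.2.16) can also be observed directly in a simulation of the overdamped Langevin equation (8.1.23). The sketch below uses an Euler-Maruyama step for a harmonic potential $V(x) = \frac{1}{2} k_s x^2$, for which (8.2.16) predicts $\langle x^2 \rangle = kT/k_s$. The spring constant `k_spring`, the integration scheme and all numerical values are assumptions of this example.

```python
import math
import random

def stationary_variance(k_spring=2.0, gamma=1.0, kT=1.0, dt=0.01,
                        n_steps=300, n_particles=3000, seed=2):
    """Euler-Maruyama integration of dx/dt = -gamma dV/dx + r(t), V = k_spring x^2 / 2.

    r(t) has the correlation (8.1.25); returns <x^2> at t = n_steps * dt.
    """
    rng = random.Random(seed)
    kick = math.sqrt(2.0 * gamma * kT * dt)    # std. dev. of the noise increment
    total = 0.0
    for _ in range(n_particles):
        x = 0.0
        for _ in range(n_steps):
            x += -gamma * k_spring * x * dt + kick * rng.gauss(0.0, 1.0)
        total += x * x
    return total / n_particles

x2 = stationary_variance()
print("simulated <x^2> =", x2, " Boltzmann value kT/k_spring =", 1.0 / 2.0)
```

All particles start at the minimum, so the variance grows from zero and saturates at the Boltzmann value once $t \gg (\Gamma k_s)^{-1}$.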

8.2.3 The Fokker–Planck Equation for the Langevin Equation (8.1.22b)

For the general Langevin equation, (8.1.22b), we define the probability density

$$P(x, v, t) = \left\langle \delta\big(x - x(t)\big)\, \delta\big(v - v(t)\big) \right\rangle . \tag{8.2.17}$$

Here, we must distinguish carefully between the quantities $x$ and $v$ and the stochastic variables $x(t)$ and $v(t)$. The meaning of the probability density $P(x, v, t)$ can be characterized as follows: $P(x, v, t)\,dx\,dv$ is the probability of finding the particle in the interval $[x, x + dx]$ with a velocity in $[v, v + dv]$. The equation of motion of $P(x, v, t)$, the generalized Fokker–Planck equation

$$\frac{\partial}{\partial t} P + v \frac{\partial P}{\partial x} + \frac{F(x)}{m} \frac{\partial P}{\partial v} = \zeta \left[ \frac{\partial}{\partial v} (vP) + \frac{kT}{m} \frac{\partial^2 P}{\partial v^2} \right] \tag{8.2.18}$$

follows from a series of steps similar to those in Sect. 8.2.2; see problem 8.1.

8.3 Examples and Applications

In this section, the Fokker–Planck equation for free Brownian motion will be solved exactly. In addition, we will show in general for the Smoluchowski equation that the distribution function relaxes towards the equilibrium situation. In this connection, a relation to supersymmetric quantum mechanics will also be pointed out. Furthermore, two important applications of the Langevin equations or the Fokker–Planck equations will be given: the transition rates in chemical reactions and the dynamics of critical phenomena.

8.3.1 Integration of the Fokker–Planck Equation (8.2.6)

We now want to solve the Fokker–Planck equation for the free Brownian motion, (8.2.6):

$$\dot{P}(v) = \zeta \frac{\partial}{\partial v} \left[ P v + \frac{kT}{m} \frac{\partial P}{\partial v} \right] . \tag{8.3.1}$$

We expect that $P(v)$ will relax towards the Maxwell distribution, $e^{-\frac{mv^2}{2kT}}$, following the relaxation law $e^{-\zeta t}$. This makes it reasonable to introduce the variable $\rho = v e^{\zeta t}$ in place of $v$. Then we have

$$P(v, t) = P(\rho e^{-\zeta t}, t) \equiv Y(\rho, t) , \tag{8.3.2a}$$

$$\frac{\partial P}{\partial v} = \frac{\partial Y}{\partial \rho}\, e^{\zeta t} , \qquad \frac{\partial^2 P}{\partial v^2} = \frac{\partial^2 Y}{\partial \rho^2}\, e^{2\zeta t} , \tag{8.3.2b}$$

$$\frac{\partial P}{\partial t} = \frac{\partial Y}{\partial \rho} \frac{\partial \rho}{\partial t} + \frac{\partial Y}{\partial t} = \zeta\rho\, \frac{\partial Y}{\partial \rho} + \frac{\partial Y}{\partial t} . \tag{8.3.2c}$$

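Inserting (8.3.2a–c) into (8.2.6) or (8.3.1), we obtain

$$\frac{\partial Y}{\partial t} = \zeta Y + \zeta \frac{kT}{m} \frac{\partial^2 Y}{\partial \rho^2}\, e^{2\zeta t} . \tag{8.3.3}$$

This suggests the substitution $Y = \chi e^{\zeta t}$. Since $\frac{\partial Y}{\partial t} = \frac{\partial \chi}{\partial t} e^{\zeta t} + \zeta Y$, it follows that

$$\frac{\partial \chi}{\partial t} = \zeta \frac{kT}{m} \frac{\partial^2 \chi}{\partial \rho^2}\, e^{2\zeta t} . \tag{8.3.4}$$

Now we introduce a new time variable by means of $d\vartheta = e^{2\zeta t} dt$,

$$\vartheta = \frac{1}{2\zeta} \left( e^{2\zeta t} - 1 \right) , \tag{8.3.5}$$

where $\vartheta(t = 0) = 0$. We then find from (8.3.4) the diffusion equation

$$\frac{\partial \chi}{\partial \vartheta} = \zeta \frac{kT}{m} \frac{\partial^2 \chi}{\partial \rho^2} \tag{8.3.6}$$

with its solution known from (8.1.17),

$$\chi(\rho, \vartheta) = \frac{1}{\sqrt{4\pi q \vartheta}}\, e^{-\frac{(\rho - \rho_0)^2}{4 q \vartheta}} ; \qquad q = \zeta \frac{kT}{m} . \tag{8.3.7}$$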

By returning to the original variables $v$ and $t$, we find the following solution

$$P(v, t) = \chi e^{\zeta t} = \left[ \frac{m}{2\pi kT (1 - e^{-2\zeta t})} \right]^{1/2} e^{-\frac{m (v - v_0 e^{-\zeta t})^2}{2kT (1 - e^{-2\zeta t})}} \tag{8.3.8}$$

of the Fokker–Planck equation (8.2.6), which describes Brownian motion in the absence of external forces. The solution of the Smoluchowski equation (8.2.14) for a harmonic potential is also contained in (8.3.8). We now discuss the most important properties and consequences of the solution (8.3.8). In the limiting case $t \to 0$, we have

$$\lim_{t \to 0} P(v, t) = \delta(v - v_0) . \tag{8.3.9a}$$

In the limit of long times, $t \to \infty$, the result is

$$\lim_{t \to \infty} P(v, t) = e^{-mv^2/2kT} \left( \frac{m}{2\pi kT} \right)^{1/2} . \tag{8.3.9b}$$
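The solution (8.3.8) can be tested numerically without repeating the chain of substitutions. The sketch below inserts it into the Fokker–Planck equation (8.2.6) using finite differences and, in addition, computes its norm, mean and variance by quadrature; these should equal $1$, $v_0 e^{-\zeta t}$ and $\frac{kT}{m}(1 - e^{-2\zeta t})$. Function names, step sizes and the parameter values ($v_0 = 2$, $\zeta = kT = m = 1$) are assumptions of this example.

```python
import math

def P(v, t, v0=2.0, zeta=1.0, kT=1.0, m=1.0):
    """Solution (8.3.8) of the Fokker-Planck equation (8.2.6)."""
    s = 1.0 - math.exp(-2.0 * zeta * t)
    norm = math.sqrt(m / (2.0 * math.pi * kT * s))
    return norm * math.exp(-m * (v - v0 * math.exp(-zeta * t))**2 / (2.0 * kT * s))

def residual(v, t, zeta=1.0, kT=1.0, m=1.0, h=1e-3):
    """Finite-difference residual dP/dt - zeta d(vP)/dv - zeta (kT/m) d2P/dv2."""
    dPdt = (P(v, t + h) - P(v, t - h)) / (2.0 * h)
    dvP = ((v + h) * P(v + h, t) - (v - h) * P(v - h, t)) / (2.0 * h)
    d2P = (P(v + h, t) - 2.0 * P(v, t) + P(v - h, t)) / (h * h)
    return dPdt - zeta * dvP - zeta * (kT / m) * d2P

def moments(t, half_width=12.0, n=24000):
    """Norm, mean and variance of P(v, t) by the midpoint rule."""
    h = 2.0 * half_width / n
    vs = [-half_width + (i + 0.5) * h for i in range(n)]
    norm = sum(P(v, t) for v in vs) * h
    mean = sum(v * P(v, t) for v in vs) * h
    var = sum((v - mean)**2 * P(v, t) for v in vs) * h
    return norm, mean, var

t = 0.7
print("max residual:", max(abs(residual(0.5 * i, t)) for i in range(-6, 9)))
print("norm, mean, variance:", moments(t))
print("expected mean, variance:", 2.0 * math.exp(-t), 1.0 - math.exp(-2.0 * t))
```

The mean decays with the relaxation time $\zeta^{-1}$, while the variance approaches its equilibrium value $kT/m$ twice as fast, in agreement with (8.1.6).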

Remark: Since $P(v, t)$ has the property (8.3.9a), we also have found the conditional probability density in (8.3.8)⁶

$$P(v, t \mid v_0, t_0) = P(v, t - t_0) . \tag{8.3.10}$$

This is not surprising. Since, as a result of (8.1.1), (8.1.2) and (8.1.3), a Markov process⁷ is specified, $P(v, t \mid v_0, t_0)$ likewise obeys the Fokker–Planck equation (8.2.6).

⁶ The conditional probability $P(v, t \mid v_0, t_0)$ gives the probability that at time $t$ the value $v$ occurs, under the condition that it was $v_0$ at the time $t_0$.

For an arbitrary integrable and normalized initial probability density $\rho(v_0)$ at time $t_0$,

$$\int dv_0\, \rho(v_0) = 1 , \tag{8.3.11}$$

we find with (8.3.8) the time dependence

$$\rho(v, t) = \int dv_0\, P(v, t - t_0)\, \rho(v_0) . \tag{8.3.12}$$

Clearly, $\rho(v, t)$ fulfills the initial condition

$$\lim_{t \to t_0} \rho(v, t) = \rho(v_0) , \tag{8.3.13a}$$

while for long times

$$\lim_{t \to \infty} \rho(v, t) = e^{-\frac{mv^2}{2kT}} \left( \frac{m}{2\pi kT} \right)^{1/2} \int dv_0\, \rho(v_0) = e^{-\frac{mv^2}{2kT}} \left( \frac{m}{2\pi kT} \right)^{1/2} \tag{8.3.13b}$$

the Maxwell distribution is obtained. Therefore, for the Fokker–Planck equation (8.2.6), and for the Smoluchowski equation with a harmonic potential, (8.2.14), we have proved that an arbitrary initial distribution relaxes towards the Maxwell distribution, (8.3.13b). The function (8.3.8) is also used, by the way, in Wilson's exact renormalization group transformation for the continuous partial elimination of short-wavelength critical fluctuations.⁸

8.3.2 Chemical Reactions

We now wish to calculate the thermally activated transition over a barrier (Fig. 8.5). An obvious physical application is the motion of an impurity atom in a solid from one local minimum of the lattice potential into another. Certain chemical reactions can also be described on this basis. Here, $x$ refers to the reaction coordinate, which characterizes the state of the molecule. The vicinity of the point A can, for example, refer to an excited state of a molecule, while B signifies the dissociated molecule. The transition from A to B takes place via configurations which have higher energies and is made possible by the thermal energy supplied by the surrounding medium. We formulate the following calculation in the language of chemical reactions.

⁷ A Markov process denotes a stochastic process in which all the conditional probabilities depend only on the last time which occurs in the conditions; e.g. $P(t_3, v_3 \mid t_2, v_2; t_1, v_1) = P(t_3, v_3 \mid t_2, v_2)$, where $t_1 \leq t_2 \leq t_3$.
⁸ K. G. Wilson and J. Kogut, Phys. Rep. 12C, 75 (1974).

Fig. 8.5. A thermally activated transition over a barrier from the minimum A into the minimum B

We require the reaction rate (also called the transition rate), i.e. the transition probability per unit time for the conversion of type A into type B. We assume that friction is so strong that we can employ the Smoluchowski equation (8.2.15a,b),

$$\dot{P} = -\frac{\partial}{\partial x} j(x) . \tag{8.2.15a}$$

Integration of this equation between the points $x_\alpha$ and $x_\beta$ yields

$$\frac{d}{dt} \int_{x_\alpha}^{x_\beta} dx\, P = -j(x_\beta) + j(x_\alpha) , \tag{8.3.14}$$

where $x_\beta$ lies between the points A and B. It then follows that $j(x_\beta)$ is the transition rate between the states (the chemical species) A and B. To calculate $j(x_\beta)$, we assume that the barrier is sufficiently high so that the transition rate is small. Then in fact all the molecules will be in the region of the minimum A and will occupy states there according to the thermal distribution. The few molecules which have reached state B can be imagined to be filtered out. The strategy of our calculation is to find a stationary solution $P(x)$ which has the properties

$$P(x) = \frac{1}{Z}\, e^{-V(x)/kT} \quad \text{in the vicinity of A} \tag{8.3.15a}$$

$$P(x) = 0 \quad \text{in the vicinity of B} . \tag{8.3.15b}$$

From the requirement of stationarity, it follows that

$$0 = \Gamma \frac{\partial}{\partial x} \left( kT \frac{\partial}{\partial x} + \frac{\partial V}{\partial x} \right) P , \tag{8.3.16}$$

from which we find by integrating once

$$\Gamma \left( kT \frac{\partial}{\partial x} + \frac{\partial V}{\partial x} \right) P = -j_0 . \tag{8.3.17}$$

The integration constant $j_0$ plays the role of the current density which, owing to the fact that (8.2.14) is source-free between A and B, is independent of $x$.

This integration constant can be determined from the boundary conditions given above. We make use of the following Ansatz for $P(x)$:

$$P(x) = e^{-V/kT}\, \hat{P} \tag{8.3.18}$$

in equation (8.3.17)

$$\frac{\partial}{\partial x} \hat{P} = -\frac{j_0}{kT\Gamma}\, e^{V(x)/kT} . \tag{8.3.17a}$$

Integrating this equation from A to $x$, we obtain

$$\hat{P}(x) = \text{const.} - \frac{j_0}{kT\Gamma} \int_A^x dx\, e^{V(x)/kT} . \tag{8.3.17b}$$

The boundary condition at A, that there $P$ follows the thermal equilibrium distribution, requires that

$$\text{const.} = \frac{1}{\int_A dx\, e^{-V/kT}} . \tag{8.3.19a}$$

Here, $\int_A$ means that the integral is evaluated in the vicinity of A. If the barrier is sufficiently high, contributions from regions more distant from the minimum are negligible.⁹ The boundary condition at B requires

$$0 = e^{-V_B/kT} \left( \text{const.} - \frac{j_0}{kT\Gamma} \int_A^B dx\, e^{V/kT} \right) , \tag{8.3.19b}$$

so that

$$j_0 = kT\Gamma\, \frac{\left( \int_A dx\, e^{-V(x)/kT} \right)^{-1}}{\int_A^B dx\, e^{V(x)/kT}} . \tag{8.3.20}$$

⁹ Inserting (8.3.17b) with (8.3.20) into (8.3.18), one obtains from the first term in the vicinity of point A just the equilibrium distribution, while the second term is negligible due to $\int_A^x dx\, e^{V/kT} \big/ \int_A^B dx\, e^{V/kT} \ll 1$.

For $V(x)$ in the vicinity of A, we set $V_A(x) \approx \frac{1}{2} (2\pi\nu)^2 x^2$, and, without loss of generality, take the zero point of the energy scale at the point A. We then find

$$\int_A dx\, e^{-V_A/kT} = \int_{-\infty}^{\infty} dx\, e^{-\frac{1}{2}(2\pi\nu)^2 x^2/kT} = \frac{\sqrt{kT}}{\sqrt{2\pi}\,\nu} .$$

Here, the integration was extended beyond the neighborhood of A out to $[-\infty, \infty]$, which is permissible owing to the rapid decrease of the integrand. The main contribution to the integral in the denominator of (8.3.20) comes

from the vicinity of the barrier, where we set $V(x) \approx \Delta - (2\pi\nu')^2 x^2/2$. Here, $\Delta$ is the height of the barrier and $\nu'^2$ characterizes the barrier's curvature:

$$\int_A^B dx\, e^{V/kT} \approx e^{\Delta/kT} \int_{-\infty}^{\infty} dx\, e^{-\frac{(2\pi\nu')^2 x^2}{2kT}} = e^{\Delta/kT}\, \frac{\sqrt{kT}}{\sqrt{2\pi}\,\nu'} .$$

This yields all together for the current density or the transition rate¹⁰

$$j_0 = 2\pi\nu\nu'\, \Gamma\, e^{-\Delta/kT} . \tag{8.3.21}$$

We point out some important aspects of the thermally activated transition rate: the decisive factor in this result is the Arrhenius dependence $e^{-\Delta/kT}$, where $\Delta$ denotes the barrier height, i.e. the activation energy. We can rewrite the prefactor by making the replacements $(2\pi\nu)^2 = m\omega^2$, $(2\pi\nu')^2 = m\omega'^2$ and $\Gamma = \frac{1}{m\zeta}$ (Eq. (8.1.24)):

$$j_0 = \frac{\omega\omega'}{2\pi\zeta}\, e^{-\Delta/kT} . \tag{8.3.21'}$$
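The quality of the Gaussian approximations leading from (8.3.20) to (8.3.21) can be judged by evaluating the integrals in (8.3.20) numerically. The sketch below does this for the double-well potential $V(x) = \Delta (x^2 - 1)^2$, which has its minimum A at $x = -1$, the barrier top at $x = 0$ and the minimum B at $x = +1$, so that $(2\pi\nu)^2 = V''(-1) = 8\Delta$ and $(2\pi\nu')^2 = -V''(0) = 4\Delta$. The potential, the integration limits and the function names are assumptions chosen for this illustration.

```python
import math

def integrate(g, a, b, n=20000):
    """Midpoint-rule estimate of int_a^b g(x) dx."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

def rates(delta_over_kT, gamma=1.0, kT=1.0):
    """Return (j0 from the integrals in (8.3.20), j0 from the Gaussian result (8.3.21)).

    Illustrative potential V(x) = Delta (x^2 - 1)^2 with Delta = delta_over_kT * kT.
    """
    delta = delta_over_kT * kT
    V = lambda x: delta * (x * x - 1.0)**2
    well = integrate(lambda x: math.exp(-V(x) / kT), -2.5, 0.0)      # vicinity of A
    barrier = integrate(lambda x: math.exp(V(x) / kT), -1.0, 1.0)    # from A to B
    j_integral = kT * gamma / (well * barrier)
    two_pi_nu = math.sqrt(8.0 * delta)       # (2 pi nu)^2  =  V''(-1) = 8 Delta
    two_pi_nu_b = math.sqrt(4.0 * delta)     # (2 pi nu')^2 = -V''(0)  = 4 Delta
    j_gauss = two_pi_nu * two_pi_nu_b / (2.0 * math.pi) * gamma * math.exp(-delta / kT)
    return j_integral, j_gauss

for beta_delta in (5.0, 10.0, 20.0):
    j_integral, j_gauss = rates(beta_delta)
    print(beta_delta, j_integral, j_gauss, j_integral / j_gauss)
```

The ratio in the last column approaches 1 as $\Delta/kT$ grows, which reflects the assumption of a sufficiently high barrier made in the derivation; the rate itself drops by the Arrhenius factor.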

If we assume that $\omega' \approx \omega$, then the prefactor is proportional to the square of the vibration frequency characterized by the potential well.¹¹

¹⁰ H. A. Kramers, Physica 7, 284 (1940).
¹¹ $\omega$ is the frequency (attempt frequency) with which the particle arrives at the right side of the potential well, from where it has the possibility (with however a small probability $\sim e^{-\Delta/kT}$) of overcoming the barrier.

8.3.3 Critical Dynamics

We have already pointed out in the introduction to Brownian motion that the theory developed to describe it has a considerably wider significance. Instead of the motion of a massive particle in a fluid of stochastically colliding molecules, one can consider quite different situations in which a small number of relatively slowly varying collective variables are interacting with many strongly varying, rapid degrees of freedom. The latter lead to a damping of the collective degrees of freedom. This situation occurs in the hydrodynamic region. Here, the collective degrees of freedom represent the densities of the conserved quantities. The typical time scales for these hydrodynamic degrees of freedom increase with decreasing $q$ proportionally to $1/q$ or $1/q^2$, where $q$ is the wavenumber. In comparison, in the range of small wavenumbers all the remaining degrees of freedom are very rapid and can be regarded as stochastic noise in the equations of motion of the conserved densities. This then leads to the typical form of the hydrodynamic equations with damping terms proportional to $q^2$ or, in real space, $\sim \nabla^2$. We emphasize that "hydrodynamics" is by no means limited to the domain of liquids or gases, but instead, in an extension of its original meaning, it includes the general dynamics of conserved quantities depending on the particular physical situation (dielectrics, ferromagnets, liquid crystals, etc.).

A further important field in which this type of separation of time scales occurs is the dynamics in the neighborhood of critical points. As we know from the sections on static critical phenomena, the correlations of the local order parameter become long-ranged. There is thus a fluctuating order within regions whose size is of the order of the correlation length. As these correlated regions grow, the characteristic time scale also increases. Therefore, the remaining degrees of freedom of the system can be regarded as rapidly varying. In ferromagnets, the order parameter is the magnetization. In its motions, the other degrees of freedom such as those of the electrons and lattice vibrations act as rapidly varying stochastic forces. In ferromagnets, the magnetic susceptibility behaves in the vicinity of the Curie point as

$$\chi \sim \frac{1}{T - T_c} \tag{8.3.22a}$$

and the correlation function of the magnetization as

$$G_{MM}(x) \sim \frac{e^{-|x|/\xi}}{|x|} . \tag{8.3.22b}$$

In the neighborhood of the critical point of the liquid-gas transition, the isothermal compressibility diverges as

$$\kappa_T \sim \frac{1}{T - T_c} \tag{8.3.22c}$$

and the density-density correlation function has the dependence

$$g_{\rho\rho}(x) \sim \frac{e^{-|x|/\xi}}{|x|} . \tag{8.3.22d}$$

In Eqns. (8.3.22b,d), $\xi$ denotes the correlation length, which behaves as $\xi \sim (T - T_c)^{-\frac{1}{2}}$ in the molecular field approximation, cf. Sects. 5.4 and 6.5. A general model-independent approach to the theory of critical phenomena begins with a continuum description of the free energy, the Ginzburg–Landau expansion (see Sect. 7.4.1):

$$\mathcal{F}[M] = \int d^d x \left[ \frac{a'}{2} (T - T_c) M^2 + \frac{b}{4} M^4 + \frac{c}{2} (\nabla M)^2 - M h \right] , \tag{8.3.23}$$

where $e^{-\mathcal{F}/kT}$ denotes the statistical weight of a configuration $M(x)$. The most probable configuration is given by

$$\frac{\delta \mathcal{F}}{\delta M} = 0 = a'(T - T_c) M - c\nabla^2 M + b M^3 - h . \tag{8.3.24}$$

It follows from this that the magnetization and the susceptibility in the limit $h \to 0$ are

$$M \sim (T_c - T)^{1/2}\, \Theta(T_c - T) \quad \text{and} \quad \chi \sim \frac{1}{T - T_c} .$$

Since the correlation length diverges on approaching the critical point, $\xi \to \infty$, the fluctuations also become slow. This suggests the following stochastic equation of motion for the magnetization¹²

$$\dot{M}(x, t) = -\lambda \frac{\delta \mathcal{F}}{\delta M(x, t)} + r(x, t) . \tag{8.3.25}$$

The first term in the equation of motion causes relaxation towards the minimum of the free-energy functional. This thermodynamic force becomes stronger as the gradient $\delta \mathcal{F}/\delta M(x)$ increases. The coefficient $\lambda$ characterizes the relaxation rate analogously to $\Gamma$ in the Smoluchowski equation. Finally, $r(x, t)$ is a stochastic force which is caused by the remaining degrees of freedom. Instead of a finite number of stochastic variables, we have here stochastic variables $M(x, t)$ and $r(x, t)$ which depend on a continuous index, the position $x$. Instead of $M(x)$, we can also introduce its Fourier transform

$$M_k = \int d^d x\, e^{-ikx} M(x) \tag{8.3.26}$$

and likewise for $r(x, t)$. Then the equation of motion (8.3.25) becomes

$$\dot{M}_k = -\lambda \frac{\partial \mathcal{F}}{\partial M_{-k}} + r_k(t) . \tag{8.3.25'}$$

Finally, we still have to specify the properties of the stochastic forces. Their average value is zero,

$$\langle r(x, t) \rangle = \langle r_k(t) \rangle = 0 ,$$

and furthermore they are correlated spatially and temporally only over short distances, which we can represent in idealized form by

$$\langle r_k(t)\, r_{k'}(t') \rangle = 2\lambda kT\, \delta_{k,-k'}\, \delta(t - t') \tag{8.3.27}$$

or

$$\langle r(x, t)\, r(x', t') \rangle = 2\lambda kT\, \delta(x - x')\, \delta(t - t') . \tag{8.3.27'}$$

For the mean square deviations of the force, we have postulated the Einstein relation, which guarantees that an equilibrium distribution is given by $e^{-\beta \mathcal{F}[M]}$. We also assume that the probability density for the stochastic forces $r(x, t)$ is a Gaussian distribution (cf. (8.1.26)). This has the result that the odd correlation functions for $r(x, t)$ vanish and the even ones factor into products of (8.3.27') (sum over all the pairwise contractions).

¹² Also called the TDGL = time-dependent Ginzburg–Landau model.

We will now investigate the equation of motion (8.3.25') for $T > T_c$. In what follows, we use the Gaussian approximation, i.e. we neglect the anharmonic terms; then the equation of motion simplifies to

$$\dot{M}_k = -\lambda \left( a'(T - T_c) + ck^2 \right) M_k + r_k . \tag{8.3.28}$$

Its solution is already familiar from the elementary theory of Brownian motion:

$$M_k(t) = e^{-\gamma_k t} M_k(0) + e^{-\gamma_k t} \int_0^t dt'\, r_k(t')\, e^{\gamma_k t'} , \tag{8.3.29}$$

as is the resulting correlation function

$$\langle M_k(t) M_{k'}(t') \rangle = e^{-\gamma_k |t - t'|}\, \frac{\lambda kT}{\gamma_k}\, \delta_{k,-k'} + O\!\left(e^{-\gamma_k (t + t')}\right) \tag{8.3.30}$$

or, for times $t, t' > \gamma_k^{-1}$,

$$\langle M_k(t) M_{k'}(t') \rangle = \delta_{k,-k'}\, \frac{kT\, e^{-\gamma_k |t - t'|}}{a'(T - T_c) + ck^2} . \tag{8.3.31}$$

Here, we have introduced the relaxation rate

$$\gamma_k = \lambda \left( a'(T - T_c) + ck^2 \right) . \tag{8.3.32a}$$

In particular, for $k = 0$ we find

$$\gamma_0 \sim (T - T_c) \sim \xi^{-2} . \tag{8.3.32b}$$
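To get a feeling for the numbers, the relaxation rate (8.3.32a) and the prefactor of the correlation function (8.3.31) can be tabulated for temperatures approaching $T_c$. In the sketch below all coefficients ($\lambda$, $a'$, $c$, $T_c$) are set to 1 and $kT$ is treated as an independent constant; these choices and the function names are assumptions made purely for illustration.

```python
def gamma_k(k, T, Tc=1.0, lam=1.0, a_prime=1.0, c=1.0):
    """Relaxation rate (8.3.32a) of the mode M_k in the Gaussian approximation."""
    return lam * (a_prime * (T - Tc) + c * k * k)

def equal_time_correlation(k, T, kT=1.0, Tc=1.0, a_prime=1.0, c=1.0):
    """Prefactor kT / (a'(T - Tc) + c k^2) of the correlation function (8.3.31)."""
    return kT / (a_prime * (T - Tc) + c * k * k)

for T in (2.0, 1.1, 1.01, 1.001):
    rate = gamma_k(0.0, T)
    print(f"T - Tc = {T - 1.0:.3f}  gamma_0 = {rate:.3f}  relaxation time = {1.0 / rate:.1f}")
```

Each reduction of $T - T_c$ by a factor of ten lengthens the relaxation time of the $k = 0$ mode by the same factor, and the prefactor of (8.3.31) equals $\lambda kT/\gamma_k$ from (8.3.30), as it must.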

As we suspected at the beginning, the relaxation rate decreases drastically on approaching the critical point. One denotes this situation as "critical slowing down". As we already know from Chap. 7, the interaction $bM^4$ between the critical fluctuations leads to a modification of the critical exponents, e.g. $\xi \to (T - T_c)^{-\nu}$. Likewise, in the framework of dynamic renormalization group theory it is seen that these interactions lead in the dynamics to

$$\gamma_0 \to \xi^{-z} \tag{8.3.33}$$

with a dynamic critical exponent $z$¹³ which differs from 2.

¹³ See e.g. F. Schwabl and U. C. Täuber, Encyclopedia of Applied Physics, Vol. 13, 343 (1995), VCH.

Remark: According to Eq. (8.3.25), the dynamics of the order parameter are relaxational. For isotropic ferromagnets, the magnetization is conserved and the coupled precessional motion of the magnetic moments leads to spin waves. In this case, the equations of motion are given by¹⁴

$$\dot{\mathbf{M}}(\mathbf{x}, t) = -\lambda\, \mathbf{M}(\mathbf{x}, t) \times \frac{\delta \mathcal{F}}{\delta \mathbf{M}}(\mathbf{x}, t) + \Gamma \nabla^2 \frac{\delta \mathcal{F}}{\delta \mathbf{M}}(\mathbf{x}, t) + \mathbf{r}(\mathbf{x}, t) , \tag{8.3.34}$$

with

$$\langle \mathbf{r}(\mathbf{x}, t) \rangle = 0 , \tag{8.3.35}$$

$$\langle r_i(\mathbf{x}, t)\, r_j(\mathbf{x}', t') \rangle = -2\Gamma kT\, \nabla^2 \delta^{(3)}(\mathbf{x} - \mathbf{x}')\, \delta(t - t')\, \delta_{ij} , \tag{8.3.36}$$

which leads to spin diffusion above the Curie temperature and to spin waves below it (cf. problem 8.9). The first term on the right-hand side of the equation of motion produces the precessional motion of the local magnetization $\mathbf{M}(\mathbf{x}, t)$ around the local field $\delta \mathcal{F}/\delta \mathbf{M}(\mathbf{x}, t)$ at the point $\mathbf{x}$. The second term gives rise to the damping. Since the magnetization is conserved, it is taken to be proportional to $\nabla^2$, i.e. in Fourier space it is proportional to $k^2$. These equations of motion are known as the Bloch equations or Landau–Lifshitz equations and, without the stochastic term, have been applied in solid-state physics since long before the advent of interest in critical dynamic phenomena. The stochastic force $\mathbf{r}(\mathbf{x}, t)$ is due to the remaining, rapidly fluctuating degrees of freedom. The functional of the free energy is

$$\mathcal{F}[\mathbf{M}(\mathbf{x}, t)] = \frac{1}{2} \int d^3x \left[ a'(T - T_c)\, \mathbf{M}^2(\mathbf{x}, t) + \frac{b}{2}\, \mathbf{M}^4(\mathbf{x}, t) + c\, (\nabla \mathbf{M}(\mathbf{x}, t))^2 - \mathbf{h}\mathbf{M}(\mathbf{x}, t) \right] . \tag{8.3.37}$$

¹⁴ S. Ma and G. F. Mazenko, Phys. Rev. B11, 4077 (1975).

∗8.3.4 The Smoluchowski Equation and Supersymmetric Quantum Mechanics

8.3.4.1 The Eigenvalue Equation

In order to bring the Smoluchowski equation (8.2.14) ($V' \equiv \partial V/\partial x \equiv -F$)

$$\frac{\partial P}{\partial t} = \Gamma \frac{\partial}{\partial x} \left( kT \frac{\partial}{\partial x} + V' \right) P \tag{8.3.38}$$

into a form which contains only the second derivative with respect to $x$, we apply the Ansatz

$$P(x, t) = e^{-V(x)/2kT} \rho(x, t) , \tag{8.3.39}$$

obtaining

$$\frac{\partial \rho}{\partial t} = kT\Gamma \left( \frac{\partial^2}{\partial x^2} + \frac{V''}{2kT} - \frac{V'^2}{4(kT)^2} \right) \rho . \tag{8.3.40}$$

This is a Schrödinger equation with an imaginary time

$$i \frac{\partial \rho}{\partial(-i 2kT\Gamma t)} = \left( -\frac{1}{2} \frac{\partial^2}{\partial x^2} + V^0(x) \right) \rho \tag{8.3.41}$$

with the potential

$$V^0(x) = \frac{1}{2} \left[ \frac{V'^2}{4(kT)^2} - \frac{V''}{2kT} \right] . \tag{8.3.42}$$

Following the separation of the variables

$$\rho(x, t) = e^{-2kT\Gamma E_n t} \varphi_n(x) , \tag{8.3.43}$$

we obtain from Eq. (8.3.40) the eigenvalue equation

$$\frac{1}{2} \varphi_n'' = \left( -E_n + V^0(x) \right) \varphi_n(x) . \tag{8.3.44}$$

Formally, equation (8.3.44) is identical with a time-independent Schrödinger equation.¹⁵ In (8.3.43) and (8.3.44), we have numbered the eigenfunctions and eigenvalues which follow from (8.3.44) with the index $n$. The ground state of (8.3.44) is given by

$$\varphi_0 = \mathcal{N} e^{-\frac{V}{2kT}} , \qquad E_0 = 0 , \tag{8.3.45}$$

where $\mathcal{N}$ is a normalization factor. Inserting in (8.3.39), we find for $P(x, t)$ the equilibrium distribution

$$P(x, t) = \mathcal{N} e^{-V(x)/kT} . \tag{8.3.45'}$$
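That $\varphi_0 \propto e^{-V/2kT}$ solves (8.3.44) with $E_0 = 0$ for any smooth potential can be confirmed numerically. The sketch below builds $V^0$ from (8.3.42) for a double-well potential and evaluates $\frac{1}{2}\varphi_0'' - V^0 \varphi_0$ with a central second difference. The potential $V(x) = (x^2 - 1)^2$, the value $kT = 0.5$ and the function names are assumptions made for this example.

```python
import math

kT = 0.5

def V(x):
    """Illustrative double-well potential (an assumption, not from the text)."""
    return (x * x - 1.0)**2

def dV(x):
    """First derivative V'(x)."""
    return 4.0 * x * (x * x - 1.0)

def d2V(x):
    """Second derivative V''(x)."""
    return 12.0 * x * x - 4.0

def V0(x):
    """Effective potential (8.3.42)."""
    return 0.5 * (dV(x)**2 / (4.0 * kT**2) - d2V(x) / (2.0 * kT))

def phi0(x):
    """Unnormalized ground state (8.3.45)."""
    return math.exp(-V(x) / (2.0 * kT))

def residual(x, h=1e-3):
    """(1/2) phi0'' - V0 phi0, which (8.3.44) requires to vanish for E_0 = 0."""
    second = (phi0(x + h) - 2.0 * phi0(x) + phi0(x - h)) / (h * h)
    return 0.5 * second - V0(x) * phi0(x)

worst = max(abs(residual(0.1 * i)) for i in range(-20, 21))
print("largest residual on [-2, 2]:", worst)
```

The residual is at the level of the finite-difference error, which shows that the zero eigenvalue is a structural property of $V^0$ and does not depend on the particular form of $V$.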

From (8.3.42), we can immediately see the connection with supersymmetric quantum mechanics. The supersymmetric partner16 to V 0 has the potential 2 V 1 V 1 . (8.3.46) + V = 2 4(kT )2 2kT 15 16

N. G. van Kampen, J. Stat. Phys. 17, 71 (1977). M. Bernstein and L. S. Brown, Phys. Rev. Lett. 52, 1933 (1984); F. Schwabl, QM I, Chap. 19, Springer 2005. The quantity Φ introduced there is connected to the ground state wavefunctions ϕ0 and the potential V as follows: Φ = −ϕ0 /ϕ0 = V /2kT .

8.3 Examples and Applications

431

Fig. 8.6. The excitation spectra of the two Hamiltonians H 0 and H 1 , from QM I, pp. 353 and 361

The excitation spectra of the two Hamiltonians H 0,1 = −

1 d2 + V 0,1 (x) 2 dx2

(8.3.47)

are related in the manner shown in Fig. 8.6. One can advantageously make use of this connection if the problem with H 1 is simpler to solve than that with H 0 . 8.3.4.2 Relaxation towards Equilibrium We can now solve the initial value problem for the Smoluchowski equation in general. Starting with an arbitrarily normalized initial distribution P (x), we can calculate ρ(x) and expand in the eigenfunctions of (8.3.44) ρ(x) = eV (x)/2kT P (x) = cn ϕn (x) , (8.3.48) n

with the expansion coeﬃcients cn = dx ϕ∗n (x)eV (x)/2kT P (x) .

(8.3.49)

From (8.3.43), we find the time dependence

$$\rho(x,t) = \sum_n e^{-2kT\Gamma E_n t}\, c_n\,\varphi_n(x)\,, \tag{8.3.50}$$

from which, with (8.3.39),

$$P(x,t) = e^{-V(x)/2kT} \sum_{n=0}^{\infty} c_n\, e^{-2kT\Gamma E_n t}\,\varphi_n(x) \tag{8.3.51}$$

follows. The normalized ground state has the form

$$\varphi_0 = \frac{e^{-V(x)/2kT}}{\left(\int dx\; e^{-V(x)/kT}\right)^{1/2}}\,. \tag{8.3.52}$$

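Therefore, the expansion coefficient $c_0$ is given by

$$c_0 = \int dx\; \varphi_0^*\, e^{V(x)/2kT} P(x) = \frac{\int dx\; P(x)}{\left(\int dx\; e^{-V(x)/kT}\right)^{1/2}} = \frac{1}{\left(\int dx\; e^{-V(x)/kT}\right)^{1/2}}\,. \tag{8.3.53}$$

This allows us to cast (8.3.51) in the form

$$P(x,t) = \frac{e^{-V(x)/kT}}{\int dx\; e^{-V(x)/kT}} + e^{-V(x)/2kT} \sum_{n=1}^{\infty} c_n\, e^{-2kT\Gamma E_n t}\,\varphi_n(x)\,. \tag{8.3.54}$$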

With this, the initial-value problem for the Smoluchowski equation is solved in general. Since $E_n > 0$ for $n \geq 1$, it follows from this expansion that

$$\lim_{t\to\infty} P(x,t) = \frac{e^{-V(x)/kT}}{\int dx\; e^{-V(x)/kT}}\,, \tag{8.3.55}$$

which means that, starting from an arbitrary initial distribution, P(x, t) develops at long times towards the equilibrium distribution (8.3.45′) or (8.3.55).
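The relaxation described by (8.3.54) and (8.3.55) can be illustrated by integrating the Smoluchowski equation directly. The sketch below is not the eigenfunction expansion of the text; it assumes a harmonic potential $V = a x^2/2$, illustrative parameter values, and a simple explicit finite-difference scheme with zero-flux boundaries. For this potential the mean obeys $\langle x\rangle(t) = x_0\,e^{-\Gamma a t}$, and the long-time variance is $kT/a$.

```python
import math

GAMMA, KT, A = 1.0, 1.0, 1.0   # assumed illustrative parameters
DX, DT = 0.1, 0.002            # grid spacing and time step (DT < DX**2 / (2*GAMMA*KT))
XS = [-6.0 + DX * i for i in range(121)]


def dV(x):
    # V'(x) for the assumed harmonic potential V = A x^2 / 2
    return A * x


def step(P):
    # One explicit Euler step of dP/dt = -dJ/dx with the probability current
    # J = -GAMMA * (KT dP/dx + V' P); J vanishes at both ends of the grid.
    n = len(P)
    J = [0.0] * (n + 1)
    for i in range(n - 1):
        x_mid = 0.5 * (XS[i] + XS[i + 1])
        J[i + 1] = -GAMMA * (KT * (P[i + 1] - P[i]) / DX
                             + dV(x_mid) * 0.5 * (P[i] + P[i + 1]))
    return [P[i] - DT * (J[i + 1] - J[i]) / DX for i in range(n)]


def evolve(P, duration):
    for _ in range(round(duration / DT)):
        P = step(P)
    return P


def moments(P):
    norm = sum(P) * DX
    mean = sum(x * p for x, p in zip(XS, P)) * DX / norm
    var = sum((x - mean) ** 2 * p for x, p in zip(XS, P)) * DX / norm
    return norm, mean, var


# Narrow Gaussian initial distribution centred at x0 = 2 (arbitrary choice)
X0, S0 = 2.0, 0.5
P0 = [math.exp(-(x - X0) ** 2 / (2 * S0**2)) / math.sqrt(2 * math.pi * S0**2)
      for x in XS]

P1 = evolve(P0, 1.0)        # distribution at t = 1
P_long = evolve(P1, 7.0)    # distribution at t = 8

norm1, mean1, var1 = moments(P1)
norm_long, mean_long, var_long = moments(P_long)
print(mean1, X0 * math.exp(-GAMMA * A * 1.0))  # numerical vs. exact mean at t = 1
print(var_long, KT / A)                        # long-time variance vs. kT / a
```

The normalization is conserved by construction, since the scheme is written in flux form. At long times the distribution no longer depends on the initial condition, in agreement with (8.3.55).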

Literature

A. Einstein, Ann. d. Physik 17, 182 (1905); reprinted in Annalen der Physik 14, Supplementary Issue (2005).

R. Becker, Theorie der Wärme, 3. Aufl., Springer Verlag, Heidelberg 1985, Chap. 7; R. Becker, Theory of Heat, 2nd Ed., Springer, Berlin, Heidelberg, New York 1967.

H. Risken, The Fokker–Planck Equation, Springer Verlag, Heidelberg, 1984.

N. G. van Kampen, Stochastic Processes in Physics and Chemistry, North Holland, Amsterdam, 1981.

Problems for Chapter 8

8.1 Derive the generalized Fokker–Planck equation, (8.2.18).

8.2 A particle is moving with the step length l along the x-axis. Within each time step it hops to the right with the probability $p_+$ and to the left with the probability $p_-$ ($p_+ + p_- = 1$). How far is it from the starting point on the average after t time steps if $p_+ = p_- = 1/2$, or if $p_+ = 3/4$ and $p_- = 1/4$?

8.3 Diffusion and Heat Conductivity
(a) Solve the diffusion equation $\dot n = D\Delta n$ for d = 1, 2 and 3 dimensions with the initial condition $n(\mathbf{x}, t = 0) = N\delta^d(\mathbf{x})$. Here, n is the particle density, N the particle number, and D is the diffusion constant.
(b) Another form of the diffusion equation is the heat conduction equation

$$\Delta T = \frac{c\rho}{\kappa}\,\frac{\partial T}{\partial t}\,,$$

where T is the temperature, κ the coefficient of thermal conductivity, c the specific heat, and ρ the density. Solve the following problem as an application: potatoes are stored at +5 °C in a broad trench which is covered with a loose layer of earth of thickness d. Right after they are covered, a cold period suddenly begins, with a steady temperature of −10 °C, and it lasts for two months. How thick does the earth layer have to be so that the potatoes will have cooled just to 0 °C at the end of the two months? Assume as an approximation that the same values hold for the earth and for the potatoes: κ = 0.4 W/(m·K), c = 2000 J/(kg·K), ρ = 1000 kg/m³.

8.4 Consider the Langevin equation of an overdamped harmonic oscillator

$$\dot x(t) = -\Gamma x(t) + h(t) + r(t)\,,$$

where h(t) is an external force and r(t) a stochastic force with the properties (8.1.25). Compute the correlation function

$$C(t,t') = \langle x(t)\,x(t')\rangle_{h=0}\,,$$

the response function

$$\chi(t,t') = \frac{\delta\langle x(t)\rangle}{\delta h(t')}\,,$$

and the Fourier transform of the response function.

8.5 Damped Oscillator
(a) Consider the damped harmonic oscillator

$$m\ddot x + m\zeta\dot x + m\omega_0^2 x = f(t)$$

with the stochastic force f(t) from Eq. (8.1.25). Calculate the correlation function and the dynamic susceptibility. Discuss in particular the position of the poles and the line shape. What changes relative to the limiting cases of the non-damped oscillator or the overdamped oscillator?
(b) Express the stationary solution $\langle x(t)\rangle$ under the action of a periodic external force $f_e(t) = f_0\cos\frac{2\pi}{T}t$ in terms of the dynamic susceptibility. Use it to compute the power dissipated, $\frac{1}{T}\int_0^T dt\; f_e(t)\,\langle\dot x(t)\rangle$.


8.6 Diverse physical systems can be described as a subsystem capable of oscillations that is coupled to a relaxing degree of freedom, whereby both systems are in contact with a heat bath (e.g. the propagation of sound waves in a medium in which chemical reactions are taking place, or the dynamics of phonons taking energy/heat diffusion into account). As a simple model, consider the following system of coupled equations:

$$\dot x = \frac{1}{m}\,p\,, \qquad \dot p = -m\omega_0^2 x - \Gamma p + b\,y + R(t)\,, \qquad \dot y = -\gamma y - \frac{b}{m}\,p + r(t)\,.$$

Here, x and p describe the vibrational degrees of freedom (with the eigenfrequency $\omega_0$), and y is the relaxational degree of freedom. The subsystems are mutually linearly coupled with their coupling strength determined by the parameter b. The coupling to the heat bath is accomplished by the stochastic forces R and r for each subsystem, with the usual properties (vanishing of the average values and the Einstein relations), and the associated damping coefficients Γ and γ.
(a) Calculate the dynamic susceptibility $\chi_x(\omega)$ for the vibrational degree of freedom.
(b) Discuss the expression obtained in the limiting case of γ → 0, i.e. when the relaxation time of the relaxing system is very long.

8.7 An example of an application of the overdamped Langevin equation is an electrical circuit consisting of a capacitor of capacity C and a resistor of resistance R which is at the temperature T. The voltage drop $U_R$ over the resistor depends on the current I via $U_R = RI$, and the voltage $U_C$ over the capacitor is related to the capacitor charge Q via $U_C = Q/C$. On the average, the sum of the two voltages is zero, $U_R + U_C = 0$. In fact, the current results from the motion of many electrons, and collisions with the lattice ions and with phonons cause fluctuations which are modeled by a noise term $V_{th}$ in the voltage balance ($I = \dot Q$)

$$R\dot Q + \frac{1}{C}\,Q = V_{th}$$

or

$$\dot U_C + \frac{1}{RC}\,U_C = \frac{1}{RC}\,V_{th}\,.$$

(a) Assume the Einstein relation for the stochastic force and calculate the spectral distribution of the voltage fluctuations

$$\phi(\omega) = \int_{-\infty}^{\infty} dt\; e^{i\omega t}\,\langle U_C(t)\,U_C(0)\rangle\,.$$

(b) Compute

$$\langle U_C^2\rangle \equiv \langle U_C(t)\,U_C(t)\rangle \equiv \int_{-\infty}^{\infty} d\omega\; \phi(\omega)$$

and interpret the result,

$$\frac{1}{2}\,C\,\langle U_C^2\rangle = \frac{1}{2}\,kT\,.$$


8.8 In a generalization of problem 8.7, let the circuit now contain also a coil or inductor of self-inductance L with a voltage drop $U_L = L\dot I$. The equation of motion for the charge on the capacitor is

$$L\ddot Q + R\dot Q + \frac{1}{C}\,Q = V_{th}\,.$$

By again assuming the Einstein relation for the noise voltage $V_{th}$, calculate the spectral distribution for the current, $\int_{-\infty}^{\infty} dt\; e^{i\omega t}\,\langle I(t)\,I(0)\rangle$.

8.9 Starting from the equations of motion for an isotropic ferromagnet (Eq. 8.3.34), investigate the ferromagnetic phase, in which

$$\mathbf{M}(\mathbf{x},t) = \hat{\mathbf{e}}_z M_0 + \delta\mathbf{M}(\mathbf{x},t)$$

holds.
(a) Linearize the equations of motion in $\delta\mathbf{M}(\mathbf{x},t)$, and determine the transverse and longitudinal excitations relative to the z-direction.
(b) Calculate the dynamic susceptibility

$$\chi_{ij}(\mathbf{k},\omega) = \int d^3x\, dt\; e^{-i(\mathbf{k}\mathbf{x}-\omega t)}\, \frac{\partial\langle M_i(\mathbf{x},t)\rangle}{\partial h_j(0,0)}$$

and the correlation function

$$G_{ij}(\mathbf{k},\omega) = \int d^3x\, dt\; e^{-i(\mathbf{k}\mathbf{x}-\omega t)}\, \langle\delta M_i(\mathbf{x},t)\,\delta M_j(0,0)\rangle\,.$$

8.10 Solve the Smoluchowski equation

$$\frac{\partial P(x,t)}{\partial t} = \Gamma\,\frac{\partial}{\partial x}\left(kT\,\frac{\partial}{\partial x} + \frac{\partial V(x)}{\partial x}\right)P(x,t)$$

for a harmonic potential and an inverted harmonic potential, $V(x) = \pm\frac{m\omega^2}{2}x^2$, by solving the corresponding eigenvalue problem.

8.11 Justify the Ansatz of Eq. (8.3.39) and carry out the rearrangement to give Eq. (8.3.40).

8.12 Solve the Smoluchowski equation for the model potential $V(x) = 2kT\log(\cosh x)$ using supersymmetric quantum mechanics, by transforming as in Chapter 8.3.4 to a Schrödinger equation. (Literature: F. Schwabl, Quantum Mechanics, 3rd ed., Chap. 19, Springer Verlag, Heidelberg, New York, corrected printing 2005.)

8.13 Stock-market prices as a stochastic process. Assume that the logarithm $l(t) = \log S(t)$ of the price S(t) of a stock obeys the Langevin equation (on a sufficiently rough time scale)

$$\frac{d}{dt}\,l(t) = r + \Gamma(t)\,,$$

where r is a constant and Γ is a Gaussian "random force" with $\langle\Gamma(t)\Gamma(t')\rangle = \sigma^2\delta(t-t')$.


(a) Explain this approach. Hints: What does the assumption that prices in the future cannot be predicted from the price trends in the past imply? Think first of a process which is discrete in time (e.g. the time dependence of the daily closing rates). Should the transition probability more correctly be a function of the price difference or of the price ratio?
(b) Write the Fokker–Planck equation for l, and based on it, the equation for S.
(c) What is the expectation value for the market price at the time t, when the stock is being traded at the price $S_0$ at time $t_0 = 0$? Hint: Solve the Fokker–Planck equation for l = log S.

9. The Boltzmann Equation

9.1 Introduction

In the Langevin equation (Chap. 8), irreversibility was introduced phenomenologically through a damping term. Kinetic theories have the goal of explaining and quantitatively calculating transport processes and dissipative effects due to scattering of the atoms (or in a solid, of the quasiparticles). The object of these theories is the single-particle distribution function, whose time development is determined by the kinetic equation.

In this chapter, we will deal with a monatomic classical gas consisting of particles of mass m; we thus presume that the thermal wavelength $\lambda_T = 2\pi\hbar/\sqrt{2\pi mkT}$ and the volume per particle $v = n^{-1}$ obey the inequality $\lambda_T \ll n^{-1/3}$, i.e. the wavepackets are so strongly localized that the atoms can be treated classically. Further characteristic quantities which enter include the duration of a collision $\tau_c$ and the collision time τ (this is the mean time between two collisions of an atom; see (9.2.12)). We have $\tau_c \approx r_c/\bar v$ and $\tau \approx 1/(n r_c^2 \bar v)$, where $r_c$ is the range of the potentials and $\bar v$ is the average velocity of the particles. In order to be able to consider independent two-particle collisions, we need the additional condition $\tau_c \ll \tau$, i.e. the duration of a collision is short in comparison to the collision time. This condition is fulfilled in the low-density limit, $r_c \ll n^{-1/3}$. Then collisions of more than two particles can be neglected.

The kinetic equation which describes the case of a dilute gas considered here is the Boltzmann equation¹. The Boltzmann equation is one of the most fundamental equations of non-equilibrium statistical mechanics and is applied in areas far beyond the case of the dilute gas².

¹ Ludwig Boltzmann, Wien. Ber. 66, 275 (1872); Vorlesungen über Gastheorie, Leipzig, 1896; Lectures on Gas Theory, translated by S. Brush, University of California Press, Berkeley, 1964.
² See e.g. J. M. Ziman, Principles of the Theory of Solids, 2nd ed., Cambridge Univ. Press, Cambridge, 1972.


In this chapter we will introduce the Boltzmann equation using the classical derivation of Boltzmann1 . Next, we discuss some fundamental questions relating to irreversibility based on the H theorem. As an application of the Boltzmann equation we then determine the hydrodynamic equations and their eigenmodes (sound, heat diﬀusion). The transport coeﬃcients are derived systematically from the linearized Boltzmann equation using its eigenmodes and eigenfrequencies.
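The classical condition $\lambda_T \ll n^{-1/3}$ stated in the introduction is easily checked for a concrete gas. The following sketch assumes argon (atomic mass 39.948 u) at 273.15 K and the number density $3\times 10^{19}\,\mathrm{cm^{-3}}$ quoted in Sect. 9.2; the choice of gas and the function name are illustrative assumptions, and the physical constants are the SI defining values.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant h = 2*pi*hbar, in J s
K_B = 1.380649e-23          # Boltzmann constant, in J/K
AMU = 1.66053906660e-27     # atomic mass unit, in kg


def thermal_wavelength(mass_kg, temperature_k):
    # lambda_T = 2*pi*hbar / sqrt(2*pi*m*k*T)
    return H_PLANCK / math.sqrt(2.0 * math.pi * mass_kg * K_B * temperature_k)


mass = 39.948 * AMU          # argon atom (assumed example gas)
temperature = 273.15         # K
density = 3e25               # m^-3, i.e. 3e19 per cm^3

lam = thermal_wavelength(mass, temperature)
spacing = density ** (-1.0 / 3.0)   # mean interparticle distance n^(-1/3)
ratio = lam / spacing

print(lam, spacing, ratio)
```

For these values the thermal wavelength is more than two orders of magnitude smaller than the mean interparticle distance, so the classical treatment is justified.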

9.2 Derivation of the Boltzmann Equation

We presume that only one species of atoms is present. For these atoms, we seek the equation of motion of the single-particle distribution function.

Definition: The single-particle distribution function $f(\mathbf{x},\mathbf{v},t)$ is defined by $f(\mathbf{x},\mathbf{v},t)\,d^3x\,d^3v$ = the number of particles which are found at time t in the volume element $d^3x$ around the point $\mathbf{x}$ and $d^3v$ around the velocity $\mathbf{v}$.

$$\int d^3x\, d^3v\; f(\mathbf{x},\mathbf{v},t) = N\,. \tag{9.2.1}$$

The single-particle distribution function $f(\mathbf{x},\mathbf{v},t)$ is related to the N-particle distribution function $\rho(\mathbf{x}_1,\mathbf{v}_1,\ldots,\mathbf{x}_N,\mathbf{v}_N,t)$ (Eq. (2.3.1)) through

$$f(\mathbf{x}_1,\mathbf{v}_1,t) = N\int d^3x_2\,d^3v_2 \ldots d^3x_N\,d^3v_N\; \rho(\mathbf{x}_1,\mathbf{v}_1,\ldots,\mathbf{x}_N,\mathbf{v}_N,t)\,.$$

Remarks:
1. In the kinetic theory, one usually takes the velocity as variable instead of the momentum, $\mathbf{v} = \mathbf{p}/m$.
2. The 6-dimensional space generated by $\mathbf{x}$ and $\mathbf{v}$ is called µ space.
3. The volume elements $d^3x$ and $d^3v$ are supposed to be of small linear dimensions compared to the macroscopic scale or to the mean velocity $\bar v = \sqrt{kT/m}$, but large compared to the microscopic scale, so that many particles are to be found within each element. In a gas under standard conditions (T = 1 °C, P = 1 atm), the number of molecules per cm³ is $n = 3\times 10^{19}$. In a cube of edge length $10^{-3}$ cm, i.e. a volume element of the size $d^3x = 10^{-9}\,\mathrm{cm^3}$, which for all experimental purposes can be considered to be pointlike, there are still $3\times 10^{10}$ molecules. If we choose $d^3v \approx 10^{-6}\times\bar v^3$, then from the Maxwell distribution

$$f^0(\mathbf{v}) = n\left(\frac{m}{2\pi kT}\right)^{3/2} e^{-\frac{m\mathbf{v}^2}{2kT}}\,,$$

in this element of µ space, there are $f^0\,d^3x\,d^3v \approx 10^4$ molecules.

To derive the Boltzmann equation, we follow the motion of a volume element in µ space during the time interval [t, t + dt]; cf. Fig. 9.1. Since those particles with a higher velocity move more rapidly, the volume element is deformed in the course of time. However, the consideration of the sizes of the two parallelepipeds³ yields

$$d^3x'\,d^3v' = d^3x\,d^3v\,. \tag{9.2.2}$$

Fig. 9.1. Deformation of a volume element in µ space during the time interval dt.

The number of particles at the time t in $d^3x\,d^3v$ is $f(\mathbf{x},\mathbf{v},t)\,d^3x\,d^3v$, and the number of particles in the volume element which develops after the time interval dt is $f(\mathbf{x}+\mathbf{v}\,dt, \mathbf{v}+\frac{1}{m}\mathbf{F}\,dt, t+dt)\,d^3x'\,d^3v'$. If the gas atoms were collision-free, these two numbers would be the same. A change in these particle numbers can only occur through collisions. We thus obtain

$$\left[f\!\left(\mathbf{x}+\mathbf{v}\,dt, \mathbf{v}+\tfrac{1}{m}\mathbf{F}\,dt, t+dt\right) - f(\mathbf{x},\mathbf{v},t)\right] d^3x\,d^3v = \left(\frac{\partial f}{\partial t}\right)_{\rm coll} dt\; d^3x\,d^3v\,, \tag{9.2.3}$$

i.e. the change in the particle number is equal to its change due to collisions. The expansion of this balance equation yields

$$\left[\frac{\partial}{\partial t} + \mathbf{v}\nabla_x + \frac{1}{m}\mathbf{F}(\mathbf{x})\nabla_v\right] f(\mathbf{x},\mathbf{v},t) = \left(\frac{\partial f}{\partial t}\right)_{\rm coll}. \tag{9.2.4}$$

The left side of this equation is termed the flow term⁴. The collision term $\left(\frac{\partial f}{\partial t}\right)_{\rm coll}$ can be represented as the difference of gain and loss processes:

$$\left(\frac{\partial f}{\partial t}\right)_{\rm coll} = g - l\,. \tag{9.2.5}$$

³ The result obtained here from geometric considerations can also be derived by using Liouville's theorem (L. D. Landau and E. M. Lifshitz, Course of Theoretical Physics, Vol. I: Mechanics, Pergamon Press, Oxford 1960, Eq. (4.6.5)).
⁴ In Remark (i), the flow term is derived in a different way.

Here, $g\,d^3x\,d^3v\,dt$ is the number of particles which are scattered during the time interval dt into the volume $d^3x\,d^3v$ by collisions, and $l\,d^3x\,d^3v\,dt$ is the number which are scattered out, i.e. the number of collisions in the volume element $d^3x$ in which one of the two collision partners had the velocity $\mathbf{v}$ before the collision. We assume here that the volume element $d^3v$ is so small in velocity space that every collision leads out of this volume element. The following expression for the collision term is Boltzmann's celebrated Stosszahlansatz (assumption regarding the number of collisions):

$$\left(\frac{\partial f}{\partial t}\right)_{\rm coll} = \int d^3v_2\, d^3v_3\, d^3v_4\; W(\mathbf{v},\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4)\left[f(\mathbf{x},\mathbf{v}_3,t)f(\mathbf{x},\mathbf{v}_4,t) - f(\mathbf{x},\mathbf{v},t)f(\mathbf{x},\mathbf{v}_2,t)\right]. \tag{9.2.6}$$

Fig. 9.2. Gain and loss processes, g and l

Here, $W(\mathbf{v},\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4)$ refers to the transition probability $\mathbf{v},\mathbf{v}_2 \to \mathbf{v}_3,\mathbf{v}_4$, i.e. the probability that in a collision two particles with the velocities $\mathbf{v}$ and $\mathbf{v}_2$ will have the velocities $\mathbf{v}_3$ and $\mathbf{v}_4$ afterwards. The number of collisions which lead out of the volume element considered is proportional to the number of particles with the velocity $\mathbf{v}$ and the number of particles with velocity $\mathbf{v}_2$, and proportional to $W(\mathbf{v},\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4)$; a sum is carried out over all values of $\mathbf{v}_2$ and of the final velocities $\mathbf{v}_3$ and $\mathbf{v}_4$. The number of collisions in which an additional particle is in the volume element $d^3v$ after the collision is given by the number of particles with the velocities $\mathbf{v}_3$ and $\mathbf{v}_4$ whose collision yields a particle with the velocity $\mathbf{v}$. Here, the transition probability $W(\mathbf{v}_3,\mathbf{v}_4;\mathbf{v},\mathbf{v}_2)$ has been expressed with the help of (9.2.8e). The Stosszahlansatz (9.2.6), together with the balance equation (9.2.4), yields the Boltzmann equation

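$$\left[\frac{\partial}{\partial t} + \mathbf{v}\nabla_x + \frac{1}{m}\mathbf{F}(\mathbf{x})\nabla_v\right] f(\mathbf{x},\mathbf{v},t) = \int d^3v_2\, d^3v_3\, d^3v_4\; W(\mathbf{v},\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4)\left[f(\mathbf{x},\mathbf{v}_3,t)f(\mathbf{x},\mathbf{v}_4,t) - f(\mathbf{x},\mathbf{v},t)f(\mathbf{x},\mathbf{v}_2,t)\right]. \tag{9.2.7}$$

It is a nonlinear integro-differential equation. The transition probability W has the following symmetry properties:

• Invariance under particle exchange:

$$W(\mathbf{v},\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4) = W(\mathbf{v}_2,\mathbf{v};\mathbf{v}_4,\mathbf{v}_3)\,. \tag{9.2.8a}$$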


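• Rotational and reflection invariance: with an orthogonal matrix D we have

$$W(D\mathbf{v},D\mathbf{v}_2;D\mathbf{v}_3,D\mathbf{v}_4) = W(\mathbf{v},\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4)\,. \tag{9.2.8b}$$

This relation contains also inversion symmetry:

$$W(-\mathbf{v},-\mathbf{v}_2;-\mathbf{v}_3,-\mathbf{v}_4) = W(\mathbf{v},\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4)\,. \tag{9.2.8c}$$

• Time-reversal invariance:

$$W(\mathbf{v},\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4) = W(-\mathbf{v}_3,-\mathbf{v}_4;-\mathbf{v},-\mathbf{v}_2)\,. \tag{9.2.8d}$$

The combination of inversion and time reversal yields the relation which we have already used in (9.2.6) for $\left(\frac{\partial f}{\partial t}\right)_{\rm coll}$:

$$W(\mathbf{v}_3,\mathbf{v}_4;\mathbf{v},\mathbf{v}_2) = W(\mathbf{v},\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4)\,. \tag{9.2.8e}$$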

From the conservation of momentum and energy, it follows that

$$W(\mathbf{v}_1,\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4) = \sigma(\mathbf{v}_1,\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4)\,\delta^{(3)}(\mathbf{p}_1+\mathbf{p}_2-\mathbf{p}_3-\mathbf{p}_4)\;\delta\!\left(\frac{\mathbf{p}_1^2}{2m}+\frac{\mathbf{p}_2^2}{2m}-\frac{\mathbf{p}_3^2}{2m}-\frac{\mathbf{p}_4^2}{2m}\right), \tag{9.2.8f}$$

as one can see explicitly from the microscopic calculation of the two-particle collision in Eq. (9.5.21). The form of the scattering cross-section σ depends on the interaction potential between the particles. For all the general, fundamental results of the Boltzmann equation, the exact form of σ is not important. As an explicit example, we calculate σ for the interaction potential of hard spheres (Eq. (9.5.15)) and for a potential which falls off algebraically (problem 9.15, Eq. (9.5.29)).

To simplify the notation, in the following we shall frequently use the abbreviations

$$f_1 \equiv f(\mathbf{x},\mathbf{v}_1,t) \ \text{with}\ \mathbf{v}_1 = \mathbf{v}\,, \quad f_2 \equiv f(\mathbf{x},\mathbf{v}_2,t)\,, \quad f_3 \equiv f(\mathbf{x},\mathbf{v}_3,t)\,, \quad \text{and} \quad f_4 \equiv f(\mathbf{x},\mathbf{v}_4,t)\,. \tag{9.2.9}$$
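The delta functions in (9.2.8f) restrict the final velocities to those allowed by momentum and energy conservation. As a small illustration, the sketch below uses the standard kinematics of an elastic collision between two particles of equal mass, parametrized by a unit vector along the line of centres. The helper names are hypothetical, and nothing is assumed about the cross-section σ.

```python
import math
import random


def collide(v1, v2, n_hat):
    # Elastic collision of two equal-mass particles: the component of the
    # relative velocity along the unit vector n_hat is exchanged.
    g_n = sum((a - b) * c for a, b, c in zip(v1, v2, n_hat))
    v3 = tuple(a - g_n * c for a, c in zip(v1, n_hat))
    v4 = tuple(b + g_n * c for b, c in zip(v2, n_hat))
    return v3, v4


def total_momentum(va, vb, m=1.0):
    return tuple(m * (a + b) for a, b in zip(va, vb))


def total_energy(va, vb, m=1.0):
    return 0.5 * m * (sum(a * a for a in va) + sum(b * b for b in vb))


def random_unit_vector(rng):
    x, y, z = (rng.gauss(0.0, 1.0) for _ in range(3))
    norm = math.sqrt(x * x + y * y + z * z)
    return (x / norm, y / norm, z / norm)


rng = random.Random(0)   # fixed seed, so that the run is reproducible
v1 = tuple(rng.gauss(0.0, 1.0) for _ in range(3))
v2 = tuple(rng.gauss(0.0, 1.0) for _ in range(3))
n_hat = random_unit_vector(rng)

v3, v4 = collide(v1, v2, n_hat)
back1, back2 = collide(v3, v4, n_hat)   # inverse collision v3, v4 -> v1, v2

print(total_momentum(v1, v2), total_momentum(v3, v4))
print(total_energy(v1, v2), total_energy(v3, v4))
```

Applying the same map to the final velocities returns the initial ones, which is the kinematic counterpart of the relation (9.2.8e) between a collision and its inverse.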

Remarks:

(i) The flow term in the Boltzmann equation can also be derived by setting up an equation of continuity for the fictitious case of collision-free, non-interacting gas atoms. To do this, we introduce the six-dimensional velocity vector

$$\mathbf{w} = (\mathbf{v}, \dot{\mathbf{v}}) = \left(\dot{\mathbf{x}}, \frac{\mathbf{F}}{m}\right) \tag{9.2.10}$$

and the current density $\mathbf{w}f(\mathbf{x},\mathbf{v},t)$. For a collision-free gas, f fulfills the equation of continuity

$$\frac{\partial f}{\partial t} + \operatorname{div}\,\mathbf{w}f = 0\,. \tag{9.2.11}$$

Using Hamilton's equations of motion, Eq. (9.2.11) takes on the form

$$\left[\frac{\partial}{\partial t} + \mathbf{v}\nabla_x + \frac{1}{m}\mathbf{F}(\mathbf{x})\nabla_v\right] f(\mathbf{x},\mathbf{v},t) = 0 \tag{9.2.11′}$$

of the flow term in Eqns. (9.2.4) and (9.2.7).

(ii) With a collision term of the form (9.2.6), the presence of correlations between two particles has been neglected. It is assumed that at each instant the number of particles with velocities $\mathbf{v}_3$ and $\mathbf{v}_4$, or $\mathbf{v}$ and $\mathbf{v}_2$, is uncorrelated, an assumption which is also referred to as molecular chaos. A statistical element is introduced here. As a justification, one can say that in a gas of low density, a binary collision between two molecules which had already interacted either directly or indirectly through a common set of molecules is extremely improbable. In fact, molecules which collide come from quite different places within the gas and previously underwent collisions with completely different molecules, and are thus quite uncorrelated. The assumption of molecular chaos is required only for the particles before a collision. After a collision, the two particles are correlated (they move apart in such a manner that if all motions were reversed, they would again collide); however, this does not enter into the equation.

It is possible to derive the Boltzmann equation approximately from the Liouville equation. To this end, one derives from the latter the equations of motion for the single-, two-, etc. -particle distribution functions. The structure of these equations, which is also called the BBGKY (Bogoliubov, Born, Green, Kirkwood, Yvon) hierarchy, is such that the equation of motion for the r-particle distribution function (r = 1, 2, . . .) contains in addition also the (r + 1)-particle distribution function⁵. In particular, the equation of motion for the single-particle distribution function $f(\mathbf{x},\mathbf{v},t)$ has the form of the left side of the Boltzmann equation. The right side however contains $f_2$, the two-particle distribution function, and thus includes correlations between the particles. Only by an approximate treatment, i.e. by truncating the equation of motion for $f_2$ itself, does one obtain an expression which is identical with the collision term of the Boltzmann equation⁶.

It should be mentioned that terms beyond those in the Boltzmann equation lead to phenomena which do not exhibit the usual exponential decay in their relaxation behavior, but instead show a much slower, algebraic behavior; these time dependences are called "long time tails". Considered microscopically, they result from so-called ring collisions; see the reference by J. A. McLennan at the end of this chapter. Quantitatively, these effects are in reality immeasurably small; up to now, they have been observed only in computer experiments. In this sense, they have a similar fate to the deviations from exponential decay of excited quantum levels which occur in quantum mechanics.

(iii) To calculate the collision time τ, we imagine a cylinder whose length is equal to the distance which a particle with thermal velocity travels in unit time, and whose basal area is equal to the total scattering cross-section. An atom with a thermal velocity passes through this cylinder in a unit time and collides with all the other atoms within the cylinder. The number of atoms within the cylinder and thus the number of collisions of an atom per second is $\sigma_{\rm tot}\bar v n$, and it follows that the mean collision time is

$$\tau = \frac{1}{\sigma_{\rm tot}\,\bar v\, n}\,. \tag{9.2.12}$$

The mean free path l is defined as the distance which an atom typically travels between two successive collisions; it is given by

$$l \equiv \bar v\tau = \frac{1}{\sigma_{\rm tot}\, n}\,. \tag{9.2.13}$$

⁵ The r-particle distribution function is obtained from the N-particle distribution function by means of $f_r(\mathbf{x}_1,\mathbf{v}_1,\ldots,\mathbf{x}_r,\mathbf{v}_r,t) \equiv \frac{N!}{(N-r)!}\int d^3x_{r+1}\,d^3v_{r+1}\ldots d^3x_N\,d^3v_N\;\rho(\mathbf{x}_1,\mathbf{v}_1,\ldots,\mathbf{x}_N,\mathbf{v}_N,t)$. The combinatorial factor results from the fact that it is not important which of the particles is at the µ-space positions $\mathbf{x}_1,\mathbf{v}_1,\ldots$.
⁶ See references at the end of this chapter, e.g. K. Huang, S. Harris.

(iv) Estimates of the lengths and times which play a role in setting up the Boltzmann equation: the range $r_c$ of the potential must be so short that collisions occur between only those molecules which are within the same volume element $d^3x$: $r_c \ll dx$. This inequality is obeyed for the numerical example $r_c \approx 10^{-8}$ cm, $dx = 10^{-3}$ cm. With $\bar v \approx 10^5$ cm/sec, we obtain for the time during which the particle is within $d^3x$ the value

$$\tau_{d^3x} \approx \frac{10^{-3}\,\mathrm{cm}}{10^{5}\,\mathrm{cm/sec}} \approx 10^{-8}\,\mathrm{sec}\,.$$

The duration of a collision is

$$\tau_c \approx \frac{10^{-8}\,\mathrm{cm}}{10^{5}\,\mathrm{cm/sec}} \approx 10^{-13}\,\mathrm{sec}\,,$$

the collision time

$$\tau \approx (r_c^2\, n\,\bar v)^{-1} \approx (10^{-16}\,\mathrm{cm^2} \times 3\times 10^{19}\,\mathrm{cm^{-3}} \times 10^{5}\,\mathrm{cm\,sec^{-1}})^{-1} \approx 3\times 10^{-9}\,\mathrm{sec}\,.$$
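The estimates of Remark (iv), together with the mean free path (9.2.13), can be reproduced with a few lines of arithmetic. The sketch below uses the numerical values quoted above and, as in the text, approximates the total cross-section by $r_c^2$; the variable names are illustrative.

```python
# Order-of-magnitude inputs from Remark (iv), in CGS units
r_c = 1e-8       # range of the potential, cm
dx = 1e-3        # linear size of the volume element, cm
v_bar = 1e5      # mean velocity, cm/s
n = 3e19         # number density, cm^-3

sigma_tot = r_c**2                      # crude estimate of the cross-section

tau_cell = dx / v_bar                   # time spent inside d^3x
tau_c = r_c / v_bar                     # duration of a collision
tau = 1.0 / (sigma_tot * v_bar * n)     # collision time, Eq. (9.2.12)
mean_free_path = v_bar * tau            # Eq. (9.2.13)

print(tau_cell, tau_c, tau, mean_free_path)
```

The hierarchy $\tau_c \ll \tau$ required for independent two-particle collisions is satisfied by about four orders of magnitude, and the mean free path comes out at a few times $10^{-4}$ cm.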

9.3 Consequences of the Boltzmann Equation

9.3.1 The H-Theorem⁷ and Irreversibility

The goal of this section is to show that the Boltzmann equation shows irreversible behavior, and the distribution function tends towards the Maxwell distribution. To do this, Boltzmann introduced the quantity H, which is related to the negative of the entropy:

$$H(\mathbf{x},t) = \int d^3v\; f(\mathbf{x},\mathbf{v},t)\,\log f(\mathbf{x},\mathbf{v},t)\,. \tag{9.3.1}$$

For its time derivative, one obtains from the Boltzmann equation (9.2.7)

$$\dot H(\mathbf{x},t) = \int d^3v\,(1+\log f)\,\dot f = -\int d^3v\,(1+\log f)\left[\mathbf{v}\nabla_x + \frac{1}{m}\mathbf{F}\nabla_v\right] f - I = -\nabla_x\int d^3v\,(f\log f)\,\mathbf{v} - I\,. \tag{9.3.2}$$

The second term in the large brackets in the second expression is proportional to $\int d^3v\,\nabla_v(f\log f)$ and vanishes, since there are no particles with infinite velocities, i.e. f → 0 for v → ∞.

⁷ Occasionally, the rumor makes the rounds that according to Boltzmann, this should actually be called the Eta-Theorem. In fact, Boltzmann himself (1872) used E (for entropy), and only later (S. H. Burbury, 1890) was the Roman letter H adopted (D. Flamm, private communication, and S. G. Brush, Kinetic Theory, Vol. 2, p. 6, Pergamon Press, Oxford, 1966).

The contribution of the collision term

$$I = \int d^3v_1\,d^3v_2\,d^3v_3\,d^3v_4\; W(\mathbf{v}_1,\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4)\,(f_1f_2 - f_3f_4)\,(1+\log f_1) \tag{9.3.3}$$

is found by making use of the invariance of W with respect to the exchanges 1, 3 ↔ 2, 4 and 1, 2 ↔ 3, 4 to be

$$I = \frac{1}{4}\int d^3v_1\,d^3v_2\,d^3v_3\,d^3v_4\; W(\mathbf{v}_1,\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4)\,(f_1f_2 - f_3f_4)\,\log\frac{f_1f_2}{f_3f_4}\,. \tag{9.3.4}$$

The rearrangement which leads from (9.3.3) to (9.3.4) is a special case of the general identity

$$\int d^3v_1\,d^3v_2\,d^3v_3\,d^3v_4\; W(\mathbf{v}_1,\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4)\,(f_1f_2 - f_3f_4)\,\varphi_1 = \frac{1}{4}\int d^3v_1\,d^3v_2\,d^3v_3\,d^3v_4\; W(\mathbf{v}_1,\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4)\,(f_1f_2 - f_3f_4)\,(\varphi_1+\varphi_2-\varphi_3-\varphi_4)\,, \tag{9.3.5}$$

which follows from the symmetry relations (9.2.8), and where $\varphi_i = \varphi(\mathbf{x},\mathbf{v}_i,t)$ (problem 9.1). From the inequality $(x-y)\log\frac{x}{y} \geq 0$, it follows that

$$I \geq 0\,. \tag{9.3.6}$$

The time derivative of H, Eq. (9.3.2), can be written in the form

$$\dot H(\mathbf{x},t) = -\nabla_x\,\mathbf{j}_H(\mathbf{x},t) - I\,, \tag{9.3.7}$$

where

$$\mathbf{j}_H = \int d^3v\; f\log f\;\mathbf{v} \tag{9.3.8}$$

is the associated current density. The first term on the right-hand side of (9.3.7) gives the change in H due to the entropy flow and the second gives its change due to entropy production.

Discussion:

a) If no external forces are present, $\mathbf{F}(\mathbf{x}) = 0$, then the simplified situation may occur that $f(\mathbf{x},\mathbf{v},t) = f(\mathbf{v},t)$ is independent of $\mathbf{x}$. Since the Boltzmann equation then contains no $\mathbf{x}$-dependence, f remains independent of position for all times and it follows from (9.3.7), since $\nabla_x\,\mathbf{j}_H(\mathbf{x},t) = 0$, that

$$\dot H = -I \leq 0\,. \tag{9.3.9}$$

The quantity H decreases and tends towards a minimum, which is finite, since the function f log f has a lower bound, and the integral over $\mathbf{v}$ exists.⁸ At the minimum, the equals sign holds in (9.3.9). In Sect. 9.3.3, we show that at the minimum, f becomes the Maxwell distribution

$$f^0(\mathbf{v}) = n\left(\frac{m}{2\pi kT}\right)^{3/2} e^{-\frac{m\mathbf{v}^2}{2kT}}\,. \tag{9.3.10}$$
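That the Maxwell distribution minimizes H can be made plausible by a direct comparison. The sketch below evaluates (9.3.1) numerically for the Maxwell distribution and for a hypothetical non-equilibrium distribution that is uniform inside a sphere in velocity space and has the same density and the same mean kinetic energy. Units with m = kT = n = 1 and all names are assumptions made for the illustration. The closed form $H = n\left[\log\left(n\,(m/2\pi kT)^{3/2}\right) - \tfrac{3}{2}\right]$ for the Maxwell distribution is the result of problem 9.3.

```python
import math


def H_radial(f, v_max, steps=20000):
    # H = integral of f log f over d^3v for an isotropic f(|v|),
    # evaluated with the midpoint rule in |v|.
    dv = v_max / steps
    total = 0.0
    for i in range(steps):
        v = (i + 0.5) * dv
        fv = f(v)
        if fv > 0.0:   # f log f -> 0 where f vanishes
            total += 4.0 * math.pi * v * v * fv * math.log(fv) * dv
    return total


def maxwell(v):
    # Maxwell distribution (9.3.10) in units m = kT = n = 1
    return (2.0 * math.pi) ** -1.5 * math.exp(-0.5 * v * v)


# Uniform ball in velocity space with the same <v^2> = 3, i.e. R^2 = 5
R = math.sqrt(5.0)
F_BALL = 3.0 / (4.0 * math.pi * R**3)


def ball(v):
    return F_BALL if v <= R else 0.0


H_maxwell = H_radial(maxwell, 12.0)
H_ball = H_radial(ball, 12.0)
H_closed = math.log((2.0 * math.pi) ** -1.5) - 1.5   # closed form, problem 9.3

print(H_maxwell, H_closed, H_ball)
```

The Maxwell value lies below that of the uniform distribution, as expected if H can only decrease towards its minimum at fixed density and energy.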

b) When $\mathbf{F}(\mathbf{x}) \neq 0$, and we are dealing with a closed system of volume V, then

$$\int_V d^3x\;\nabla_x\,\mathbf{j}_H(\mathbf{x},t) = \int_{O(V)} d\mathbf{O}\;\mathbf{j}_H(\mathbf{x},t) = 0$$

holds. The flux of H through the surface of this volume vanishes if the surface is an ideal reflector; then for each contribution $\mathbf{v}\,d\mathbf{O}$ there is a corresponding contribution $-\mathbf{v}\,d\mathbf{O}$, and it follows that

$$\frac{d}{dt}H_{\rm tot} \equiv \frac{d}{dt}\int_V d^3x\; H(\mathbf{x},t) = -\int_V d^3x\; I \leq 0\,. \tag{9.3.11}$$

$H_{\rm tot}$ decreases; we have irreversibility. The fact that irreversibility follows from an equation derived from Newtonian mechanics, which itself is time-reversal invariant, was met at first with skepticism. However, the Stosszahlansatz contains a probabilistic element, as we will demonstrate in detail following Eq. (9.3.14).

As already mentioned, H is closely connected with the entropy. The calculation of H for the equilibrium distribution $f^0(\mathbf{v})$ for an ideal gas yields (see problem 9.3)

$$H = n\left[\log\left(n\left(\frac{m}{2\pi kT}\right)^{3/2}\right) - \frac{3}{2}\right].$$

The total entropy S of the ideal gas (Eq. (2.7.27)) is thus

$$S = -V k H - kN\left(3\log\frac{2\pi\hbar}{m} - 1\right). \tag{9.3.12a}$$

Here, ħ is Planck's quantum of action. Expressed locally, the relation between the entropy per unit volume, H, and the particle number density n is

$$S(\mathbf{x},t) = -kH(\mathbf{x},t) - k\left(3\log\frac{2\pi\hbar}{m} - 1\right) n(\mathbf{x},t)\,. \tag{9.3.12b}$$

⁸ One can readily convince oneself that H(t) cannot decrease without limit. Due to $\int d^3v\, f(\mathbf{x},\mathbf{v},t) < \infty$, $f(\mathbf{x},\mathbf{v},t)$ is bounded everywhere and a divergence of H(t) could come only from the range of integration v → ∞. For v → ∞, f → 0 must hold and as a result, log f → −∞. Comparison of $H(t) = \int d^3v\, f\log f$ with $\int d^3v\, v^2 f(\mathbf{x},\mathbf{v},t) < \infty$ shows that a divergence requires $|\log f| > v^2$. Then, however, $f < e^{-v^2}$, and H remains finite.


The associated current densities are

$$\mathbf{j}_S(\mathbf{x},t) = -k\,\mathbf{j}_H(\mathbf{x},t) - k\left(3\log\frac{2\pi\hbar}{m} - 1\right)\mathbf{j}(\mathbf{x},t) \tag{9.3.12c}$$

and fulfill

$$\dot S(\mathbf{x},t) = -\nabla\,\mathbf{j}_S(\mathbf{x},t) + kI\,. \tag{9.3.12d}$$

Therefore, kI has the meaning of the local entropy production.

∗9.3.2 Behavior of the Boltzmann Equation under Time Reversal

In a classical time-reversal transformation T (also motion reversal), the momenta (velocities) of the particles are reversed ($\mathbf{v} \to -\mathbf{v}$)⁹. Consider a system which, beginning with an initial state at the positions $\mathbf{x}_n(0)$ and the velocities $\mathbf{v}_n(0)$, evolves for a time t to the state $\{\mathbf{x}_n(t),\mathbf{v}_n(t)\}$, then at time $t_1$ experiences a motion-reversal transformation $\{\mathbf{x}_n(t_1),\mathbf{v}_n(t_1)\} \to \{\mathbf{x}_n(t_1),-\mathbf{v}_n(t_1)\}$; then if the system is invariant with respect to time reversal, the further motion for time $t_1$ will lead back to the motion-reversed initial state $\{\mathbf{x}_n(0),-\mathbf{v}_n(0)\}$. The solution of the equations of motion in the second time period ($t > t_1$) is

$$\mathbf{x}_n'(t) = \mathbf{x}_n(2t_1 - t)\,, \qquad \mathbf{v}_n'(t) = -\mathbf{v}_n(2t_1 - t)\,. \tag{9.3.13}$$

Here, we have assumed that no external magnetic field is present. Apart from a translation by $2t_1$, the replacement $t \to -t$, $\mathbf{v} \to -\mathbf{v}$ is thus made. Under this transformation, the Boltzmann equation (9.2.7) becomes

$$\left[\frac{\partial}{\partial t} + \mathbf{v}\nabla_x + \frac{1}{m}\mathbf{F}(\mathbf{x})\nabla_v\right] f(\mathbf{x},-\mathbf{v},-t) = -I\left[f(\mathbf{x},-\mathbf{v},-t)\right]. \tag{9.3.14}$$

The notation of the collision term should indicate that all distribution functions have the time-reversed arguments. The Boltzmann equation is therefore not time-reversal invariant; $f(\mathbf{x},-\mathbf{v},-t)$ is not a solution of the Boltzmann equation, but instead of an equation which has a negative sign on its right-hand side, $-I\left[f(\mathbf{x},-\mathbf{v},-t)\right]$.

The fact that an equation which was derived from Newtonian mechanics, which is time-reversal invariant, is itself not time-reversal invariant and exhibits irreversible behavior (Eq. (9.3.11)) may initially appear surprising. Historically, it was a source of controversy. In fact, the Stosszahlansatz contains a probabilistic element which goes beyond Newtonian mechanics. Even if one assumes uncorrelated particle numbers, the numbers of particles with the velocities $\mathbf{v}$ and $\mathbf{v}_2$ will fluctuate: they will sometimes be larger and sometimes smaller than would be expected from the single-particle distribution functions $f_1$ and $f_2$. The most probable value of the number of collisions is $f_1\cdot f_2$, and the time-averaged value of this number will in fact be $f_1\cdot f_2$. The Boltzmann equation thus yields the typical evolution of typical configurations of the particle distribution. Configurations with small statistical weights, in which particles go from a (superficially) probable configuration to a less probable one (with lower entropy), which is possible in Newtonian mechanics, are not described by the Boltzmann equation. We will consider these questions in more detail in the next chapter (Sect. 10.7), independently of the Boltzmann equation.

⁹ See e.g. QM II, Sect. 11.4.1.

9.3.3 Collision Invariants and the Local Maxwell Distribution

9.3.3.1 Conserved Quantities

The following conserved densities can be calculated from the single-particle distribution function: the particle-number density is given by

$$n(\mathbf{x},t) \equiv \int d^3v\; f\,. \tag{9.3.15a}$$

The momentum density, which is also equal to the product of the mass and the current density, is given by

$$m\,\mathbf{j}(\mathbf{x},t) \equiv m\,n(\mathbf{x},t)\,\mathbf{u}(\mathbf{x},t) \equiv m\int d^3v\;\mathbf{v}f\,. \tag{9.3.15b}$$

Equation (9.3.15b) also defines the average local velocity $\mathbf{u}(\mathbf{x},t)$. Finally, we define the energy density, which is composed of the kinetic energy of the local convective flow at the velocity $\mathbf{u}(\mathbf{x},t)$, i.e. $n(\mathbf{x},t)\,m\,\mathbf{u}(\mathbf{x},t)^2/2$, together with the average kinetic energy in the local rest system¹⁰, $n(\mathbf{x},t)\,e(\mathbf{x},t)$:

$$n(\mathbf{x},t)\left[\frac{m\,\mathbf{u}(\mathbf{x},t)^2}{2} + e(\mathbf{x},t)\right] \equiv \int d^3v\;\frac{m\mathbf{v}^2}{2}\,f = \int d^3v\;\frac{m}{2}\left(\mathbf{u}^2 + \boldsymbol{\phi}^2\right) f\,. \tag{9.3.15c}$$

Here, the relative velocity $\boldsymbol{\phi} = \mathbf{v} - \mathbf{u}$ has been introduced, and $\int d^3v\;\boldsymbol{\phi}f = 0$, which follows from Eq. (9.3.15b), has been used. For $e(\mathbf{x},t)$, the internal energy per particle in the local rest system (which is moving at the velocity $\mathbf{u}(\mathbf{x},t)$), it follows from (9.3.15c) that

$$n(\mathbf{x},t)\,e(\mathbf{x},t) = \frac{m}{2}\int d^3v\,(\mathbf{v}-\mathbf{u}(\mathbf{x},t))^2 f\,. \tag{9.3.15c′}$$

¹⁰ We note that for a dilute gas, the potential energy is negligible relative to the kinetic energy, so that the internal energy per particle $e(\mathbf{x},t) = \bar\epsilon(\mathbf{x},t)$ is equal to the average kinetic energy.
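The definitions (9.3.15a–c) can be tested on a distribution whose moments are known. The sketch below assumes, purely as test input, a Maxwell distribution (9.3.10) shifted by a drift velocity, at a single fixed point in space; the parameter values, grid and function names are illustrative. The velocity integrals are approximated by a simple sum over a cubic grid, which is very accurate for Gaussians.

```python
import math

M, KT = 1.0, 1.0              # assumed units: m = kT = 1
N0 = 2.0                      # density used to build the test distribution
U0 = (0.5, -0.3, 0.2)         # drift velocity used to build the test distribution


def drifting_maxwell(v):
    # Maxwell distribution (9.3.10) centred at the drift velocity U0
    c2 = sum((vi - ui) ** 2 for vi, ui in zip(v, U0))
    return N0 * (M / (2.0 * math.pi * KT)) ** 1.5 * math.exp(-M * c2 / (2.0 * KT))


H_STEP = 0.4
GRID = [-8.0 + H_STEP * i for i in range(41)]
DV3 = H_STEP**3


def moments(dist):
    n = 0.0
    mom = [0.0, 0.0, 0.0]
    ekin = 0.0
    for vx in GRID:
        for vy in GRID:
            for vz in GRID:
                w = dist((vx, vy, vz)) * DV3
                n += w                                           # (9.3.15a)
                mom[0] += vx * w                                 # (9.3.15b), divided by m
                mom[1] += vy * w
                mom[2] += vz * w
                ekin += 0.5 * M * (vx * vx + vy * vy + vz * vz) * w
    u = tuple(p / n for p in mom)
    # (9.3.15c): n (m u^2 / 2 + e) equals the total kinetic energy density
    e = ekin / n - 0.5 * M * sum(ui * ui for ui in u)
    return n, u, e


n, u, e = moments(drifting_maxwell)
print(n, u, e)   # e is expected to be close to 1.5 * KT
```

The internal energy per particle comes out as $\tfrac{3}{2}kT$, independent of the drift velocity, because the convective part $m\mathbf{u}^2/2$ has been subtracted.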


9.3.3.2 Collisional Invariants

The collision integral $I$ of Eq. (9.3.3) and the collision term in the Boltzmann equation vanish if the distribution function $f$ fulfills the relation

$$ f_1 f_2 - f_3 f_4 = 0 \tag{9.3.16} $$

for all possible collisions (restricted by the conservation laws contained in (9.2.8f)), i.e. if

$$ \log f_1 + \log f_2 = \log f_3 + \log f_4 \tag{9.3.17} $$

holds. Note that all the distribution functions $f_i$ have the same $\mathbf{x}$-argument. Due to conservation of momentum, energy, and particle number, each of the five so-called collisional invariants

$$ \chi^i = m v_i , \quad i = 1, 2, 3 \tag{9.3.18a} $$

$$ \chi^4 = \epsilon_v \equiv \frac{m v^2}{2} \tag{9.3.18b} $$

$$ \chi^5 = 1 \tag{9.3.18c} $$

obeys the relation (9.3.17). There are no other collisional invariants apart from these five¹¹. Thus the logarithm of the most general distribution function for which the collision term vanishes is a linear combination of the collisional invariants with position-dependent prefactors:

$$ \log f^{\ell}(\mathbf{x},\mathbf{v},t) = \alpha(\mathbf{x},t) + \beta(\mathbf{x},t)\left(\mathbf{u}(\mathbf{x},t)\cdot m\mathbf{v} - \frac{m}{2}\mathbf{v}^2\right) , \tag{9.3.19} $$

or

$$ f^{\ell}(\mathbf{x},\mathbf{v},t) = n(\mathbf{x},t)\left(\frac{m}{2\pi k T(\mathbf{x},t)}\right)^{3/2} \exp\left[-\frac{m}{2kT(\mathbf{x},t)}\left(\mathbf{v} - \mathbf{u}(\mathbf{x},t)\right)^2\right] . \tag{9.3.19′} $$

Here, the quantities $T(\mathbf{x},t) = (k\beta(\mathbf{x},t))^{-1}$, $n(\mathbf{x},t) = \left(\frac{2\pi}{m\beta(\mathbf{x},t)}\right)^{3/2} \exp\left[\alpha(\mathbf{x},t) + \beta(\mathbf{x},t)\, m\, u^2(\mathbf{x},t)/2\right]$ and $\mathbf{u}(\mathbf{x},t)$ represent the local temperature, the local particle-number density, and the local velocity. One refers to $f^{\ell}(\mathbf{x},\mathbf{v},t)$ as the local Maxwell distribution or the local equilibrium distribution function, since it is identical locally to the Maxwell distribution, (9.3.10) or (2.6.13).

11 H. Grad, Comm. Pure Appl. Math. 2, 331 (1949).

If we insert (9.3.19) into the expressions (9.3.15a–c) for the conserved quantities, we can see that the quantities $n(\mathbf{x},t)$, $\mathbf{u}(\mathbf{x},t)$, and $T(\mathbf{x},t)$ which occur on the right-hand side of (9.3.19′) refer to the local density, velocity, and temperature, respectively, with the last quantity related to the mean kinetic energy via

$$ e(\mathbf{x},t) = \frac{3}{2}\, k T(\mathbf{x},t) , $$

i.e. by the caloric equation of state of an ideal gas. The local equilibrium distribution function $f^{\ell}(\mathbf{x},\mathbf{v},t)$ is in general not a solution of the Boltzmann equation, since for it, only the collision term but not the flow term vanishes¹². The local Maxwell distribution is in general a solution of the Boltzmann equation only when the coefficients are constant, i.e. in global equilibrium. Together with the results from Sect. 9.3.1, it follows that a gas with an arbitrary inhomogeneous initial distribution $f(\mathbf{x},\mathbf{v},0)$ will finally relax into a Maxwell distribution (9.3.10) with a constant temperature and density. Their values are determined by the initial conditions.

12 There are special local Maxwell distributions for which the flow term likewise vanishes, but they have no physical relevance. See G. E. Uhlenbeck and G. W. Ford, Lectures in Statistical Mechanics, American Mathematical Society, Providence, 1963, p. 86; S. Harris, An Introduction to the Theory of the Boltzmann Equation, Holt Rinehart and Winston, New York, 1971, p. 73; and problem 9.16.

9.3.4 Conservation Laws

With the aid of the collisional invariants, we can derive equations of continuity for the conserved quantities from the Boltzmann equation. We first relate the conserved densities (9.3.15a–c) to the collisional invariants (9.3.18a–c). The particle-number density, the momentum density, and the energy density can be represented in the following form:

$$ n(\mathbf{x},t) \equiv \int d^3v\, \chi^5 f , \tag{9.3.20} $$

$$ m\, j_i(\mathbf{x},t) \equiv m\, n(\mathbf{x},t)\, u_i(\mathbf{x},t) = \int d^3v\, \chi^i f , \tag{9.3.21} $$

and

$$ n(\mathbf{x},t)\left[\frac{m\,\mathbf{u}(\mathbf{x},t)^2}{2} + e(\mathbf{x},t)\right] = \int d^3v\, \chi^4 f . \tag{9.3.22} $$

Next, we want to derive the equations of motion for these quantities from the Boltzmann equation (9.2.7) by multiplying the latter by $\chi^{\alpha}(\mathbf{v})$ and integrating over $\mathbf{v}$. Using the general identity (9.3.7), we find

$$ \int d^3v\, \chi^{\alpha}(\mathbf{v}) \left[\frac{\partial}{\partial t} + \mathbf{v}\cdot\nabla_x + \frac{1}{m}\mathbf{F}(\mathbf{x})\cdot\nabla_v\right] f(\mathbf{x},\mathbf{v},t) = 0 . \tag{9.3.23} $$

By inserting $\chi^5$, $\chi^{1,2,3}$, and $\chi^4$ in that order, we obtain from (9.3.23) the following three conservation laws:


Conservation of Particle Number:

$$ \frac{\partial}{\partial t} n + \nabla\cdot\mathbf{j} = 0 . \tag{9.3.24} $$

Conservation of Momentum:

$$ m \frac{\partial}{\partial t} j_i + \nabla_{x_j} \int d^3v\, m\, v_j v_i f - F_i(\mathbf{x})\, n(\mathbf{x}) = 0 . \tag{9.3.25} $$

For the third term, an integration by parts was used. If we again employ the substitution $\mathbf{v} = \mathbf{u} + \boldsymbol{\phi}$ in (9.3.25), we obtain

$$ m \frac{\partial}{\partial t} j_i + \frac{\partial}{\partial x_j}\left(m\, n\, u_i u_j + P_{ji}\right) = n F_i , \tag{9.3.25′} $$

where we have introduced the pressure tensor

$$ P_{ji} = P_{ij} = m \int d^3v\, \phi_i \phi_j f . \tag{9.3.26} $$

Conservation of Energy:

Finally, setting $\chi^4 = \frac{m v^2}{2}$ in (9.3.23), we obtain

$$ \frac{\partial}{\partial t} \int d^3v\, \frac{m}{2} v^2 f + \nabla_{x_i} \int d^3v\, (u_i + \phi_i)\, \frac{m}{2}\left(u^2 + 2 u_j \phi_j + \phi^2\right) f - \mathbf{j}\cdot\mathbf{F} = 0 , \tag{9.3.27} $$

where an integration by parts was used for the last term. Applying (9.3.22) and (9.3.26), we obtain the equation of continuity for the energy density

$$ \frac{\partial}{\partial t}\left[n\left(\frac{m}{2} u^2 + e\right)\right] + \nabla_i \left[n u_i \left(\frac{m}{2} u^2 + e\right) + u_j P_{ji} + q_i\right] = \mathbf{j}\cdot\mathbf{F} . \tag{9.3.28} $$

Here, along with the internal energy density $e$ defined in (9.3.15c′), we have also introduced the heat current density

$$ \mathbf{q} = \int d^3v\, \boldsymbol{\phi}\, \frac{m}{2} \phi^2 f . \tag{9.3.29} $$

Remarks:

(i) (9.3.25′) and (9.3.28) in the absence of external forces ($\mathbf{F} = 0$) take on the usual form of equations of continuity, like (9.3.24).

(ii) In the momentum density, according to Eq. (9.3.25′), the tensorial current density is composed of a convective part and the pressure tensor $P_{ij}$, which gives the microscopic momentum current in relation to the coordinate system moving at the average velocity $\mathbf{u}$.


(iii) The energy current density in Eq. (9.3.28) contains a macroscopic convection current, the work which is performed by the pressure, and the heat current $\mathbf{q}$ (= mean energy flux in the system which is moving with the liquid).

(iv) The conservation laws do not form a complete system of equations as long as the current densities are unknown. In the hydrodynamic limit, it is possible to express the current densities in terms of the conserved quantities.

The conservation laws for momentum and energy can also be written as equations for $\mathbf{u}$ and $e$. To this end, we employ the rearrangement

$$ \frac{\partial}{\partial t} j_i + \nabla_j (n u_j u_i) = n \frac{\partial}{\partial t} u_i + u_i \frac{\partial}{\partial t} n + u_i \nabla_j n u_j + n u_j \nabla_j u_i = n\left(\frac{\partial}{\partial t} + u_j \nabla_j\right) u_i \tag{9.3.30} $$

using (9.3.21) and the conservation law for the particle-number density (9.3.24), which yields for (9.3.25)

$$ m n \left(\frac{\partial}{\partial t} + u_j \nabla_j\right) u_i = -\nabla_j P_{ji} + n F_i . \tag{9.3.31} $$

From this, taking the hydrodynamic limit, we obtain the Navier–Stokes equations. Likewise, starting from Eq. (9.3.28), we can show that

$$ n \left(\frac{\partial}{\partial t} + u_j \nabla_j\right) e + \nabla\cdot\mathbf{q} = -P_{ij} \nabla_i u_j . \tag{9.3.32} $$

9.3.5 Conservation Laws and Hydrodynamic Equations for the Local Maxwell Distribution

9.3.5.1 Local Equilibrium and Hydrodynamics

In this section, we want to collect and explain some concepts which play a role in nonequilibrium theory. The term local equilibrium describes the situation in which the thermodynamic quantities of the system such as density, temperature, pressure, etc. can vary spatially and with time, but in each volume element the thermodynamic relations between the values which apply locally there are obeyed. The resulting dynamics are quite generally termed hydrodynamics in condensed-matter physics, in analogy to the dynamic equations which are valid in this limit for the flow of gases and liquids. The conditions for local equilibrium are

$$ \omega\tau \ll 1 \quad \text{and} \quad k l \ll 1 , \tag{9.3.33} $$

where ω is the frequency of the time-dependent variations and k their wavenumber, τ is the collision time and l the mean free path. The ﬁrst condition guarantees that the variations with time are suﬃciently slow that the system has time to reach equilibrium locally through collisions of its atoms. The second condition presumes that the particles move along a distance l without changing their momenta and energies. The local values of momentum and energy must therefore in fact be constant over a distance l. Beginning with an arbitrary initial distribution function f (x, v, 0), according to the Boltzmann equation, the following relaxation processes occur: the collision term causes the distribution function to approach a local Maxwell distribution within the characteristic time τ . The ﬂow term causes an equalization in space, which requires a longer time. These two approaches towards equilibrium – in velocity space and in conﬁguration space – come to an end only when global equilibrium has been reached. If the system is subject only to perturbations which vary slowly in space and time, it will be in local equilibrium after the time τ . This temporally and spatially slowly varying distribution function will diﬀer from the local Maxwellian function (9.3.19 ), which does not obey the Boltzmann equation. 9.3.5.2 Hydrodynamic Equations without Dissipation In order to obtain explicit expressions for the current densities q and Pij , these quantities must be calculated for a distribution function f (x, v, t) which at least approximately obeys the Boltzmann equation. In this section, we will employ the local Maxwell distribution as an approximation. In Sect. 9.4, the Boltzmann equation will be solved systematically in a linear approximation. 
Following the preceding considerations concerning the different relaxation behavior in configuration space and in velocity space, we can expect that in local equilibrium, the actual distribution function will not be very different from the local Maxwellian distribution. If we use the latter as an approximation, we will be neglecting dissipation. Using the local Maxwell distribution, Eq. (9.3.19′),

$$ f^{\ell} = n(\mathbf{x},t)\left(\frac{m}{2\pi k T(\mathbf{x},t)}\right)^{3/2} \exp\left[-\frac{m\left(\mathbf{v} - \mathbf{u}(\mathbf{x},t)\right)^2}{2kT(\mathbf{x},t)}\right] , \tag{9.3.34} $$

with position- and time-dependent density $n$, temperature $T$, and flow velocity $\mathbf{u}$, we find from (9.3.15a), (9.3.15b), and (9.3.15c′)

$$ \mathbf{j} = n \mathbf{u} \tag{9.3.35} $$

$$ n e = \frac{3}{2}\, n k T \tag{9.3.36} $$

$$ P_{ij} \equiv \int d^3v\, m\, \phi_i \phi_j f^{\ell} = \delta_{ij}\, n k T \equiv \delta_{ij} P , \tag{9.3.37} $$

where the local pressure $P$ was introduced; from (9.3.37), it is given by

$$ P = n k T . \tag{9.3.38} $$
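The moments (9.3.35)–(9.3.38) can be checked numerically. The following is a minimal sketch, assuming units with $m = k = 1$ and a simple velocity grid; the function names `local_maxwellian` and `hydrodynamic_moments`, the grid size, and the sample values of $n$, $T$, $\mathbf{u}$ are illustrative choices, not part of the text.

```python
import numpy as np

def local_maxwellian(v, n, T, u, m=1.0, k=1.0):
    """Local Maxwell distribution (9.3.34) for density n, temperature T, flow velocity u."""
    phi_sq = np.sum((v - u) ** 2, axis=-1)          # |v - u|^2 at every grid point
    return n * (m / (2 * np.pi * k * T)) ** 1.5 * np.exp(-m * phi_sq / (2 * k * T))

def hydrodynamic_moments(n, T, u, m=1.0, k=1.0, points=61, cutoff=8.0):
    """Return (n, u, e, P_ij, q) by summing f over a velocity grid centred on u."""
    u = np.asarray(u, dtype=float)
    sigma = np.sqrt(k * T / m)                      # thermal width of one velocity component
    axes = [np.linspace(c - cutoff * sigma, c + cutoff * sigma, points) for c in u]
    dv = np.prod([a[1] - a[0] for a in axes])       # volume element d^3v
    v = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
    f = local_maxwellian(v, n, T, u, m, k)

    density = f.sum() * dv                                          # (9.3.15a)
    flow = np.einsum("abc,abci->i", f, v) * dv / density            # (9.3.15b)
    phi = v - flow                                                  # relative velocity
    phi_sq = np.sum(phi ** 2, axis=-1)
    energy = 0.5 * m * np.sum(f * phi_sq) * dv / density            # (9.3.15c')
    pressure = m * np.einsum("abc,abci,abcj->ij", f, phi, phi) * dv # (9.3.26)
    heat = 0.5 * m * np.einsum("abc,abci->i", f * phi_sq, phi) * dv # (9.3.29)
    return density, flow, energy, pressure, heat

# Illustrative sample values (assumptions, in units m = k = 1)
n0, T0, u0 = 2.0, 1.5, (0.3, -0.2, 0.1)
density, flow, energy, pressure, heat = hydrodynamic_moments(n0, T0, u0)
```

Within the accuracy of the grid sum, `energy` should reproduce $\frac{3}{2}kT$, `pressure` should be $nkT$ times the unit matrix, and `heat` should vanish, in line with the statement that the local Maxwell distribution carries no dissipative currents.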

The equations (9.3.38) and (9.3.36) express the local thermal and caloric equations of state of the ideal gas. The pressure tensor $P_{ij}$ contains no dissipative contribution which would correspond to the viscosity of the fluid, as seen from Eq. (9.3.37). The heat current density (9.3.29) vanishes ($\mathbf{q} = 0$) for the local Maxwell distribution. With these results, we obtain for the equations of continuity (9.3.24), (9.3.25), and (9.3.32)

$$ \frac{\partial}{\partial t} n = -\nabla\cdot(n\mathbf{u}) \tag{9.3.39} $$

$$ m n \left(\frac{\partial}{\partial t} + \mathbf{u}\cdot\nabla\right) \mathbf{u} = -\nabla P + n\mathbf{F} \tag{9.3.40} $$

$$ n \left(\frac{\partial}{\partial t} + \mathbf{u}\cdot\nabla\right) e = -P\, \nabla\cdot\mathbf{u} . \tag{9.3.41} $$

Here, (9.3.40) is Euler's equation, well-known in hydrodynamics¹³. The equations of motion (9.3.39)–(9.3.41) together with the local thermodynamic relations (9.3.36) and (9.3.38) represent a complete system of equations for $n$, $\mathbf{u}$, and $e$.

9.3.5.3 Propagation of Sound in Gases

As an application, we consider the propagation of sound. In this process, the gas undergoes small oscillations of its density $n$, its pressure $P$, its internal energy $e$, and its temperature $T$ around their equilibrium values and around $\mathbf{u} = 0$. In the following, we shall follow the convention that thermodynamic quantities for which no position or time dependence is given are taken to have their equilibrium values, that is we insert into Eqns. (9.3.39)–(9.3.41)

$$ n(\mathbf{x},t) = n + \delta n(\mathbf{x},t), \quad P(\mathbf{x},t) = P + \delta P(\mathbf{x},t), \quad e(\mathbf{x},t) = e + \delta e(\mathbf{x},t), \quad T(\mathbf{x},t) = T + \delta T(\mathbf{x},t) \tag{9.3.42} $$

and expand with respect to the small deviations indicated by $\delta$:

$$ \frac{\partial}{\partial t} \delta n = -n \nabla\cdot\mathbf{u} \tag{9.3.43a} $$

$$ m n \frac{\partial}{\partial t} \mathbf{u} = -\nabla \delta P \tag{9.3.43b} $$

$$ n \frac{\partial}{\partial t} \delta e = -P\, \nabla\cdot\mathbf{u} . \tag{9.3.43c} $$

13 Euler's equation describes nondissipative fluid flow; see L. D. Landau and E. M. Lifshitz, Course of Theoretical Physics, Vol. IV: Hydrodynamics, Pergamon Press, Oxford 1960, p. 4.


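The flow velocity $\mathbf{u}(\mathbf{x},t) \equiv \delta\mathbf{u}(\mathbf{x},t)$ is small. Insertion of Eq. (9.3.36) and (9.3.38) into (9.3.43c) leads us to

$$ \frac{3}{2} \frac{\partial}{\partial t} \delta T = -T\, \nabla\cdot\mathbf{u} , $$

which, together with (9.3.43a), yields

$$ \frac{\partial}{\partial t}\left[\frac{\delta n}{n} - \frac{3}{2} \frac{\delta T}{T}\right] = 0 . \tag{9.3.44} $$

Comparison with the entropy of an ideal gas,

$$ S = k N \left[\frac{5}{2} + \log \frac{(2\pi m k T)^{3/2}}{n h^3}\right] , \tag{9.3.45} $$

shows that the time independence of $S/N$ (i.e. of the entropy per particle or per unit mass) follows from (9.3.44). By applying $\partial/\partial t$ to (9.3.43a) and $\nabla$ to (9.3.43b) and eliminating the term containing $\mathbf{u}$, we obtain

$$ \frac{\partial^2 \delta n}{\partial t^2} = m^{-1} \nabla^2 \delta P . \tag{9.3.46} $$

It follows from Eq. (9.3.38) that $\delta P = n k\, \delta T + \delta n\, k T$, and, together with (9.3.44), we obtain $\frac{\partial}{\partial t} \delta P = \frac{5}{3} k T \frac{\partial}{\partial t} \delta n$. With this, the equation of motion (9.3.46) can be brought into the form

$$ \frac{\partial^2 \delta P}{\partial t^2} = \frac{5 k T}{3 m} \nabla^2 \delta P . \tag{9.3.47} $$

The sound waves (pressure waves) which are described by the wave equation (9.3.47) have the form

$$ \delta P \propto e^{i(\mathbf{k}\cdot\mathbf{x} \pm c_s |\mathbf{k}| t)} \tag{9.3.48} $$

with the adiabatic sound velocity

$$ c_s = \sqrt{\frac{1}{m n \kappa_S}} = \sqrt{\frac{5 k T}{3 m}} . \tag{9.3.49} $$

Here, $\kappa_S$ is the adiabatic compressibility (Eq. (3.2.3b)), which according to Eq. (3.2.28) is given by

$$ \kappa_S = \frac{3}{5P} = \frac{3V}{5 N k T} \tag{9.3.50} $$

for an ideal gas.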
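As a numerical illustration of (9.3.49) and (9.3.50), the following sketch evaluates the adiabatic sound velocity for a monatomic ideal gas. The choice of argon at 300 K and 1 atm, and the helper names `sound_velocity` and `adiabatic_compressibility`, are assumptions made only for this example.

```python
import math

K_B = 1.380649e-23                 # Boltzmann constant in J/K
ATOMIC_MASS = 1.66053906660e-27    # atomic mass constant in kg

def sound_velocity(T, m):
    """Adiabatic sound velocity c_s = sqrt(5kT / 3m) of Eq. (9.3.49)."""
    return math.sqrt(5.0 * K_B * T / (3.0 * m))

def adiabatic_compressibility(n, T):
    """kappa_S = 3 / (5 n k T) for an ideal gas, Eq. (9.3.50) with P = nkT."""
    return 3.0 / (5.0 * n * K_B * T)

# Illustrative values (assumptions): argon at room temperature and atmospheric pressure
m_argon = 39.948 * ATOMIC_MASS
T = 300.0
n = 101325.0 / (K_B * T)           # number density from P = nkT

c_s = sound_velocity(T, m_argon)
# Same quantity from the first form of (9.3.49), c_s = 1 / sqrt(m n kappa_S)
c_s_from_kappa = 1.0 / math.sqrt(m_argon * n * adiabatic_compressibility(n, T))
print(f"c_s = {c_s:.1f} m/s")
```

Both expressions give the same value, a little above 320 m/s, and the density drops out of the result, as expected from (9.3.49).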


Notes: The result that the entropy per particle $S/N$ or the entropy per unit mass $s$ for a sound wave is time-independent remains valid not only for an ideal gas but in general. If one takes the second derivative with respect to time of the following thermodynamic relation which is valid for local equilibrium¹⁴

$$ \delta n = \left(\frac{\partial n}{\partial P}\right)_{S/N} \delta P + \left(\frac{\partial n}{\partial (S/N)}\right)_{P} \delta\!\left(\frac{S}{N}\right) , \tag{9.3.51} $$

obtaining

$$ \frac{\partial^2 \delta n}{\partial t^2} = \left(\frac{\partial n}{\partial P}\right)_{S/N} \frac{\partial^2 P}{\partial t^2} + \left(\frac{\partial n}{\partial (S/N)}\right)_{P} \underbrace{\frac{\partial^2 (S/N)}{\partial t^2}}_{=0} , $$

then one obtains together with (9.3.43a) and (9.3.43b) the result

$$ \frac{\partial^2 P(\mathbf{x},t)}{\partial t^2} = m^{-1} \left(\frac{\partial P}{\partial n}\right)_{S/N} \nabla^2 P(\mathbf{x},t) , \tag{9.3.52} $$

which again contains the adiabatic sound velocity

$$ c_s^2 = m^{-1} \left(\frac{\partial P}{\partial n}\right)_{S/N} = m^{-1} \left(\frac{\partial P}{\partial (N/V)}\right)_{S} = m^{-1} N^{-1} (-V^2) \left(\frac{\partial P}{\partial V}\right)_{S} = \frac{1}{m n \kappa_s} . \tag{9.3.53} $$

Following the third equals sign, the particle number $N$ was taken to be fixed.

For local Maxwell distributions, the collision term vanishes; there is no damping. Between the regions of different local equilibria, reversible oscillation processes take place. Deviations of the actual local equilibrium distribution functions $f(\mathbf{x},\mathbf{v},t)$ from the local Maxwell distribution $f^{\ell}(\mathbf{x},\mathbf{v},t)$ lead as a result of the collision term to local, irreversible relaxation effects and, together with the flow term, to diffusion-like equalization processes which finally result in global equilibrium.

∗

9.4 The Linearized Boltzmann Equation

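9.4.1 Linearization

In this section, we want to investigate systematically the solutions of the Boltzmann equation in the limit of small deviations from equilibrium. The Boltzmann equation can be linearized and from its linearized form, the hydrodynamic equations can be derived. These are equations of motion for the conserved quantities, whose region of validity is at long wavelengths and low frequencies. It will occasionally be expedient to use the variables $(\mathbf{k}, \omega)$ (wavenumber and frequency) instead of $(\mathbf{x}, t)$. We will also take an external potential, which vanishes for early times, into account:

$$ \lim_{t\to-\infty} V(\mathbf{x},t) = 0 . \tag{9.4.1} $$

14 Within time and space derivatives, $\delta n(\mathbf{x},t)$, etc. can be replaced by $n(\mathbf{x},t)$ etc.

Then the distribution function is presumed to have the property

$$ \lim_{t\to-\infty} f(\mathbf{x},\mathbf{v},t) = f^0(\mathbf{v}) \equiv n \left(\frac{m}{2\pi k T}\right)^{3/2} e^{-\frac{m v^2}{2kT}} , \tag{9.4.2} $$

where $f^0$ is the global spatially uniform Maxwellian equilibrium distribution¹⁵. For small deviations from global equilibrium, we can write $f(\mathbf{x},\mathbf{v},t)$ in the form

$$ f(\mathbf{x},\mathbf{v},t) = f^0(\mathbf{v}) \left(1 + \frac{1}{kT}\, \nu(\mathbf{x},\mathbf{v},t)\right) \equiv f^0 + \delta f \tag{9.4.3} $$

and linearize the Boltzmann equation in $\delta f$ or $\nu$.

15 We write here the index which denotes an equilibrium distribution as an upper index, since later the notation $f_i^0 \equiv f^0(\mathbf{v}_i)$ will also be employed.

The linearization of the collision term (9.2.6) yields

$$ \left(\frac{\partial f}{\partial t}\right)_{\text{coll}} = -\int d^3v_2\, d^3v_3\, d^3v_4\, W \left(f_1^0 f_2^0 - f_3^0 f_4^0 + f_1^0 \delta f_2 + f_2^0 \delta f_1 - f_3^0 \delta f_4 - f_4^0 \delta f_3\right) $$

$$ = -\frac{1}{kT} \int d^3v_2\, d^3v_3\, d^3v_4\, W(\mathbf{v}\, \mathbf{v}_2; \mathbf{v}_3\, \mathbf{v}_4)\, f^0(\mathbf{v}_1) f^0(\mathbf{v}_2) \left(\nu_1 + \nu_2 - \nu_3 - \nu_4\right) , \tag{9.4.4} $$

since $f_3^0 f_4^0 = f_1^0 f_2^0$ owing to energy conservation, which is contained in $W(\mathbf{v}\, \mathbf{v}_2; \mathbf{v}_3\, \mathbf{v}_4)$. We also use the notation $\mathbf{v}_1 \equiv \mathbf{v}$, $f_1^0 = f^0(\mathbf{v})$ etc. The flow term has the form

$$ \left[\frac{\partial}{\partial t} + \mathbf{v}\cdot\nabla_x + \frac{1}{m} \mathbf{F}(\mathbf{x},t)\cdot\nabla_v\right] \left(f^0 + \frac{f^0}{kT}\, \nu\right) = \frac{f^0(\mathbf{v})}{kT} \left[\frac{\partial}{\partial t} + \mathbf{v}\cdot\nabla_x\right] \nu(\mathbf{x},\mathbf{v},t) + \mathbf{v}\cdot\nabla V(\mathbf{x},t)\, f^0(\mathbf{v})/kT . \tag{9.4.5} $$

All together, the linearized Boltzmann equation is given by:

$$ \left[\frac{\partial}{\partial t} + \mathbf{v}\cdot\nabla_x\right] \nu(\mathbf{x},\mathbf{v},t) + \mathbf{v}\cdot\left(\nabla V(\mathbf{x},t)\right) = -L\nu \tag{9.4.6} $$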


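with the linear collision operator $L$:

$$ L\nu = \frac{kT}{f^0(\mathbf{v})} \int d^3v_2\, d^3v_3\, d^3v_4\, W'(\mathbf{v}, \mathbf{v}_2; \mathbf{v}_3, \mathbf{v}_4) \left(\nu + \nu_2 - \nu_3 - \nu_4\right) \tag{9.4.7} $$

and

$$ W'(\mathbf{v}\, \mathbf{v}_2; \mathbf{v}_3\, \mathbf{v}_4) = \frac{1}{kT} \left(f^0(\mathbf{v}) f^0(\mathbf{v}_2) f^0(\mathbf{v}_3) f^0(\mathbf{v}_4)\right)^{1/2} W(\mathbf{v}\, \mathbf{v}_2; \mathbf{v}_3\, \mathbf{v}_4) , \tag{9.4.8} $$

where conservation of energy, contained in $W$, has been utilized.

9.4.2 The Scalar Product

For our subsequent investigations, we introduce the scalar product of two functions $\psi(\mathbf{v})$ and $\chi(\mathbf{v})$,

$$ \langle \psi | \chi \rangle = \int d^3v\, \frac{f^0(\mathbf{v})}{kT}\, \psi(\mathbf{v})\, \chi(\mathbf{v}) ; \tag{9.4.9} $$

it possesses the usual properties. The collisional invariants are special cases:

$$ \langle \chi^5 | \chi^5 \rangle \equiv \langle 1 | 1 \rangle = \int d^3v\, \frac{f^0(\mathbf{v})}{kT} = \frac{n}{kT} , \tag{9.4.10a} $$

$$ \langle \chi^4 | \chi^5 \rangle \equiv \langle \epsilon | 1 \rangle = \int d^3v\, \frac{m v^2}{2} \frac{f^0(\mathbf{v})}{kT} = \frac{n e}{kT} = \frac{3}{2}\, n \tag{9.4.10b} $$

with $\epsilon \equiv \frac{m v^2}{2}$, and

$$ \langle \chi^4 | \chi^4 \rangle \equiv \langle \epsilon | \epsilon \rangle = \int d^3v \left(\frac{m v^2}{2}\right)^2 \frac{f^0(\mathbf{v})}{kT} = \frac{15}{4}\, n k T . \tag{9.4.10c} $$

The collision operator $L$ introduced in (9.4.7) is a linear operator, and obeys the relation

$$ \langle \chi | L\nu \rangle = \frac{1}{4} \int d^3v_1\, d^3v_2\, d^3v_3\, d^3v_4\, W'(\mathbf{v}_1 \mathbf{v}_2; \mathbf{v}_3 \mathbf{v}_4) \left(\nu_1 + \nu_2 - \nu_3 - \nu_4\right) \left(\chi_1 + \chi_2 - \chi_3 - \chi_4\right) . \tag{9.4.11} $$

It follows from this that $L$ is self-adjoint and positive semidefinite,

$$ \langle \chi | L\nu \rangle = \langle L\chi | \nu \rangle , \tag{9.4.12} $$

$$ \langle \nu | L\nu \rangle \geq 0 . \tag{9.4.13} $$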


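9.4.3 Eigenfunctions of L and the Expansion of the Solutions of the Boltzmann Equation

The eigenfunctions of $L$ are denoted as $\chi^{\lambda}$:

$$ L\chi^{\lambda} = \omega_{\lambda} \chi^{\lambda} , \quad \omega_{\lambda} \geq 0 . \tag{9.4.14} $$

The collisional invariants $\chi^1, \chi^2, \chi^3, \chi^4, \chi^5$ are eigenfunctions belonging to the eigenvalue 0. It will prove expedient to use orthonormalized eigenfunctions:

$$ \langle \hat\chi^{\lambda} | \hat\chi^{\lambda'} \rangle = \delta^{\lambda\lambda'} . \tag{9.4.15} $$

For the collisional invariants, this means the introduction of

$$ \hat\chi^{i} \equiv \hat\chi^{u_i} = \frac{v_i}{\sqrt{\langle v_i | v_i \rangle}} = \frac{v_i}{\sqrt{n/m}} , \quad i = 1, 2, 3 ; \tag{9.4.16a} $$

$$ \langle v_i | v_i \rangle = \frac{1}{3} \int d^3v\, \mathbf{v}^2 f^0(\mathbf{v})/kT \quad \text{(here not summed over } i\text{)} ; $$

$$ \hat\chi^{5} \equiv \hat\chi^{n} = \frac{1}{\sqrt{\langle 1 | 1 \rangle}} = \frac{1}{\sqrt{n/kT}} ; \quad \text{and} \tag{9.4.16b} $$

$$ \hat\chi^{4} \equiv \hat\chi^{T} = \frac{\epsilon \langle 1 | 1 \rangle - 1 \langle 1 | \epsilon \rangle}{\sqrt{\langle 1 | 1 \rangle \left(\langle 1 | 1 \rangle \langle \epsilon | \epsilon \rangle - \langle 1 | \epsilon \rangle^2\right)}} = \frac{\epsilon - \frac{3}{2} kT}{\sqrt{\frac{3}{2}\, n k T}} . \tag{9.4.16c} $$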
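The orthonormality of the functions (9.4.16a–c) with respect to the scalar product (9.4.9) can be verified numerically. The sketch below is an illustration only: it assumes units with $m = k = 1$, sample values for $n$ and $T$, and a plain grid sum in velocity space; the helper name `scalar_product` is hypothetical.

```python
import numpy as np

# Illustrative equilibrium parameters (assumptions, units with m = k = 1)
m = k_B = 1.0
n, T = 2.0, 1.5
kT = k_B * T

# Velocity grid wide enough that the Gaussian tails are negligible
sigma = np.sqrt(kT / m)
axis = np.linspace(-10 * sigma, 10 * sigma, 81)
dv = (axis[1] - axis[0]) ** 3
vx, vy, vz = np.meshgrid(axis, axis, axis, indexing="ij")
v_sq = vx**2 + vy**2 + vz**2
f0 = n * (m / (2 * np.pi * kT)) ** 1.5 * np.exp(-m * v_sq / (2 * kT))   # (9.4.2)
eps = 0.5 * m * v_sq

def scalar_product(psi, chi):
    """Scalar product (9.4.9): integral of f0/kT * psi * chi over velocity space."""
    return float(np.sum(f0 * psi * chi) * dv / kT)

# Orthonormalized collisional invariants (9.4.16a-c), ordered as u_x, u_y, u_z, T, n
chi_u = [c / np.sqrt(n / m) for c in (vx, vy, vz)]
chi_T = (eps - 1.5 * kT) / np.sqrt(1.5 * n * kT)
chi_n = np.ones_like(v_sq) / np.sqrt(n / kT)
basis = chi_u + [chi_T, chi_n]

gram = np.array([[scalar_product(a, b) for b in basis] for a in basis])

# Flow-term couplings between the invariants
coupling_n = scalar_product(chi_n, vx * chi_u[0])   # expected sqrt(kT/m)
coupling_T = scalar_product(chi_T, vx * chi_u[0])   # expected sqrt(2kT/3m)
```

The matrix `gram` should be the 5×5 unit matrix up to discretization error. The two coupling elements computed at the end are the nonvanishing matrix elements of $v_i$ between the invariants, which determine the reversible part of the hydrodynamic equations.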

The eigenfunctions $\chi^{\lambda}$ with $\omega_{\lambda} > 0$ are orthogonal to the functions (9.4.16a–c) and in the case of degeneracy are orthonormalized among themselves. An arbitrary solution of the linearized Boltzmann equation can be represented as a superposition of the eigenfunctions of $L$ with position- and time-dependent prefactors¹⁶

$$ \nu(\mathbf{x},\mathbf{v},t) = a^5(\mathbf{x},t)\, \hat\chi^{n} + a^4(\mathbf{x},t)\, \hat\chi^{T} + a^i(\mathbf{x},t)\, \hat\chi^{u_i} + \sum_{\lambda=6}^{\infty} a^{\lambda}(\mathbf{x},t)\, \hat\chi^{\lambda} . \tag{9.4.17} $$

Here, the notation indicates the particle-number density $n(\mathbf{x},t)$, the temperature $T(\mathbf{x},t)$, and the flow velocity $u_i(\mathbf{x},t)$:

$$ \hat T(\mathbf{x},t) \equiv a^4(\mathbf{x},t) = \langle \hat\chi^{T} | \nu \rangle = \int d^3v\, \frac{f^0}{kT}\, \nu\, \hat\chi^{T} \equiv \int d^3v\, \delta f(\mathbf{x},\mathbf{v},t)\, \hat\chi^{T} = \frac{\delta e - \frac{3}{2} kT\, \delta n}{\sqrt{\frac{3}{2}\, n k T}} = \sqrt{\frac{3n}{2kT}}\, k\, \delta T(\mathbf{x},t) . \tag{9.4.18a} $$

16 Here we assume that the eigenfunctions $\chi^{\lambda}$ form a complete basis. For the explicitly known eigenfunctions of the Maxwell potential (repulsive $r^{-4}$ potential), this can be shown directly. For repulsive $r^{-n}$ potentials, completeness was proved by Y. Pao, Comm. Pure Appl. Math. 27, 407 (1974).


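The identification of $\delta T(\mathbf{x},t)$ with local fluctuations of the temperature, apart from the normalization factor, can be justified by considering the local internal energy

$$ e + \delta e = \frac{3}{2} (n + \delta n)\, k\, (T + \delta T) , $$

from which, neglecting second-order quantities, it follows that

$$ \delta e = \frac{3}{2}\, n k\, \delta T + \frac{3}{2}\, k T\, \delta n \quad \Rightarrow \quad \delta T = \frac{\delta e - \frac{3}{2}\, \delta n\, k T}{\frac{3}{2}\, n k} . \tag{9.4.19} $$

Similarly, we obtain

$$ \hat n(\mathbf{x},t) \equiv a^5(\mathbf{x},t) = \langle \hat\chi^{n} | \nu \rangle = \int d^3v\, \delta f(\mathbf{x},\mathbf{v},t)\, \frac{1}{\sqrt{n/kT}} = \frac{\delta n}{\sqrt{n/kT}} , \tag{9.4.18b} $$

and

$$ \hat u_i(\mathbf{x},t) \equiv a^i(\mathbf{x},t) = \langle \hat\chi^{u_i} | \nu \rangle = \int d^3v\, \delta f(\mathbf{x},\mathbf{v},t)\, \frac{v_i}{\sqrt{n/m}} = \int d^3v\, (f^0 + \delta f)\, \frac{v_i}{\sqrt{n/m}} = \frac{n u_i(\mathbf{x},t)}{\sqrt{n/m}} , \quad i = 1, 2, 3 . \tag{9.4.18c} $$

These expressions show the relations to the density and momentum fluctuations. We now insert the expansion (9.4.17) into the linearized Boltzmann equation (9.4.6)

$$ \left(\frac{\partial}{\partial t} + \mathbf{v}\cdot\nabla\right) \nu(\mathbf{x},\mathbf{v},t) = -\sum_{\lambda'=6}^{\infty} a^{\lambda'}(\mathbf{x},t)\, \omega_{\lambda'}\, \hat\chi^{\lambda'}(\mathbf{v}) - \mathbf{v}\cdot\nabla V(\mathbf{x},t) . \tag{9.4.20} $$

Only terms with $\lambda' \geq 6$ contribute to the sum, since the collisional invariants have the eigenvalue 0. Multiplying this equation by $\hat\chi^{\lambda} f^0(\mathbf{v})/kT$ and integrating over $\mathbf{v}$, we obtain, using the orthonormalization of $\hat\chi^{\lambda}$ from Eq. (9.4.15),

$$ \frac{\partial}{\partial t} a^{\lambda}(\mathbf{x},t) + \nabla\cdot \sum_{\lambda'=1}^{\infty} \langle \hat\chi^{\lambda} | \mathbf{v}\, \hat\chi^{\lambda'} \rangle\, a^{\lambda'}(\mathbf{x},t) = -\omega_{\lambda}\, a^{\lambda}(\mathbf{x},t) - \langle \hat\chi^{\lambda} | \mathbf{v} \rangle \cdot \nabla V(\mathbf{x},t) . \tag{9.4.21} $$

Fourier transformation

$$ a^{\lambda}(\mathbf{x},t) = \int \frac{d^3k}{(2\pi)^3} \frac{d\omega}{2\pi}\, e^{i(\mathbf{k}\cdot\mathbf{x} - \omega t)}\, a^{\lambda}(\mathbf{k},\omega) \tag{9.4.22} $$

yields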

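$$ (\omega + i\omega_{\lambda})\, a^{\lambda}(\mathbf{k},\omega) - \mathbf{k}\cdot \sum_{\lambda'=1}^{\infty} \langle \hat\chi^{\lambda} | \mathbf{v}\, \hat\chi^{\lambda'} \rangle\, a^{\lambda'}(\mathbf{k},\omega) - \mathbf{k}\cdot \langle \hat\chi^{\lambda} | \mathbf{v} \rangle\, V(\mathbf{k},\omega) = 0 . \tag{9.4.23} $$

Which quantities couple to each other depends on the scalar products $\langle \hat\chi^{\lambda} | \mathbf{v}\, \hat\chi^{\lambda'} \rangle$, whereby the symmetry of the $\hat\chi^{\lambda}$ clearly plays a role. Since $\omega_{\lambda} = 0$ for the modes $\lambda = 1$ to 5, i.e. momentum, energy, and particle-number density, the structure of the conservation laws for these quantities in (9.4.23) can already be recognized at this stage. The term containing the external force obviously couples only to $\hat\chi^{i} \equiv \hat\chi^{u_i}$ for reasons of symmetry:

$$ \langle \hat\chi^{i} | v_j \rangle = \frac{\langle v_i | v_j \rangle}{\sqrt{n/m}} = \sqrt{n/m}\; \delta_{ij} . \tag{9.4.24} $$

For the modes with $\lambda \leq 5$,

$$ \omega\, a^{\lambda}(\mathbf{k},\omega) - \mathbf{k}\cdot \sum_{\lambda'=1}^{\infty} \langle \hat\chi^{\lambda} | \mathbf{v}\, \hat\chi^{\lambda'} \rangle\, a^{\lambda'}(\mathbf{k},\omega) - \mathbf{k}\cdot \langle \hat\chi^{\lambda} | \mathbf{v} \rangle\, V(\mathbf{k},\omega) = 0 \tag{9.4.25} $$

holds, and for the non-conserved degrees of freedom¹⁷ $\lambda \geq 6$, we have

$$ a^{\lambda}(\mathbf{k},\omega) = \frac{k_i}{\omega + i\omega_{\lambda}} \left[\sum_{\lambda'=1}^{5} \langle \hat\chi^{\lambda} | v_i\, \hat\chi^{\lambda'} \rangle\, a^{\lambda'}(\mathbf{k},\omega) + \sum_{\lambda'=6}^{\infty} \langle \hat\chi^{\lambda} | v_i\, \hat\chi^{\lambda'} \rangle\, a^{\lambda'}(\mathbf{k},\omega) + \langle \hat\chi^{\lambda} | v_i \rangle\, V(\mathbf{k},\omega)\right] . \tag{9.4.26} $$

This difference, which results from the different time scales, forms the basis for the elimination of the non-conserved degrees of freedom.

17 Here, the Einstein summation convention is employed: repeated indices $i, j, l, r$ are to be summed over from 1 to 3.

9.4.4 The Hydrodynamic Limit

For low frequencies ($\omega \ll \omega_{\lambda}$) and ($v k \ll \omega_{\lambda}$), $a^{\lambda}(\mathbf{k},\omega)$ with $\lambda \geq 6$ is of higher order in these quantities than are the conserved quantities $\lambda = 1, \ldots, 5$. Therefore, in leading order we can write for (9.4.26)

$$ a^{\lambda}(\mathbf{k},\omega) = -\frac{i k_i}{\omega_{\lambda}} \left[\sum_{\lambda'=1}^{5} \langle \hat\chi^{\lambda} | v_i\, \hat\chi^{\lambda'} \rangle\, a^{\lambda'}(\mathbf{k},\omega) + \langle \hat\chi^{\lambda} | v_i \rangle\, V(\mathbf{k},\omega)\right] . \tag{9.4.27} $$

Inserting this into (9.4.25) for the conserved (also called the hydrodynamic) variables, we find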

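$$ \omega\, a^{\lambda}(\mathbf{k},\omega) - k_i \sum_{\lambda'=1}^{5} \langle \hat\chi^{\lambda} | v_i\, \hat\chi^{\lambda'} \rangle\, a^{\lambda'}(\mathbf{k},\omega) + i k_i k_j \sum_{\lambda'=1}^{5} \sum_{\mu=6}^{\infty} \langle \hat\chi^{\lambda} | v_i\, \hat\chi^{\mu} \rangle \frac{1}{\omega_{\mu}} \langle \hat\chi^{\mu} | v_j\, \hat\chi^{\lambda'} \rangle\, a^{\lambda'}(\mathbf{k},\omega) $$

$$ - k_i \langle \hat\chi^{\lambda} | v_i \rangle\, V(\mathbf{k},\omega) - k_i \sum_{\lambda'=6}^{\infty} \langle \hat\chi^{\lambda} | v_i\, \hat\chi^{\lambda'} \rangle \left(\frac{-i k_j}{\omega_{\lambda'}}\right) \langle \hat\chi^{\lambda'} | v_j \rangle\, V(\mathbf{k},\omega) = 0 ; \tag{9.4.28} $$

this is a closed system of hydrodynamic equations of motion. The second term in these equations leads to motions which propagate like sound waves, the third term to damping of these oscillations. The latter results formally from the elimination of the infinite number of non-conserved variables which was possible due to the separation of the time scales of the hydrodynamic variables (typical frequency $ck$, $Dk^2$) from that of the non-conserved variables (typical frequency $\omega_{\mu} \propto \tau^{-1}$). The structure which is visible in Eq. (9.4.28) is of a very general nature and can be derived from the Boltzmann equations for other physical systems, such as phonons and electrons or magnons in solids.

Now we want to further evaluate Eq. (9.4.28) for a dilute gas without the effect of an external potential. We first compute the scalar products in the second term (see Eqns. (9.4.16a–c))

$$ \langle \hat\chi^{n} | v_i\, \hat\chi^{j} \rangle = \int d^3v\, \frac{f^0(\mathbf{v})}{kT} \frac{v_i v_j}{\sqrt{n^2/(kT\, m)}} = \delta_{ij} \sqrt{\frac{kT}{m}} \tag{9.4.29a} $$

$$ \langle \hat\chi^{T} | v_i\, \hat\chi^{j} \rangle = \int d^3v\, \frac{f^0(\mathbf{v})}{kT}\, v_i v_j\, \frac{\frac{m v^2}{2} - \frac{3}{2} kT}{\sqrt{\frac{3}{2}\, n k T\, \frac{n}{m}}} = \delta_{ij} \sqrt{\frac{2kT}{3m}} . \tag{9.4.29b} $$

These scalar products and $\langle \hat\chi^{j} | v_i\, \hat\chi^{n,T} \rangle = \langle \hat\chi^{n,T} | v_i\, \hat\chi^{j} \rangle$ are the only finite scalar products which result from the flow term in the equation of motion.

We now proceed to analyze the equations of motion for the particle-number density, the energy density, and the velocity. In the equation of motion for the particle-number density, $\lambda \equiv 5$ in (9.4.28), there is a coupling to $a^i(\mathbf{k},\omega)$ due to the second term. As noted above, all the other scalar products vanish. The third term vanishes completely, since $\langle \hat\chi^{n} | v_i\, \hat\chi^{\mu} \rangle \propto \langle v_i | \hat\chi^{\mu} \rangle = 0$ for $\mu \geq 6$ owing to the orthonormalization. We thus find

$$ \omega\, \hat n(\mathbf{k},\omega) - k_i \sqrt{\frac{kT}{m}}\, \hat u^i(\mathbf{k},\omega) = 0 , \tag{9.4.30} $$

or, due to (9.4.18),

$$ \omega\, \delta n(\mathbf{k},\omega) - k_i\, n u_i(\mathbf{k},\omega) = 0 , \tag{9.4.30′} $$

or in real space

$$ \frac{\partial}{\partial t} n(\mathbf{x},t) + \nabla\cdot n \mathbf{u}(\mathbf{x},t) = 0 . \tag{9.4.30″} $$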

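This equation of motion is identical with the equation of continuity for the density, (9.3.24), except that here, $n(\mathbf{x},t)$ in the gradient term is replaced by $n$ because of the linearization. The equation of motion for the local temperature, making use of (9.4.28), (9.4.18a), and (9.4.29b), can be cast in the form

$$ \omega \sqrt{\frac{3n}{2kT}}\, k\, \delta T(\mathbf{k},\omega) - k_i \sqrt{\frac{2kT}{3m}}\, \frac{n u_i(\mathbf{k},\omega)}{\sqrt{n/m}} + i k_i k_j \sum_{\lambda'=1}^{5} \sum_{\mu=6}^{\infty} \langle \hat\chi^{4} | v_i\, \hat\chi^{\mu} \rangle \frac{1}{\omega_{\mu}} \langle \hat\chi^{\mu} | v_j\, \hat\chi^{\lambda'} \rangle\, a^{\lambda'}(\mathbf{k},\omega) = 0 . \tag{9.4.31} $$

In the sum over $\lambda'$, the term $\lambda' = 5$ makes no contribution, since $\langle \hat\chi^{\mu} | v_j\, \hat\chi^{5} \rangle \propto \langle \hat\chi^{\mu} | v_j \rangle = 0$. Due to the fact that $\hat\chi^{4}$ transforms as a scalar, $\hat\chi^{\mu}$ must transform like $v_i$, so that due to the second factor, $\hat\chi^{\lambda'} = \hat\chi^{i}$ also makes no contribution, leaving only $\hat\chi^{\lambda'} = \hat\chi^{4}$. Finally, only the following expression remains from the third term of Eq. (9.4.31):

$$ i k_i k_j \sum_{\mu=6}^{\infty} \langle \hat\chi^{4} | v_i\, \hat\chi^{\mu} \rangle \frac{1}{\omega_{\mu}} \langle \hat\chi^{\mu} | v_j\, \hat\chi^{4} \rangle\, a^4(\mathbf{k},\omega) \approx i k_i k_j \tau \sum_{\mu=6}^{\infty} \langle \hat\chi^{4} | v_i\, \hat\chi^{\mu} \rangle \langle \hat\chi^{\mu} | v_j\, \hat\chi^{4} \rangle\, a^4(\mathbf{k},\omega) $$

$$ = i k_i k_j \tau \left[\langle \hat\chi^{4} | v_i v_j\, \hat\chi^{4} \rangle - \sum_{\lambda=1}^{5} \langle \hat\chi^{4} | v_i\, \hat\chi^{\lambda} \rangle \langle \hat\chi^{\lambda} | v_j\, \hat\chi^{4} \rangle\right] a^4(\mathbf{k},\omega) $$

$$ = i k_i k_j \tau \left[\langle \hat\chi^{4} | v_i v_j\, \hat\chi^{4} \rangle - \langle \hat\chi^{4} | v_i\, \hat\chi^{i} \rangle \langle \hat\chi^{i} | v_j\, \hat\chi^{4} \rangle\right] a^4(\mathbf{k},\omega) . \tag{9.4.32} $$

In this expression, all the $\omega_{\mu}^{-1}$ were replaced by the collision time, $\omega_{\mu}^{-1} = \tau$, and we have employed the completeness relation for the eigenfunctions of $L$ as well as the symmetry properties. We now have

$$ \langle \hat\chi^{4} | v_i\, \hat\chi^{i} \rangle = \sqrt{\frac{2kT}{3m}} , \tag{9.4.33a} $$

where here, we do not sum over $i$, and

$$ \langle \hat\chi^{4} | v_i v_j\, \hat\chi^{4} \rangle = \delta_{ij}\, \frac{1}{3} \int d^3v\, f^0(\mathbf{v})\, \mathbf{v}^2\, \frac{\left(\frac{m v^2}{2}\right)^2 - 3kT\, \frac{m v^2}{2} + \left(\frac{3}{2} kT\right)^2}{\frac{3}{2}\, n (kT)^2} = \delta_{ij}\, \frac{7kT}{3m} . \tag{9.4.33b} $$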


Thus the third term in Eq. (9.4.31) becomes $i k^2 D \sqrt{3n/2kT}\; k\, \delta T$, with the coefficient

$$ D \equiv \frac{5}{3} \frac{kT\, \tau}{m} = \frac{\kappa}{m c_v} , \tag{9.4.34} $$

where

$$ c_v = \frac{3}{2}\, n k \tag{9.4.35} $$

is the specific heat at constant volume, and

$$ \kappa = \frac{5}{2}\, n k^2 T \tau \tag{9.4.36} $$

refers to the heat conductivity. All together, using (9.4.32)–(9.4.34), we obtain for the equation of motion (9.4.31) of the local temperature

$$ \omega\, a^4(\mathbf{k},\omega) - k_i \sqrt{\frac{2kT}{3m}}\, a^i(\mathbf{k},\omega) + i k^2 D\, a^4(\mathbf{k},\omega) = 0 , \tag{9.4.37} $$

or

$$ \omega\, \delta T - \frac{2T}{3n}\, \mathbf{k}\cdot n\mathbf{u} + i k^2 D\, \delta T = 0 , \tag{9.4.37′} $$

or in real space,

$$ \frac{\partial}{\partial t} T(\mathbf{x},t) + \frac{2T}{3n} \nabla\cdot n\mathbf{u}(\mathbf{x},t) - D \nabla^2 T(\mathbf{x},t) = 0 . \tag{9.4.37″} $$

Connection with phenomenological considerations: The time variation of the quantity of heat $\delta Q$ is

$$ \delta \dot Q = -\nabla\cdot\mathbf{j}_Q \tag{9.4.38a} $$

with the heat current density $\mathbf{j}_Q$. In local equilibrium, the thermodynamic relation

$$ \delta Q = c_P\, \delta T \tag{9.4.38b} $$

holds. Here, the specific heat at constant pressure appears, because heat diffusion is isobaric owing to $c_s k \gg D_s k^2$ in the limit of small wavenumbers with the velocity of sound $c_s$ and the thermal diffusion constant $D_s$. The heat current flows in the direction of decreasing temperature, which implies

$$ \mathbf{j}_Q = -\frac{\kappa}{m} \nabla T \tag{9.4.38c} $$

with the thermal conductivity $\kappa$. Overall, we thus obtain

$$ \frac{d}{dt} T = \frac{\kappa}{m c_P} \nabla^2 T , \tag{9.4.38d} $$

a diffusion equation for the temperature.


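Finally, we determine the equation of motion of the momentum density, i.e. for $a^j$, $j = 1, 2, 3$. For the reversible terms (the first and second terms in Eq. (9.4.28)), we find by employing (9.4.18b–c) and $\langle \hat\chi^{j} | v_i\, \hat\chi^{j} \rangle = 0$ the result

$$ \omega\, a^j(\mathbf{k},\omega) - k_i \left[\langle \hat\chi^{j} | v_i\, \hat\chi^{5} \rangle\, a^5(\mathbf{k},\omega) + \langle \hat\chi^{j} | v_i\, \hat\chi^{4} \rangle\, a^4(\mathbf{k},\omega)\right] $$

$$ = \sqrt{\frac{m}{n}} \left[\omega\, n u_j(\mathbf{k},\omega) - k_j \frac{kT}{m}\, \delta n(\mathbf{k},\omega) - k_j \frac{n}{m}\, k\, \delta T(\mathbf{k},\omega)\right] = \sqrt{\frac{m}{n}} \left[\omega\, n u_j(\mathbf{k},\omega) - \frac{1}{m}\, k_j\, \delta P(\mathbf{k},\omega)\right] , \tag{9.4.39} $$

where, from $P(\mathbf{x},t) = n(\mathbf{x},t)\, k\, T(\mathbf{x},t) = \left(n + \delta n(\mathbf{x},t)\right) k \left(T + \delta T(\mathbf{x},t)\right)$, it follows that $\delta P = n k\, \delta T + kT\, \delta n$, which was used above. For the damping term in the equation of motion of the momentum density, we obtain from (9.4.28) using the approximation $\omega_{\mu} = 1/\tau$ the result:

$$ i k_i k_l \sum_{\lambda'=1}^{5} \sum_{\mu=6}^{\infty} \langle \hat\chi^{j} | v_i\, \hat\chi^{\mu} \rangle \frac{1}{\omega_{\mu}} \langle \hat\chi^{\mu} | v_l\, \hat\chi^{\lambda'} \rangle\, a^{\lambda'}(\mathbf{k},\omega) = i k_i k_l \sum_{\mu=6}^{\infty} \left\langle \frac{v_j}{\sqrt{n/m}}\, v_i \Big| \hat\chi^{\mu} \right\rangle \frac{1}{\omega_{\mu}} \left\langle \hat\chi^{\mu} \Big| v_l \frac{v_r}{\sqrt{n/m}} \right\rangle a^r(\mathbf{k},\omega) $$

$$ \approx i k_i k_l \tau \left\{ \left\langle \frac{v_j}{\sqrt{n/m}}\, v_i \Big| v_l \frac{v_r}{\sqrt{n/m}} \right\rangle - \sum_{\lambda=1}^{5} \left\langle \frac{v_j}{\sqrt{n/m}}\, v_i \Big| \hat\chi^{\lambda} \right\rangle \left\langle \hat\chi^{\lambda} \Big| v_l \frac{v_r}{\sqrt{n/m}} \right\rangle \right\} a^r(\mathbf{k},\omega) . \tag{9.4.40} $$

In the second line, we have used the fact that the sum over $\lambda'$ reduces to $r = 1, 2, 3$. For the first term in the curved brackets we obtain:

$$ \left\langle \frac{v_j}{\sqrt{n/m}}\, v_i \Big| v_l \frac{v_r}{\sqrt{n/m}} \right\rangle = \frac{m}{n k T} \int d^3v\, f^0(\mathbf{v})\, v_j v_i v_l v_r = \frac{kT}{m} \left(\delta_{ji} \delta_{lr} + \delta_{jl} \delta_{ir} + \delta_{jr} \delta_{il}\right) . $$

For the second term in the curved brackets in (9.4.40), we need the results of problem 9.12, leading to $\delta_{ij} \delta_{lr} \frac{5kT}{3m}$. As a result, the overall damping term (9.4.40) is given by

$$ i k_i k_l \sum_{\lambda'=1}^{5} \sum_{\mu=6}^{\infty} \langle \hat\chi^{j} | v_i\, \hat\chi^{\mu} \rangle \frac{1}{\omega_{\mu}} \langle \hat\chi^{\mu} | v_l\, \hat\chi^{\lambda'} \rangle\, a^{\lambda'}(\mathbf{k},\omega) = i k_i k_l \tau \frac{kT}{m} \left(\delta_{ji} \delta_{lr} + \delta_{jl} \delta_{ir} + \delta_{jr} \delta_{il} - \frac{5}{3} \delta_{ij} \delta_{lr}\right) a^r(\mathbf{k},\omega) $$

$$ = i \left[-\frac{2}{3}\, k_j k_l u_l(\mathbf{k},\omega) + k_i k_j u_i(\mathbf{k},\omega) + k_i k_i u_j(\mathbf{k},\omega)\right] \tau kT \sqrt{\frac{n}{m}} = i \left[\frac{1}{3}\, k_j\, \mathbf{k}\cdot\mathbf{u}(\mathbf{k},\omega) + k^2 u_j(\mathbf{k},\omega)\right] \tau kT \sqrt{\frac{n}{m}} . \tag{9.4.40′} $$

Defining the shear viscosity as

$$ \eta \equiv n \tau k T , \tag{9.4.41} $$

we find with (9.4.39) and (9.4.40′) the following equivalent forms of the equation of motion for the momentum density:

$$ \omega\, n u_j(\mathbf{k},\omega) - \frac{1}{m}\, k_j\, \delta P(\mathbf{k},\omega) + i \frac{\eta}{m} \left[\frac{1}{3}\, k_j \left(\mathbf{k}\cdot\mathbf{u}(\mathbf{k},\omega)\right) + k^2 u_j(\mathbf{k},\omega)\right] = 0 , \tag{9.4.42} $$

or, in terms of space and time,

$$ \frac{\partial}{\partial t}\, m n u_j(\mathbf{x},t) + \nabla_j P(\mathbf{x},t) - \eta \left[\frac{1}{3} \nabla_j \left(\nabla\cdot\mathbf{u}(\mathbf{x},t)\right) + \nabla^2 u_j(\mathbf{x},t)\right] = 0 \tag{9.4.42′} $$

or

$$ \frac{\partial}{\partial t}\, m n u_j(\mathbf{x},t) + P_{jk,k}(\mathbf{x},t) = 0 \tag{9.4.42″} $$

with the pressure tensor ($P_{jk,k} \equiv \nabla_k P_{jk}$, etc.)

$$ P_{jk}(\mathbf{x},t) = \delta_{jk} P(\mathbf{x},t) - \eta \left[u_{j,k}(\mathbf{x},t) + u_{k,j}(\mathbf{x},t) - \frac{2}{3} \delta_{jk}\, u_{l,l}(\mathbf{x},t)\right] . \tag{9.4.43} $$

We can compare this result with the general pressure tensor of hydrodynamics:

$$ P_{jk}(\mathbf{x},t) = \delta_{jk} P(\mathbf{x},t) - \eta \left[u_{j,k}(\mathbf{x},t) + u_{k,j}(\mathbf{x},t) - \frac{2}{3} \delta_{jk}\, u_{l,l}(\mathbf{x},t)\right] - \zeta\, \delta_{jk}\, u_{l,l}(\mathbf{x},t) . \tag{9.4.44} $$

Here, $\zeta$ is the bulk viscosity, also called the compressional viscosity. Comparing (9.4.43) with Eq. (9.4.44), we see that the bulk viscosity vanishes according to the Boltzmann equation for simple monatomic gases. The expression (9.4.41) for the viscosity can also be written in the following form (see Eqns. (9.2.12) and (9.2.13)):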


$$ \eta = \tau n k T = \tau n \frac{m v_{th}^2}{3} = \frac{1}{3}\, n m v_{th}\, l = \frac{m v_{th}}{3 \sigma_{tot}} , \tag{9.4.45} $$

where $v_{th} = \sqrt{3kT/m}$ is the thermal velocity from the Maxwell distribution; i.e. the viscosity is independent of the density.

It is instructive to write the hydrodynamic equations in terms of the normalized functions $\hat n = n/\sqrt{n/kT}$, etc. instead of the usual quantities $n(\mathbf{x},t)$, $T(\mathbf{x},t)$, $u_i(\mathbf{x},t)$. From Eqns. (9.4.30), (9.4.37), and (9.4.42) it follows that

$$ \dot{\hat n}(\mathbf{x},t) = -c_n \nabla_i \hat u^i(\mathbf{x},t) \tag{9.4.46a} $$

$$ \dot{\hat T}(\mathbf{x},t) = -c_T \nabla_i \hat u^i(\mathbf{x},t) + D \nabla^2 \hat T(\mathbf{x},t) \tag{9.4.46b} $$

$$ \dot{\hat u}_i(\mathbf{x},t) = -c_n \nabla_i \hat n - c_T \nabla_i \hat T + \frac{\eta}{mn} \nabla^2 \hat u_i + \frac{\eta}{3mn} \nabla_i (\nabla\cdot\hat{\mathbf{u}}) \tag{9.4.46c} $$

with the coefficients $c_n = \sqrt{kT/m}$, $c_T = \sqrt{2kT/3m}$, $D$ and $\eta$ from Eqns. (9.4.34) and (9.4.41). Note that with the orthonormalized quantities, the coupling of the degrees of freedom in the equations of motion is symmetric.

9.4.5 Solutions of the Hydrodynamic Equations

The periodic solutions of (9.4.46a–c), which can be found using the ansatz $\hat n(\mathbf{x},t) \propto \hat u_i(\mathbf{x},t) \propto \hat T(\mathbf{x},t) \propto e^{i(\mathbf{k}\cdot\mathbf{x} - \omega t)}$, are particularly interesting. The acoustic resonances which follow from the resulting secular determinant and the thermal diffusion modes have the frequencies

$$ \omega = \pm c_s k - \frac{i}{2} D_s k^2 \tag{9.4.47a} $$

$$ \omega = -i D_T k^2 \tag{9.4.47b} $$

with the sound velocity $c_s$, the acoustic attenuation constant $D_s$, and the heat diffusion constant (thermal diffusivity) $D_T$

$$ c_s = \sqrt{c_n^2 + c_T^2} = \sqrt{\frac{5}{3} \frac{kT}{m}} \equiv \frac{1}{\sqrt{m n \kappa_s}} \tag{9.4.48a} $$

$$ D_s = \frac{4\eta}{3mn} + \frac{\kappa}{mn} \left(\frac{1}{c_v} - \frac{1}{c_P}\right) \tag{9.4.48b} $$

$$ D_T = D \frac{c_v}{c_P} = \frac{\kappa}{m c_P} . \tag{9.4.48c} $$

In this case, the specific heat at constant pressure enters; for an ideal gas, it is given by

$$ c_P = \frac{5}{2}\, n k . \tag{9.4.49} $$
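The mode frequencies (9.4.47a,b) can be compared with a direct numerical solution of the longitudinal part of (9.4.46a–c). The sketch below inserts the plane-wave ansatz, which turns the three coupled equations into an eigenvalue problem for $\omega$. The units ($m = k = 1$), the sample values of $n$, $T$, $\tau$, and the function name `longitudinal_matrix` are assumptions for illustration. The attenuation constant is written here in terms of $D$ from (9.4.34) as $D_s = 4\eta/3mn + D\,(1 - c_v/c_P)$, which is what first-order perturbation theory in $k$ gives for this matrix; identifying it with (9.4.48b) is an assumption of this sketch.

```python
import numpy as np

# Illustrative parameters (assumptions, units with m = k = 1)
m = k_B = 1.0
n, T, tau = 1.0, 1.0, 0.1
kT = k_B * T

c_n = np.sqrt(kT / m)                 # coefficients of (9.4.46a-c)
c_T = np.sqrt(2 * kT / (3 * m))
eta = n * tau * kT                    # shear viscosity (9.4.41)
D = 5 * kT * tau / (3 * m)            # coefficient D of (9.4.34)
cv_over_cp = 3 / 5                    # ideal monatomic gas, (9.4.35) and (9.4.49)

def longitudinal_matrix(k):
    """Matrix M with omega * a = M a for a = (n_hat, u_hat_parallel, T_hat)."""
    nu_l = 4 * eta / (3 * m * n)      # longitudinal viscous damping from (9.4.46c)
    return np.array(
        [[0.0, c_n * k, 0.0],
         [c_n * k, -1j * nu_l * k**2, c_T * k],
         [0.0, c_T * k, -1j * D * k**2]],
        dtype=complex,
    )

k = 1e-2                              # small wavenumber: hydrodynamic regime
omega = np.sort_complex(np.linalg.eigvals(longitudinal_matrix(k)))
# After sorting by real part: omega[0] and omega[2] are the sound modes,
# omega[1] is the heat diffusion mode.

c_s = np.sqrt(c_n**2 + c_T**2)                        # (9.4.48a)
D_s = 4 * eta / (3 * m * n) + D * (1 - cv_over_cp)    # damping of the sound modes
D_T = D * cv_over_cp                                  # (9.4.48c)
```

For small $k$ the three eigenvalues approach $\pm c_s k - \frac{i}{2} D_s k^2$ and $-i D_T k^2$; the deviations are of higher order in $k$.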


The two transverse components of the momentum density undergo a purely diffusive shearing motion:

$$ D_{\eta} = \frac{\eta k^2}{mn} . \tag{9.4.50} $$

The resonances (9.4.47a,b) express themselves for example in the density-density correlation function, $S_{nn}(\mathbf{k},\omega)$. The calculation of dynamic susceptibilities and correlation functions (problem 9.11) starting from equations of motion with damping terms is described in QM II, Sect. 4.7. The coupled system of hydrodynamic equations of motion for the density, the temperature, and the longitudinal momentum density yields the density-density correlation function:

$$ S_{nn}(\mathbf{k},\omega) = 2kT\, n \left(\frac{\partial n}{\partial P}\right)_T \left\{ \frac{\frac{c_v}{c_P} (c_s k)^2 D_s k^2 + \left(1 - \frac{c_v}{c_P}\right) (\omega^2 - c_s^2 k^2) D_T k^2}{(\omega^2 - c_s^2 k^2)^2 + (\omega D_s k^2)^2} + \frac{\left(1 - \frac{c_v}{c_P}\right) D_T k^2}{\omega^2 + (D_T k^2)^2} \right\} . \tag{9.4.51} $$

The density-density correlation function for fixed $k$ is shown schematically as a function of $\omega$ in Fig. 9.3.

Fig. 9.3. The density-density correlation function for fixed $k$ as a function of $\omega$

The positions of the resonances are determined by the real parts and their widths by the imaginary parts of the frequencies (9.4.47a, b). In addition to the two resonances representing longitudinal acoustic phonons at $\pm c_s k$, one finds a resonance at $\omega = 0$ related to heat diffusion. The area below the curve shown in Fig. 9.3, which determines the overall intensity in inelastic scattering experiments, is proportional to the isothermal compressibility $\left(\frac{\partial n}{\partial P}\right)_T$. The relative strength of the diffusion compared to the two acoustic resonances is given by the ratio of the specific heats, $\frac{c_P - c_V}{c_V}$. This ratio is also called the Landau–Placzek ratio, and the diffusive resonance in $S_{nn}(\mathbf{k},\omega)$ is the Landau–Placzek peak.
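The Landau–Placzek ratio can be read off numerically from the bracket in (9.4.51). The following sketch, under assumed illustrative values for $c_s k$, $D_s k^2$ and $D_T k^2$ (in arbitrary frequency units) and with hypothetical helper names, integrates the central (heat diffusion) term and the acoustic term separately over $\omega$ and forms their ratio.

```python
import numpy as np

def s_nn_parts(omega, cs_k, gamma_s, gamma_t, cv_over_cp):
    """Acoustic and diffusive terms of the bracket in (9.4.51).

    gamma_s stands for D_s k^2 and gamma_t for D_T k^2.
    """
    detuning = omega**2 - cs_k**2
    acoustic = (cv_over_cp * cs_k**2 * gamma_s
                + (1 - cv_over_cp) * detuning * gamma_t) / (detuning**2 + (omega * gamma_s) ** 2)
    diffusive = (1 - cv_over_cp) * gamma_t / (omega**2 + gamma_t**2)
    return acoustic, diffusive

def trapezoid(y, x):
    """Plain trapezoidal rule, written out to avoid depending on a NumPy version."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

# Illustrative values (assumptions): well separated, narrow resonances
omega = np.linspace(-500.0, 500.0, 1_000_001)
acoustic, diffusive = s_nn_parts(omega, cs_k=1.0, gamma_s=0.05, gamma_t=0.03,
                                 cv_over_cp=3 / 5)

ratio = trapezoid(diffusive, omega) / trapezoid(acoustic, omega)
```

For the ideal-gas value $c_v/c_P = 3/5$ used here, `ratio` comes out close to $(c_P - c_V)/c_V = 2/3$; the small remaining deviation stems from cutting off the slowly decaying Lorentzian tails at finite $\omega$.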


Since the specific heat at constant pressure diverges as $(T - T_c)^{-\gamma}$, while that at constant volume diverges only as $(T - T_c)^{-\alpha}$ (p. 256, p. 255), this ratio becomes increasingly large on approaching $T_c$. The expression (9.4.51), valid in the limit of small $k$ (scattering in the forward direction), exhibits the phenomenon of critical opalescence, as a result of $(\partial n/\partial P)_T \propto (T - T_c)^{-\gamma}$.

∗9.5 Supplementary Remarks

9.5.1 Relaxation-Time Approximation

The general evaluation of the eigenvalues and eigenfunctions of the linear collision operator is complicated. On the other hand, since not all the eigenfunctions contribute to a particular diffusion process and certainly the ones with the largest weight are those whose eigenvalues $\omega_\lambda$ are especially small, we can as an approximation attempt to characterize the collision term through only one characteristic frequency,

$$\left(\frac{\partial}{\partial t} + \mathbf{v}\nabla\right) f(\mathbf{x},\mathbf{v},t) = -\frac{1}{\tau}\left(f(\mathbf{x},\mathbf{v},t) - f^{\ell}(\mathbf{x},\mathbf{v},t)\right) . \tag{9.5.1}$$

This approximation is called the conserved relaxation time approximation, since the right-hand side represents the difference between the distribution function and a local Maxwell distribution $f^{\ell}$. This takes into account the fact that the collision term vanishes when the distribution function is equal to the local Maxwell distribution. The local quantities $n(\mathbf{x},t)$, $u_i(\mathbf{x},t)$ and $e(\mathbf{x},t)$ which occur in $f^{\ell}(\mathbf{x},\mathbf{v},t)$ can be calculated from $f(\mathbf{x},\mathbf{v},t)$ using Eqns. (9.3.15a), (9.3.15b), and (9.3.15c). Our goal is now to calculate $f$ or $f - f^{\ell}$. We write

$$\left(\frac{\partial}{\partial t} + \mathbf{v}\nabla\right)(f - f^{\ell}) + \left(\frac{\partial}{\partial t} + \mathbf{v}\nabla\right) f^{\ell} = -\frac{1}{\tau}(f - f^{\ell}) . \tag{9.5.2}$$

In the hydrodynamic region, $\omega\tau \ll 1$, $vk\tau \ll 1$, we can neglect the first term on the left-hand side of (9.5.2) compared to the term on the right side, obtaining $f - f^{\ell} = -\tau\left(\frac{\partial}{\partial t} + \mathbf{v}\nabla\right) f^{\ell}$. Therefore, the distribution function has the form

$$f = f^{\ell} - \tau\left(\frac{\partial}{\partial t} + \mathbf{v}\nabla\right) f^{\ell} , \tag{9.5.3}$$

and, using this result, one can again calculate the current densities, in an extension of Sect. 9.3.5.2. In zeroth order, we obtain the expressions found in (9.3.35) and (9.3.36) for the reversible parts of the pressure tensor and the remaining current densities. The second term gives additional contributions to the pressure tensor, and also yields a finite heat current. Since $f^{\ell}$ depends


on $\mathbf{x}$ and $t$ only through the three functions $n(\mathbf{x},t)$, $T(\mathbf{x},t)$, and $\mathbf{u}(\mathbf{x},t)$, the second term depends on these and their derivatives. The time derivatives of $f^{\ell}$, or of $n$, $T$, and $\mathbf{u}$, can be replaced by the zeroth-order equations of motion. The corrections therefore are of the form $\nabla n(\mathbf{x},t)$, $\nabla T(\mathbf{x},t)$, and $\nabla u_i(\mathbf{x},t)$. Along with the derivatives of $P_{ij}$ and $\mathbf{q}$ which already occur in the equations of motion, the additional terms in the equations are of the type $\tau\nabla^2 T(\mathbf{x},t)$ etc. (see problem 9.13).

9.5.2 Calculation of $W(\mathbf{v}_1, \mathbf{v}_2; \mathbf{v}_1', \mathbf{v}_2')$

The general results of the Boltzmann equation did not depend on the precise form of the collision probability, but instead only the general relations (9.2.8a–f) were required. For completeness, we give the relation between $W(\mathbf{v}_1, \mathbf{v}_2; \mathbf{v}_1', \mathbf{v}_2')$ and the scattering cross-section for two particles$^{18}$. It is assumed that the two colliding particles interact via a central potential $w(\mathbf{x}_1 - \mathbf{x}_2)$. We treat the scattering process $\mathbf{v}_1, \mathbf{v}_2 \Rightarrow \mathbf{v}_1', \mathbf{v}_2'$, in which particles 1 and 2, with velocities $\mathbf{v}_1$ and $\mathbf{v}_2$ before the collision, are left with the velocities $\mathbf{v}_1'$ and $\mathbf{v}_2'$ following the collision (see Fig. 9.4). The conservation laws for momentum and energy apply; owing to the equality of the two masses, they are given by

$$\mathbf{v}_1 + \mathbf{v}_2 = \mathbf{v}_1' + \mathbf{v}_2' \tag{9.5.4a}$$

$$\mathbf{v}_1^2 + \mathbf{v}_2^2 = \mathbf{v}_1'^2 + \mathbf{v}_2'^2 . \tag{9.5.4b}$$

Fig. 9.4. The collision of two particles

18 The theory of scattering in classical mechanics is given for example in L. D. Landau and E. M. Lifshitz, Course of Theoretical Physics, Vol. I: Mechanics, 3rd Ed. (Butterworth–Heinemann, London 1976), or H. Goldstein, Classical Mechanics, 2nd Ed. (Addison–Wesley, New York 1980).


It is expedient to introduce the center-of-mass and relative velocities; before the collision, they are

$$\mathbf{V} = \tfrac{1}{2}(\mathbf{v}_1 + \mathbf{v}_2) , \qquad \mathbf{u} = \mathbf{v}_1 - \mathbf{v}_2 , \tag{9.5.5a}$$

and after the collision,

$$\mathbf{V}' = \tfrac{1}{2}(\mathbf{v}_1' + \mathbf{v}_2') , \qquad \mathbf{u}' = \mathbf{v}_1' - \mathbf{v}_2' . \tag{9.5.5b}$$

Expressed in terms of these velocities, the two conservation laws have the form

$$\mathbf{V} = \mathbf{V}' \tag{9.5.6a}$$

and

$$|\mathbf{u}| = |\mathbf{u}'| . \tag{9.5.6b}$$

In order to recognize the validity of (9.5.6b), one need only subtract the square of (9.5.4a) from two times Eq. (9.5.4b). The center-of-mass velocity does not change as a result of the collision, and the (asymptotic) relative velocity does not change its magnitude, but it is rotated in space. For the velocity transformations to the center-of-mass frame before and after the collision given in (9.5.5a) and (9.5.5b), the volume elements in velocity space obey the relations

$$d^3v_1\, d^3v_2 = d^3V\, d^3u = d^3V'\, d^3u' = d^3v_1'\, d^3v_2' \tag{9.5.7}$$

due to the fact that the Jacobians have unit value.

The scattering cross-section can be most simply computed in the center-of-mass frame. As is known from classical mechanics,$^{18}$ the relative coordinate $\mathbf{x}$ obeys an equation of motion in which the mass takes the form of a reduced mass $\mu$ (here $\mu = \tfrac{1}{2}m$) and the potential enters as a central potential $w(\mathbf{x})$. Hence, one obtains the scattering cross-section in the center-of-mass frame from the scattering of a fictitious particle of mass $\mu$ by the potential $w(\mathbf{x})$. We first write down the velocities of the two particles in the center-of-mass frame before and after the collision:

$$\mathbf{v}_{1s} = \mathbf{v}_1 - \mathbf{V} = \tfrac{1}{2}\mathbf{u} , \qquad \mathbf{v}_{2s} = -\tfrac{1}{2}\mathbf{u} , \qquad \mathbf{v}_{1s}' = \tfrac{1}{2}\mathbf{u}' , \qquad \mathbf{v}_{2s}' = -\tfrac{1}{2}\mathbf{u}' . \tag{9.5.8}$$
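The kinematics of (9.5.4)–(9.5.8) can be checked with a few lines of code. The sketch below is only an illustration under the stated assumption of equal masses: it takes the two incoming velocities and an arbitrarily chosen direction for $\mathbf{u}'$ (the function name `collide` and its `direction` argument are hypothetical), keeps $\mathbf{V}$ and $|\mathbf{u}|$ fixed, and returns the outgoing velocities.

```python
import math


def collide(v1, v2, direction):
    """Outgoing velocities of two equal-mass particles.

    The center-of-mass velocity V is unchanged and the relative velocity u
    keeps its magnitude but is rotated into the given direction.
    """
    V = [(a + b) / 2.0 for a, b in zip(v1, v2)]          # Eq. (9.5.5a)
    u = [a - b for a, b in zip(v1, v2)]
    u_mag = math.sqrt(sum(c * c for c in u))
    norm = math.sqrt(sum(c * c for c in direction))
    u_prime = [u_mag * c / norm for c in direction]       # |u'| = |u|
    v1_prime = [Vc + 0.5 * uc for Vc, uc in zip(V, u_prime)]  # Eq. (9.5.8)
    v2_prime = [Vc - 0.5 * uc for Vc, uc in zip(V, u_prime)]
    return v1_prime, v2_prime


def squared(v):
    """Squared magnitude of a velocity vector."""
    return sum(c * c for c in v)


if __name__ == "__main__":
    v1, v2 = [1.0, 2.0, 3.0], [-1.0, 0.5, 2.0]
    v1p, v2p = collide(v1, v2, [0.0, 0.0, 1.0])
    print("momentum before:", [a + b for a, b in zip(v1, v2)])
    print("momentum after: ", [a + b for a, b in zip(v1p, v2p)])
    print("energy before:", squared(v1) + squared(v2))
    print("energy after: ", squared(v1p) + squared(v2p))
```

Which direction $\mathbf{u}'$ actually takes is determined by the potential and the collision parameter; here it is simply a free input.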

We now recall some concepts from scattering theory. The equivalent potential scattering problem is represented in Fig. 9.5, and we can use it to deﬁne the scattering cross-section. The orbital plane of the particle is determined by the asymptotic incoming velocity u and position of the scattering center O. This follows from the conservation of angular momentum in the central potential. The z-axis of the coordinate system drawn in Fig. 9.5 passes


through the scattering center O and is taken to be parallel to $\mathbf{u}$. The orbit of the incoming particle is determined by the collision parameter $s$ and the angle $\varphi$. In Fig. 9.5, the orbital plane which is defined by the angle $\varphi$ lies in the plane of the page. We consider a uniform beam of particles arriving at various distances $s$ from the axis with the asymptotic incoming velocity $\mathbf{u}$. The intensity $I$ of this beam is defined as the number of particles which impinge per second on one cm$^2$ of the perpendicular surface shown. Letting $n$ be the number of particles per cm$^3$, then $I = n|\mathbf{u}|$. The particles which impinge upon the surface element defined by the collision parameters $s$ and $s + ds$ and the differential element of angle $d\varphi$ are deflected into the solid-angle element $d\Omega$. The number of particles arriving in $d\Omega$ per unit time is denoted by $dN(\Omega)$. The differential scattering cross-section $\sigma(\Omega, u)$, which of course also depends upon $u$, is defined by $dN(\Omega) = I\sigma(\Omega, u)\,d\Omega$, or

$$\sigma(\Omega, u) = I^{-1}\,\frac{dN(\Omega)}{d\Omega} . \tag{9.5.9}$$

Fig. 9.5. Scattering by a fixed potential, with collision parameter $s$ and scattering center O. The particles which impinge on the surface element $s\,ds\,d\varphi$ are deflected into the solid angle element $d\Omega$

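Owing to the cylindrical symmetry of the beam around the $z$-axis, $\sigma(\Omega, u) = \sigma(\vartheta, u)$ is independent of $\varphi$. The scattering cross-section in the center-of-mass system is obtained by making the replacement $u = |\mathbf{v}_1 - \mathbf{v}_2|$. The collision parameter $s$ uniquely determines the orbital curve, and therefore the scattering angle:

$$dN(\Omega) = I\,s\,d\varphi\,(-ds) . \tag{9.5.10}$$

From this it follows using $d\Omega = \sin\vartheta\, d\vartheta\, d\varphi$ that

$$\sigma(\Omega, u) = -\frac{1}{\sin\vartheta}\, s\,\frac{ds}{d\vartheta} = -\frac{1}{\sin\vartheta}\,\frac{1}{2}\,\frac{ds^2}{d\vartheta} . \tag{9.5.11}$$

From $\vartheta(s)$ or $s(\vartheta)$, we obtain the scattering cross-section. The scattering angle $\vartheta$ and the asymptotic angle $\varphi_a$ are related by

$$\vartheta = \pi - 2\varphi_a \qquad \text{or} \qquad \varphi_a = \tfrac{1}{2}(\pi - \vartheta) \tag{9.5.12}$$

(cf. Fig. 9.6).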


Fig. 9.6. The scattering angle (deﬂection angle) ϑ and the asymptotic angle ϕa

In classical mechanics, the conservation laws for energy and angular momentum give

$$\varphi_a = \int_{r_{\min}}^{\infty} dr\, \frac{l/r^2}{\sqrt{2\mu\left(E - w(r)\right) - \dfrac{l^2}{r^2}}} = \int_{r_{\min}}^{\infty} dr\, \frac{s/r^2}{\sqrt{1 - \dfrac{s^2}{r^2} - \dfrac{2w(r)}{\mu u^2}}} ; \tag{9.5.13}$$

here, we use

$$l = \mu s u \tag{9.5.14a}$$

to denote the angular momentum and

$$E = \frac{\mu}{2}\, u^2 \tag{9.5.14b}$$

for the energy, expressed in terms of the asymptotic velocity. The distance $r_{\min}$ of closest approach to the scattering center is determined from the condition ($\dot r = 0$):

$$w(r_{\min}) + \frac{l^2}{2\mu r_{\min}^2} = E . \tag{9.5.14c}$$

As an example, we consider the scattering of two hard spheres of radius $R$. In this case, we have

$$s = 2R\sin\varphi_a = 2R\sin\left(\frac{\pi}{2} - \frac{\vartheta}{2}\right) = 2R\cos\frac{\vartheta}{2} ,$$

from which, using (9.5.11), we find

$$\sigma(\vartheta, u) = R^2 . \tag{9.5.15}$$
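The step from $s(\vartheta)$ to $\sigma$ via (9.5.11) can be reproduced numerically. The following sketch is an illustration only: it differentiates $s^2(\vartheta)$ by a central finite difference and integrates the result over all solid angles with a midpoint rule. The function names, the step size `h`, and the number of integration steps are choices made here, not part of the text.

```python
import math


def impact_parameter_hard_sphere(theta, radius):
    """Collision parameter s(theta) = 2 R cos(theta / 2) for hard spheres."""
    return 2.0 * radius * math.cos(theta / 2.0)


def differential_cross_section(s_of_theta, theta, h=1e-5):
    """sigma = -(1 / sin(theta)) * (1/2) * d(s^2)/d(theta), Eq. (9.5.11)."""
    ds2 = (s_of_theta(theta + h) ** 2 - s_of_theta(theta - h) ** 2) / (2.0 * h)
    return -0.5 * ds2 / math.sin(theta)


def total_cross_section(s_of_theta, n_steps=2000):
    """Integrate sigma over the full solid angle (midpoint rule in theta)."""
    d_theta = math.pi / n_steps
    total = 0.0
    for i in range(n_steps):
        theta = (i + 0.5) * d_theta
        sigma = differential_cross_section(s_of_theta, theta)
        total += sigma * math.sin(theta) * d_theta
    return 2.0 * math.pi * total  # the azimuthal integral gives 2*pi


if __name__ == "__main__":
    R = 1.5

    def s(theta):
        return impact_parameter_hard_sphere(theta, R)

    print("sigma(theta=1):", differential_cross_section(s, 1.0), "R^2:", R ** 2)
    print("sigma_tot:", total_cross_section(s), "4 pi R^2:", 4 * math.pi * R ** 2)
```

Any other monotonic $s(\vartheta)$ can be passed in place of the hard-sphere expression, which is the reason the derivative is taken numerically rather than analytically.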

In this case, the scattering cross-section is independent of the deflection angle and of $u$, which is otherwise not the case, as is known for example from Rutherford scattering$^{18}$.

After this excursion into classical mechanics, we are in a position to calculate the transition probability $W(\mathbf{v}, \mathbf{v}_2; \mathbf{v}_3, \mathbf{v}_4)$ for the loss and gain processes in Eqns. (9.2.5) and (9.2.6). To calculate the loss rate, we recall the following assumptions:

(i) The forces are assumed to be short-ranged, so that only particles within the same volume element $d^3x_1$ will scatter each other.


(ii) When particle 1 is scattered, it leaves the velocity element $d^3v_1$.

To calculate the loss rate $l$, we pick out a molecule in $d^3x$ which has the velocity $\mathbf{v}_1$ and take it to be the scattering center on which molecule 2 with velocity $\mathbf{v}_2$ in the velocity element $d^3v_2$ impinges. The flux of such particles is $f(\mathbf{x}, \mathbf{v}_2, t)\,|\mathbf{v}_2 - \mathbf{v}_1|\,d^3v_2$. The number of particles which impinge on the surface element $(-s\,ds)\,d\varphi$ per unit time is

$$f(\mathbf{x}, \mathbf{v}_2, t)\,|\mathbf{v}_2 - \mathbf{v}_1|\,d^3v_2\,(-s\,ds)\,d\varphi = f(\mathbf{x}, \mathbf{v}_2, t)\,|\mathbf{v}_2 - \mathbf{v}_1|\,d^3v_2\,\sigma(\Omega, |\mathbf{v}_1 - \mathbf{v}_2|)\,d\Omega .$$

In order to obtain the number of collisions which the particles within $d^3x\,d^3v_1$ experience in the time interval $dt$, we have to multiply this result by $f(\mathbf{x}, \mathbf{v}_1, t)\,d^3x\,d^3v_1\,dt$ and then integrate over $\mathbf{v}_2$ and all deflection angles $d\Omega$:

$$l\,d^3x\,d^3v_1\,dt = \int d^3v_2 \int d\Omega\, f(\mathbf{x}, \mathbf{v}_1, t)\, f(\mathbf{x}, \mathbf{v}_2, t)\,|\mathbf{v}_2 - \mathbf{v}_1|\,\sigma(\Omega, |\mathbf{v}_1 - \mathbf{v}_2|)\,d^3x\,d^3v_1\,dt . \tag{9.5.16}$$

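To calculate the gain rate $g$, we consider scattering processes in which a molecule of given velocity $\mathbf{v}_1'$ is scattered into a state with velocity $\mathbf{v}_1$ by a collision with some other molecule:

$$g\,d^3x\,d^3v_1\,dt = \int d\Omega \int d^3v_1' \int d^3v_2'\, |\mathbf{v}_1' - \mathbf{v}_2'|\,\sigma(\Omega, |\mathbf{v}_1' - \mathbf{v}_2'|)\, f(\mathbf{x}, \mathbf{v}_1', t)\, f(\mathbf{x}, \mathbf{v}_2', t)\,d^3x\,dt . \tag{9.5.17}$$

The limits of the velocity integrals are chosen so that the velocity $\mathbf{v}_1$ lies within the element $d^3v_1$. Using (9.5.7), we obtain for the right side of (9.5.17)

$$d^3v_1 \int d^3v_2 \int d\Omega\, |\mathbf{v}_1 - \mathbf{v}_2|\,\sigma(\Omega, |\mathbf{v}_1 - \mathbf{v}_2|)\, f(\mathbf{x}, \mathbf{v}_1', t)\, f(\mathbf{x}, \mathbf{v}_2', t)\,d^3x\,dt ,$$

i.e.

$$g = \int d^3v_2 \int d\Omega\, |\mathbf{v}_1 - \mathbf{v}_2|\,\sigma(\Omega, |\mathbf{v}_1 - \mathbf{v}_2|)\, f(\mathbf{x}, \mathbf{v}_1', t)\, f(\mathbf{x}, \mathbf{v}_2', t) . \tag{9.5.18}$$

Here, we have also taken account of the fact that the scattering cross-section for the scattering of $\mathbf{v}_1', \mathbf{v}_2' \to \mathbf{v}_1, \mathbf{v}_2$ is equal to that for $\mathbf{v}_1, \mathbf{v}_2 \to \mathbf{v}_1', \mathbf{v}_2'$, since the two events can be transformed into one another by a reflection in space and time. As a result, we find for the total collision term:

$$\left(\frac{\partial f}{\partial t}\right)_{\text{coll}} = g - l = \int d^3v_2 \int d\Omega\, |\mathbf{v}_2 - \mathbf{v}_1|\,\sigma(\Omega, |\mathbf{v}_2 - \mathbf{v}_1|)\left(f_1' f_2' - f_1 f_2\right) . \tag{9.5.19}$$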


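The deflection angle $\vartheta$ can be expressed as follows in terms of the asymptotic relative velocities$^{18}$:

$$\vartheta = \arccos\frac{(\mathbf{v}_1 - \mathbf{v}_2)\cdot(\mathbf{v}_1' - \mathbf{v}_2')}{|\mathbf{v}_1 - \mathbf{v}_2|\,|\mathbf{v}_1' - \mathbf{v}_2'|} .$$

The integral $\int d\Omega$ refers to an integration over the direction of $\mathbf{u}'$. With the rearrangements

$$\begin{aligned}
\mathbf{u}'^2 - \mathbf{u}^2 &= \mathbf{v}_1'^2 - 2\mathbf{v}_1'\cdot\mathbf{v}_2' + \mathbf{v}_2'^2 - \mathbf{v}_1^2 + 2\mathbf{v}_1\cdot\mathbf{v}_2 - \mathbf{v}_2^2 \\
&= -4\mathbf{V}'^2 + 2\mathbf{v}_1'^2 + 2\mathbf{v}_2'^2 + 4\mathbf{V}^2 - 2\mathbf{v}_1^2 - 2\mathbf{v}_2^2 = 2\left(\mathbf{v}_1'^2 + \mathbf{v}_2'^2 - \mathbf{v}_1^2 - \mathbf{v}_2^2\right)
\end{aligned}$$

and

$$\begin{aligned}
\int d\Omega\, |\mathbf{v}_2 - \mathbf{v}_1| &= \int d\Omega\, u = \int du' \int d\Omega\, \delta(u' - u)\, u' \\
&= \int du'\, u'^2 \int d\Omega\, \delta\!\left(\frac{u'^2}{2} - \frac{u^2}{2}\right) \\
&= \int d^3u'\, \delta\!\left(\frac{u'^2}{2} - \frac{u^2}{2}\right) \int d^3V'\, \delta^{(3)}(\mathbf{V}' - \mathbf{V}) \\
&= 4 \int d^3v_1' \int d^3v_2'\, \delta\!\left(\frac{\mathbf{v}_1'^2 + \mathbf{v}_2'^2}{2} - \frac{\mathbf{v}_1^2 + \mathbf{v}_2^2}{2}\right) \delta^{(3)}(\mathbf{v}_1' + \mathbf{v}_2' - \mathbf{v}_1 - \mathbf{v}_2) ,
\end{aligned}$$

which also imply the conservation laws, we obtain

$$g - l = \int d^3v_2 \int d^3v_1' \int d^3v_2'\, W(\mathbf{v}_1, \mathbf{v}_2; \mathbf{v}_1', \mathbf{v}_2')\left(f_1' f_2' - f_1 f_2\right) . \tag{9.5.20}$$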
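The factor 4 in the rearrangement leading to (9.5.20) comes from rescaling the arguments of the two delta functions. A short check, using only the definitions (9.5.5a,b), the relation between $\mathbf{u}'^2 - \mathbf{u}^2$ and the particle velocities, and the unit Jacobian (9.5.7), might read:

```latex
\begin{align*}
\delta\!\left(\frac{u'^2}{2} - \frac{u^2}{2}\right)
  &= \delta\!\left(\mathbf{v}_1'^2 + \mathbf{v}_2'^2 - \mathbf{v}_1^2 - \mathbf{v}_2^2\right)
   = \frac{1}{2}\,\delta\!\left(\frac{\mathbf{v}_1'^2 + \mathbf{v}_2'^2}{2}
       - \frac{\mathbf{v}_1^2 + \mathbf{v}_2^2}{2}\right), \\
\delta^{(3)}(\mathbf{V}' - \mathbf{V})
  &= \delta^{(3)}\!\left(\tfrac{1}{2}\left(\mathbf{v}_1' + \mathbf{v}_2'
       - \mathbf{v}_1 - \mathbf{v}_2\right)\right)
   = 2^3\,\delta^{(3)}(\mathbf{v}_1' + \mathbf{v}_2' - \mathbf{v}_1 - \mathbf{v}_2), \\
d^3u'\, d^3V' &= d^3v_1'\, d^3v_2' , \qquad \tfrac{1}{2}\cdot 2^3 = 4 .
\end{align*}
```

The first line uses $\delta(2y) = \tfrac{1}{2}\delta(y)$ in one dimension, the second the corresponding three-dimensional scaling $\delta^{(3)}(\mathbf{y}/2) = 8\,\delta^{(3)}(\mathbf{y})$.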

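In this expression, we use

$$W(\mathbf{v}_1, \mathbf{v}_2; \mathbf{v}_1', \mathbf{v}_2') = 4\sigma(\Omega, |\mathbf{v}_2 - \mathbf{v}_1|)\, \delta\!\left(\frac{\mathbf{v}_1^2 + \mathbf{v}_2^2}{2} - \frac{\mathbf{v}_1'^2 + \mathbf{v}_2'^2}{2}\right) \delta^{(3)}(\mathbf{v}_1 + \mathbf{v}_2 - \mathbf{v}_1' - \mathbf{v}_2') . \tag{9.5.21}$$

Comparison with Eq. (9.2.8f) yields

$$\sigma(\mathbf{v}_1, \mathbf{v}_2; \mathbf{v}_1', \mathbf{v}_2') = 4m^4\, \sigma(\Omega, |\mathbf{v}_2 - \mathbf{v}_1|) . \tag{9.5.22}$$

From the loss term in (9.5.19), we can read off the total scattering rate for particles of velocity $\mathbf{v}_1$:

$$\frac{1}{\tau(\mathbf{x}, \mathbf{v}, t)} = \int d^3v_2 \int d\Omega\, |\mathbf{v}_2 - \mathbf{v}_1|\,\sigma(\Omega, |\mathbf{v}_2 - \mathbf{v}_1|)\, f(\mathbf{x}, \mathbf{v}_2, t) . \tag{9.5.23}$$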


The expression for $\tau^{-1}$ corresponds to the estimate in Eq. (9.2.12), which was derived by elementary considerations: $\tau^{-1} = n v_{th}\sigma_{tot}$, with

$$\sigma_{tot} = \int d\Omega\, \sigma(\Omega, |\mathbf{v}_2 - \mathbf{v}_1|) = 2\pi \int_0^{r_{\max}} ds\, s . \tag{9.5.24}$$

$r_{\max}$ is the distance from the scattering center for which the scattering angle goes to zero, i.e. for which no more scattering occurs. In the case of hard spheres, from Eq. (9.5.15) we have

$$\sigma_{tot} = 4\pi R^2 . \tag{9.5.25}$$
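To get a feeling for the magnitude of $\tau$, the estimate $\tau^{-1} = n v_{th}\sigma_{tot}$ can be evaluated for hard spheres. The sketch below is illustrative only: the choice $v_{th} = \sqrt{kT/m}$, the sphere radius, and the gas parameters in the example are assumptions made here, since this section does not fix them.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def total_cross_section_hard_spheres(radius):
    """sigma_tot = 4 pi R^2, Eq. (9.5.25)."""
    return 4.0 * math.pi * radius ** 2


def mean_collision_time(density, thermal_speed, radius):
    """tau from the estimate 1/tau = n * v_th * sigma_tot."""
    sigma_tot = total_cross_section_hard_spheres(radius)
    return 1.0 / (density * thermal_speed * sigma_tot)


if __name__ == "__main__":
    # Assumed illustrative numbers: an argon-like gas at 300 K and 1 atm.
    temperature = 300.0
    mass = 6.63e-26                              # kg, assumed
    radius = 1.8e-10                             # m, assumed hard-sphere radius
    density = 101325.0 / (K_B * temperature)     # ideal-gas number density
    v_th = math.sqrt(K_B * temperature / mass)   # one possible definition of v_th
    print("tau [s]:", mean_collision_time(density, v_th, radius))
```

Other conventions for the thermal speed change the result only by a numerical factor of order one.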

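For potentials with infinite range, $r_{\max}$ diverges. In this case, the collision term has the form

$$\left(\frac{\partial f}{\partial t}\right)_{\text{coll}} = \int d^3v_2 \int_0^{\infty} ds\, s \int_0^{2\pi} d\varphi\, \left(f_1' f_2' - f_1 f_2\right) |\mathbf{v}_1 - \mathbf{v}_2| . \tag{9.5.26}$$

Although the individual contributions to the collision term diverge, the overall term remains finite:

$$\lim_{r_{\max}\to\infty} \int_0^{r_{\max}} ds\, s\, \left(f_1' f_2' - f_1 f_2\right) = \text{finite} ,$$

since for $s \to \infty$, the deflection angle tends to 0, and $\mathbf{v}_1' - \mathbf{v}_1 \to 0$ and $\mathbf{v}_2' - \mathbf{v}_2 \to 0$, so that $(f_1' f_2' - f_1 f_2) \to 0$.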

Literature

P. Résibois and M. De Leener, Classical Kinetic Theory of Fluids (John Wiley, New York, 1977).
K. Huang, Statistical Mechanics, 2nd Ed. (John Wiley, New York, 1987).
L. Boltzmann, Vorlesungen über Gastheorie, Vol. 1: Theorie der Gase mit einatomigen Molekülen, deren Dimensionen gegen die mittlere Weglänge verschwinden (Barth, Leipzig, 1896); or Lectures on Gas Theory, transl. by S. Brush (University of California Press, Berkeley, 1964).
R. L. Liboff, Introduction to the Theory of Kinetic Equations (Robert E. Krieger Publishing Co., Huntington, New York, 1975).
S. Harris, An Introduction to the Theory of the Boltzmann Equation (Holt, Rinehart and Winston, New York, 1971).
J. A. McLennan, Introduction to Non-Equilibrium Statistical Mechanics (Prentice-Hall, Inc., London, 1988).
K. H. Michel and F. Schwabl, Hydrodynamic Modes in a Gas of Magnons, Phys. Kondens. Materie 11, 144 (1970).


Problems for Chapter 9

9.1 Symmetry Relations. Demonstrate the validity of the identity (9.3.5) used to prove the H theorem:

$$\int d^3v_1 \int d^3v_2 \int d^3v_3 \int d^3v_4\, W(\mathbf{v}_1, \mathbf{v}_2; \mathbf{v}_3, \mathbf{v}_4)(f_1 f_2 - f_3 f_4)\,\varphi_1 = \frac{1}{4} \int d^3v_1 \int d^3v_2 \int d^3v_3 \int d^3v_4\, W(\mathbf{v}_1, \mathbf{v}_2; \mathbf{v}_3, \mathbf{v}_4)(f_1 f_2 - f_3 f_4)(\varphi_1 + \varphi_2 - \varphi_3 - \varphi_4) . \tag{9.5.27}$$

9.2 The Flow Term in the Boltzmann Equation. Carry out the intermediate steps which lead from the equation of continuity (9.2.11) for the single-particle distribution function in $\mu$-space to Eq. (9.2.11′).

9.3 The Relation between H and S. Calculate the quantity

$$H(\mathbf{x}, t) = \int d^3v\, f(\mathbf{x}, \mathbf{v}, t) \log f(\mathbf{x}, \mathbf{v}, t)$$

for the case that $f(\mathbf{x}, \mathbf{v}, t)$ is the Maxwell distribution.

9.4 Show that in the absence of an external force, the equation of continuity (9.3.28) can be brought into the form (9.3.32)

$$n(\partial_t + u_j\partial_j)e + \partial_j q_j = -P_{ij}\,\partial_i u_j .$$

9.5 The Local Maxwell Distribution. Confirm the statements made following Eq. (9.3.19′) by inserting the local Maxwell distribution (9.3.19′) into (9.3.15a)–(9.3.15c).

9.6 The Distribution of Collision Times. Consider a spherical particle of radius $r$, which is passing with velocity $v$ through a cloud of similar particles with a particle density $n$. The particles deflect each other only when they come into direct contact. Determine the probability distribution for the event in which the particle experiences its first collision after a time $t$. How long is the mean time between two collisions?

9.7 Equilibrium Expectation Values. Confirm the results (G.1c) and (G.1g) for

$$\int d^3v \left(\frac{mv^2}{2}\right)^{s} f^0(\mathbf{v}) \qquad \text{and} \qquad \int d^3v\, v_k v_i v_j v_l\, f^0(\mathbf{v}) .$$

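9.8 Calculate the scalar products used in Sect. 9.4.2: $\langle 1|1\rangle$, $\langle\epsilon|1\rangle$, $\langle\epsilon|\epsilon\rangle$, $\langle v_i|v_j\rangle$, $\langle\hat\chi^5|\hat\chi^4\rangle$, $\langle\hat\chi^4|v_i\hat\chi^j\rangle$, $\langle\hat\chi^5|v_i\hat\chi^j\rangle$, $\langle\hat\chi^4|v_i^2\hat\chi^4\rangle$, and $\langle v_j|v_i\hat\chi^4\rangle$.

9.9 Sound Damping. In (9.4.30′), (9.4.37′) and (9.4.42′), the linearized hydrodynamic equations for an ideal gas were derived. For real gases and liquids with general equations of state $P = P(n, T)$, analogous equations hold:

$$\frac{\partial}{\partial t} n(\mathbf{x}, t) + n\nabla\cdot\mathbf{u}(\mathbf{x}, t) = 0$$

$$mn\frac{\partial}{\partial t} u_j(\mathbf{x}, t) + \partial_i P_{ji}(\mathbf{x}, t) = 0$$

$$\frac{\partial}{\partial t} T(\mathbf{x}, t) + n\left(\frac{\partial T}{\partial n}\right)_S \nabla\cdot\mathbf{u}(\mathbf{x}, t) - D\nabla^2 T(\mathbf{x}, t) = 0 .$$

The pressure tensor $P_{ij}$, with components

$$P_{ij} = \delta_{ij}P - \eta\left(\nabla_j u_i + \nabla_i u_j\right) + \left(\frac{2}{3}\eta - \zeta\right)\delta_{ij}\nabla\cdot\mathbf{u} ,$$

now however contains an additional term on the diagonal, $-\zeta\nabla\cdot\mathbf{u}$. This term results from the fact that real gases have a nonvanishing bulk viscosity (or compressional viscosity) $\zeta$ in addition to their shear viscosity $\eta$. Determine and discuss the modes.

Hint: Keep in mind that the equations partially decouple if one separates the velocity field into transverse and longitudinal components: $\mathbf{u} = \mathbf{u}_t + \mathbf{u}_l$ with $\nabla\cdot\mathbf{u}_t = 0$ and $\nabla\times\mathbf{u}_l = 0$. (This can be carried out simply in Fourier space without loss of generality by taking the wavevector to lie along the $z$-direction.) In order to evaluate the dispersion equations (eigenfrequencies $\omega(k)$) for the Fourier transforms of $n$, $u_l$, and $T$, one can consider approximate solutions for $\omega(k)$ of successively increasing order in the magnitude of the wavevector $k$. A useful abbreviation is

$$mc_s^2 = \left(\frac{\partial P}{\partial n}\right)_S = \left(\frac{\partial P}{\partial n}\right)_T \left[1 - \left(\frac{\partial T}{\partial n}\right)_S \bigg/ \left(\frac{\partial T}{\partial n}\right)_P\right] = \left(\frac{\partial P}{\partial n}\right)_T \frac{c_P}{c_V} .$$

Here, $c_s$ is the adiabatic velocity of sound.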

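9.10 Show that

$$\left\langle \frac{v_j}{\sqrt{n/m}} \,\Big|\, v_i\, \frac{1}{\sqrt{n/kT}} \right\rangle = \delta_{ji}\sqrt{kT/m} , \qquad \left\langle \frac{1}{\sqrt{n/kT}} \,\Big|\, v_l\, \frac{v_r}{\sqrt{n/m}} \right\rangle = \delta_{lr}\sqrt{kT/m} ,$$

$$\left\langle \frac{v_j}{\sqrt{n/m}} \,\Big|\, v_i\, \hat\chi^4 \right\rangle = \delta_{ij}\sqrt{\frac{2kT}{3m}} , \qquad \left\langle \hat\chi^4 \,\Big|\, v_l\, \frac{v_r}{\sqrt{n/m}} \right\rangle = \delta_{lr}\sqrt{\frac{2kT}{3m}} ,$$

and verify (9.4.40′).

9.11 Calculate the density-density correlation function

$$S_{nn}(\mathbf{k}, \omega) = \int d^3x \int dt\, e^{-i(\mathbf{k}\mathbf{x} - \omega t)} \langle n(\mathbf{x}, t)\, n(0, 0)\rangle$$

and confirm the result in (9.4.51) by transforming to Fourier space and expressing the fluctuations at a given time in terms of thermodynamic derivatives (see also QM II, Sect. 4.7).

9.12 The Viscosity of a Dilute Gas. In Sect. 9.4, the solution of the linearized Boltzmann equation was treated by using an expansion in terms of the eigenfunctions of the collision operator. Complete the calculation of the dissipative part of the momentum current, Eq. (9.4.40). Show that

$$\sum_{\lambda=1}^{5} \left\langle \frac{v_j}{\sqrt{n/m}} \,\Big|\, v_i\, \hat\chi^\lambda \right\rangle \left\langle \hat\chi^\lambda \,\Big|\, v_l\, \frac{v_r}{\sqrt{n/m}} \right\rangle = \delta_{ij}\delta_{lr}\,\frac{5kT}{3m} .$$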

9.13 Heat Conductivity Using the Relaxation-Time Approach. A further possibility for the approximate determination of the dissipative contributions to the equations of motion for the conserved quantities particle number, momentum and energy is found in the relaxation-time approach introduced in Sect. 9.5.1:

$$\left(\frac{\partial f}{\partial t}\right)_{\text{collision}} = -\frac{f - f^{\ell}}{\tau} .$$

For $g = f - f^{\ell}$, one obtains in lowest order from the Boltzmann equation (9.5.1)

$$g(\mathbf{x}, \mathbf{v}, t) = -\tau\left(\partial_t + \mathbf{v}\cdot\nabla + \frac{1}{m}\mathbf{F}\cdot\nabla_{\mathbf{v}}\right) f^{\ell}(\mathbf{x}, \mathbf{v}, t) .$$

Eliminate the time derivative of $f^{\ell}$ by employing the non-dissipative equations of motion obtained from $f^{\ell}$ and determine the heat conductivity by inserting $f = f^{\ell} + g$ into the expression for the heat current $\mathbf{q}$ derived in (9.3.29).

9.14 The Relaxation-Time Approach for the Electrical Conductivity. Consider an infinite system of charged particles immersed in a positive background. The collision term describes collisions of the particles among themselves as well as with the (fixed) ions of the background. Therefore, the collision term no longer vanishes for general local Maxwellian distributions $f^{\ell}(\mathbf{x}, \mathbf{v}, t)$. Before the application of a weak homogeneous electric field $\mathbf{E}$, take $f = f^0$, where $f^0$ is the position- and time-independent Maxwell distribution. Apply the relaxation-time approach $\partial f/\partial t|_{\text{coll}} = -(f - f^0)/\tau$ and determine the new equilibrium distribution $f$ to first order in $\mathbf{E}$ after application of the field. What do you find for $\langle\mathbf{v}\rangle$? Generalize to a time-dependent field $\mathbf{E}(t) = \mathbf{E}_0\cos(\omega t)$. Discuss the effects of the relaxation-time approximation on the conservation laws (see e.g. John M. Ziman, Principles of the Theory of Solids, 2nd Ed. (Cambridge University Press, Cambridge 1972)).

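9.15 An example which is theoretically easy to treat but is unrealistic for atoms is the purely repulsive potential$^{19}$

$$w(r) = \frac{1}{\nu - 1}\,\frac{\kappa}{r^{\nu - 1}} , \qquad \nu \geq 2,\ \kappa > 0 . \tag{9.5.28}$$

Show that the corresponding scattering cross-section has the form

$$\sigma(\vartheta, |\mathbf{v}_1 - \mathbf{v}_2|) = \left(\frac{2\kappa}{m}\right)^{\frac{2}{\nu - 1}} |\mathbf{v}_1 - \mathbf{v}_2|^{-\frac{4}{\nu - 1}}\, F_\nu(\vartheta) , \tag{9.5.29}$$

with functions $F_\nu(\vartheta)$ which depend on $\vartheta$ and the power $\nu$. For the special case of the so-called Maxwell potential ($\nu = 5$), $|\mathbf{v}_1 - \mathbf{v}_2|\,\sigma(\vartheta, |\mathbf{v}_1 - \mathbf{v}_2|)$ is independent of $|\mathbf{v}_1 - \mathbf{v}_2|$.

9.16 Find the special local Maxwell distributions

$$f^0(\mathbf{v}, \mathbf{x}, t) = \exp\left(A + \mathbf{B}\cdot\mathbf{v} + C\,\frac{\mathbf{v}^2}{2m}\right)$$

which are solutions of the Boltzmann equation, by comparing the coefficients of the powers of $\mathbf{v}$. The result is $A = A_1 + \mathbf{A}_2\cdot\mathbf{x} + C_3\mathbf{x}^2$, $\mathbf{B} = \mathbf{B}_1 - \mathbf{A}_2 t - (2C_3 t + C_2)\mathbf{x} + \boldsymbol{\Omega}\times\mathbf{x}$, $C = C_1 + C_2 t + C_3 t^2$.

9.17 Let an external force $\mathbf{F}(\mathbf{x}) = -\nabla V(\mathbf{x})$ act in the Boltzmann equation. Show that the collision term and the flow term vanish for the case of the Maxwell distribution function

$$f(\mathbf{v}, \mathbf{x}) \propto n\left(\frac{m}{2\pi kT}\right)^{3/2} \exp\left[-\frac{1}{kT}\left(\frac{m(\mathbf{v} - \mathbf{u})^2}{2} + V(\mathbf{x})\right)\right] .$$

9.18 Verify Eq. (9.4.33b).

19 Landau/Lifshitz, Mechanics, p. 51, op. cit. in footnote 18.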

10. Irreversibility and the Approach to Equilibrium

10.1 Preliminary Remarks

In this chapter, we will consider some basic aspects related to irreversible processes and their mathematical description, and to the derivation of macroscopic equations of motion from microscopic dynamics: classically from the Newtonian equations, and quantum-mechanically from the Schrödinger equation. These microscopic equations of motion are time-reversal invariant, and the question arises as to how it is possible that such equations can lead to expressions which do not exhibit time-reversal symmetry, such as the Boltzmann equation or the heat diffusion equation. This apparent incompatibility, which historically was raised in particular by Loschmidt as an objection to the Boltzmann equation, is called the Loschmidt paradox. Since during his lifetime the reality of atoms was not experimentally verifiable, the apparent contradiction between the time-reversal invariant (time-reversal symmetric) mechanics of atoms and the irreversibility of non-equilibrium thermodynamics was used by the opponents of Boltzmann's ideas as an argument against the very existence of atoms$^1$.

A second objection to the Boltzmann equation and to a purely mechanical foundation for thermodynamics came from the fact, proved with mathematical rigor by Poincaré, that every finite system, no matter how large, must regain its initial state periodically after a so-called recurrence time. This objection was named the Zermelo paradox, after its most vehement protagonist. Boltzmann was able to refute both of these objections. In his considerations, which were carried further by his student P. Ehrenfest$^2$, probability arguments play an important role, as they do in all areas of statistical mechanics; this way of thinking was, however, foreign to the mechanistic worldview of physics at that time.

We mention at this point that the entropy which is defined in Eq. (2.3.1) in terms of the density matrix does not change within a closed system.
1 See also the preface by H. Thirring in E. Broda, Ludwig Boltzmann, Deuticke, Wien, 1986.
2 See P. Ehrenfest and T. Ehrenfest, Begriffliche Grundlagen der statistischen Auffassung in der Mechanik, Encykl. Math. Wiss. 4 (32) (1911); English translation by M. J. Moravcsik: The Conceptual Foundations of the Statistical Approach in Mechanics, Cornell University Press, Ithaca, NY 1959.

In this chapter, we will denote the entropy defined in this way as the Gibbs entropy. Boltzmann's

concept of entropy, which dates from an earlier time, associates a particular value of the entropy not only to an ensemble but also to each microstate, as we shall show in more detail in Sect. 10.6.2. In equilibrium, Gibbs’ entropy is equal to Boltzmann’s entropy. To eliminate the recurrence-time objection, we will estimate the recurrence time on the basis of a simple model. Using a second simple model of the Brownian motion, we will investigate how its time behavior depends on the particle number and the diﬀerent time scales of the constituents. This will lead us to a general derivation of macroscopic hydrodynamic equations with dissipation from time-reversal invariant microscopic equations of motion. Finally, we will consider the tendency of a dilute gas to approach equilibrium, and its behavior under time reversal. In this connection, the inﬂuence of external perturbations will also be taken into account. In addition, this chapter contains an estimate of the size of statistical ﬂuctuations and a derivation of Pauli’s master equations. In this chapter, we treat a few signiﬁcant aspects of this extensive area of study. On the one hand, we will examine some simple models, and on the other, we will present qualitative considerations which will shed light on the subject from various sides. In order to illuminate the problem arising from the Loschmidt paradox, we show the time development of a gas in Fig. 10.1. The reader may conjecture that the time sequence is a,b,c, in which the gas expands to ﬁll the total available volume. If on the other hand a motion reversal is carried out at conﬁguration c, then the atoms will move back via stage b into conﬁguration a, which has a lower entropy. Two questions arise from this situation: (i) Why is the latter sequence (c,b,a) in fact never observed? (ii) How are we to understand the derivation of the H theorem, according to which the entropy always increases?

Fig. 10.1. Expansion or contraction of a gas, panels (a), (b), (c): total volume $V$, subvolume $V_1$ (cube in lower-left corner)


10.2 Recurrence Time

Zermelo (1896)$^3$ based his criticism of the Boltzmann equation on Poincaré's recurrence-time theorem$^4$. It states that a closed, finite, conservative system will return arbitrarily closely to its initial configuration within a finite time, the Poincaré recurrence time $\tau_P$. According to Zermelo's paradox, $H(t)$ could not decrease monotonically, but instead must finally again increase and regain the value $H(0)$. To adjudge this objection, we will estimate the recurrence time with the aid of a model$^5$.

3 E. Zermelo, Wied. Ann. 57, 485 (1896); ibid. 59, 793 (1896).
4 H. Poincaré, Acta Math. 13, 1 (1890).
5 P. C. Hemmer, L. C. Maximon, and H. Wergeland, Phys. Rev. 111, 689 (1958).

We consider a system of classical harmonic oscillators (linear chain) with displacements $q_n$, momenta $p_n$ and the Hamiltonian (see QM II, Sect. 12.1):

$$H = \sum_{n=1}^{N}\left[\frac{1}{2m}p_n^2 + \frac{m\Omega^2}{2}(q_n - q_{n-1})^2\right] . \tag{10.2.1}$$

From this, the equations of motion are obtained:

$$\dot p_n = m\ddot q_n = m\Omega^2(q_{n+1} + q_{n-1} - 2q_n) . \tag{10.2.2}$$

Assuming periodic boundary conditions, $q_0 = q_N$, we are dealing with a translationally invariant problem, which is diagonalized by the Fourier transformation

$$q_n = \frac{1}{(mN)^{1/2}}\sum_s e^{isn} Q_s , \qquad p_n = \left(\frac{m}{N}\right)^{1/2}\sum_s e^{-isn} P_s . \tag{10.2.3}$$

$Q_s$ ($P_s$) are called the normal coordinates (and momenta). The periodic boundary conditions require that $1 = e^{isN}$, i.e. $s = \frac{2\pi l}{N}$ with integral $l$. The values of $s$ for which $l$ differs by $N$ are equivalent. A possible choice of values of $l$, e.g. for odd $N$, would be: $l = 0, \pm 1, \ldots, \pm(N-1)/2$. Since $q_n$ and $p_n$ are real, it follows that $Q_s^* = Q_{-s}$ and $P_s^* = P_{-s}$. The Fourier coefficients obey the orthogonality relations

$$\frac{1}{N}\sum_{n=1}^{N} e^{isn} e^{-is'n} = \Delta(s - s') = \begin{cases} 1 & \text{for } s - s' = 2\pi h \text{ with } h \text{ integral} \\ 0 & \text{otherwise} \end{cases} \tag{10.2.4}$$


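and the completeness relation

$$\frac{1}{N}\sum_s e^{-isn} e^{isn'} = \delta_{nn'} . \tag{10.2.5}$$

Insertion of the transformation to normal coordinates yields

$$H = \frac{1}{2}\sum_s \left(P_s P_s^* + \omega_s^2 Q_s Q_s^*\right) \tag{10.2.6}$$

with the dispersion relation

$$\omega_s = 2\Omega\left|\sin\frac{s}{2}\right| . \tag{10.2.7}$$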
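The dispersion relation can be checked directly against the equations of motion (10.2.2): a plane wave $q_n \propto e^{isn}$ with an allowed value of $s$ must satisfy $-\omega_s^2 q_n = \Omega^2(q_{n+1} + q_{n-1} - 2q_n)$ on the periodic chain. The sketch below performs this check numerically; the function names and the example chain length are illustrative choices made here.

```python
import cmath
import math


def dispersion(s, omega0):
    """omega_s = 2 * Omega * |sin(s / 2)|, Eq. (10.2.7)."""
    return 2.0 * omega0 * abs(math.sin(s / 2.0))


def allowed_wavenumbers(n_sites):
    """s = 2 pi l / N for l = 0, ..., N - 1 (periodic boundary conditions)."""
    return [2.0 * math.pi * l / n_sites for l in range(n_sites)]


def plane_wave_residual(s, omega0, n_sites):
    """Largest violation of -omega_s^2 q_n = Omega^2 (q_{n+1} + q_{n-1} - 2 q_n)."""
    w2 = dispersion(s, omega0) ** 2
    q = [cmath.exp(1j * s * n) for n in range(n_sites)]
    worst = 0.0
    for n in range(n_sites):
        left = -w2 * q[n]
        right = omega0 ** 2 * (q[(n + 1) % n_sites] + q[(n - 1) % n_sites] - 2.0 * q[n])
        worst = max(worst, abs(left - right))
    return worst


if __name__ == "__main__":
    N, OMEGA = 11, 1.5
    for s in allowed_wavenumbers(N):
        print(round(s, 4), round(dispersion(s, OMEGA), 4), plane_wave_residual(s, OMEGA, N))
```

The residual vanishes (up to rounding) only for the allowed values of $s$; for other values the wrap-around at the chain ends breaks the plane wave.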

We thus find $N$ non-coupled oscillators with eigenfrequencies$^6$ $\omega_s$. The motion of the normal coordinates can be represented most intuitively by introducing complex vectors

$$Z_s = P_s + i\omega_s Q_s , \tag{10.2.8}$$

which move on a unit circle according to

$$Z_s = a_s e^{i\omega_s t} \tag{10.2.9}$$

with a complex amplitude $a_s$ (Fig. 10.2).

Fig. 10.2. The motion of the normal coordinates

6 The normal coordinate with $s = 0$, $\omega_s = 0$ corresponds to a translation and need not be considered in the following.

We assume that the frequencies $\omega_s$ of $N - 1$ such normal coordinates are incommensurate, i.e. their ratios are not rational numbers. Then the phase vectors $Z_s$ rotate independently of one another, without coincidences. We now wish to calculate how much time passes until all $N$ vectors again come into their initial positions, or more precisely, until all the vectors lie within an interval $\Delta\varphi$ around their initial positions. The probability that the vector $Z_s$ lies within $\Delta\varphi$ during one rotation is given by $\Delta\varphi/2\pi$, and the probability that all the vectors lie within their respective prescribed intervals is $(\Delta\varphi/2\pi)^{N-1}$. The number of rotations required for this recurrence is therefore $(2\pi/\Delta\varphi)^{N-1}$. The recurrence time is found by multiplying by the typical rotational period$^7$ $\frac{1}{\omega}$:

$$\tau_P \approx \left(\frac{2\pi}{\Delta\varphi}\right)^{N-1} \cdot \frac{1}{\omega} . \tag{10.2.10}$$

Taking $\Delta\varphi = \frac{2\pi}{100}$, $N = 10$ and $\omega = 10$ Hz, we obtain $\tau_P \approx 10^{12}$ years, i.e. more than the age of the Universe. These times of course become much longer if we consider a macroscopic system with $N \approx 10^{20}$. The recurrence thus exists theoretically, but in practice it plays no role. We have thereby eliminated Zermelo's paradox.
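The steep growth of the recurrence time with the number of oscillators is easy to explore numerically. The sketch below evaluates the estimate (10.2.10) exactly as written; the function name and the seconds-per-year constant are assumptions made here. Conventions for the prefactor ($1/\omega$ versus $2\pi/\omega$ as the period, $N$ versus $N-1$ independent phases) shift the result by a few orders of magnitude, so only the exponential dependence on $N$ should be read off.

```python
import math

SECONDS_PER_YEAR = 3.156e7  # approximate length of a year in seconds


def recurrence_time(delta_phi, n_oscillators, omega):
    """tau_P ~ (2 pi / delta_phi)^(N - 1) / omega, Eq. (10.2.10), in seconds."""
    return (2.0 * math.pi / delta_phi) ** (n_oscillators - 1) / omega


if __name__ == "__main__":
    delta_phi = 2.0 * math.pi / 100.0
    for n in (5, 10, 20, 40):
        tau = recurrence_time(delta_phi, n, 10.0)
        print(n, "oscillators:", tau / SECONDS_PER_YEAR, "years")
```

Each additional oscillator multiplies the estimate by $2\pi/\Delta\varphi$, which is why macroscopic particle numbers put the recurrence far beyond any observable time.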

Remark: We consider further the time dependence of the solution for the coupled oscillators. From (10.2.3) and (10.2.9) we obtain

$$q_n(t) = \sum_s \frac{e^{isn}}{\sqrt{Nm}}\left(Q_s(0)\cos\omega_s t + \frac{\dot Q_s(0)}{\omega_s}\sin\omega_s t\right) , \tag{10.2.11}$$

from which the following solution of the general initial-value problem is found:

$$q_n(t) = \frac{1}{N}\sum_{s,n'}\left(q_{n'}(0)\cos\big(s(n - n') - \omega_s t\big) + \frac{\dot q_{n'}(0)}{\omega_s}\sin\big(s(n - n') - \omega_s t\big)\right) . \tag{10.2.12}$$

As an example, we consider the particular initial condition $q_{n'}(0) = \delta_{n',0}$, $\dot q_{n'}(0) = 0$, for which only the oscillator at the site 0 is displaced initially, leading to

$$q_n(t) = \frac{1}{N}\sum_s \cos\left(sn - 2\Omega t\left|\sin\frac{s}{2}\right|\right) . \tag{10.2.13}$$

As long as $N$ is finite, the solution is quasiperiodic. On the other hand, in the limit $N \to \infty$

$$q_n(t) = \frac{1}{2\pi}\int_{-\pi}^{\pi} ds\, \cos\left(sn - 2\Omega t\left|\sin\frac{s}{2}\right|\right) = \frac{1}{\pi}\int_0^{\pi} ds\, \cos\left(s\,2n - 2\Omega t\sin s\right) = J_{2n}(2\Omega t) \sim \frac{1}{\sqrt{\pi\Omega t}}\cos\left(2\Omega t - \pi n - \frac{\pi}{4}\right) \quad \text{for long } t . \tag{10.2.14}$$

$J_n$ are Bessel functions$^8$. The excitation does not decay exponentially, but instead algebraically as $t^{-1/2}$. We add a few more remarks concerning the properties of the solution (10.2.13) for finite $N$. If the zeroth atom in the chain is released at the time $t = 0$, it swings back and its neighbors begin to move upwards. The excitation propagates along the chain at the velocity of sound, $a\Omega$; the $n$-th atom, at a distance $d = na$ from the origin, reacts after a time of about $t \sim \frac{n}{\Omega}$. Here, $a$ is the lattice constant.

7 A more precise formula by P. C. Hemmer, L. C. Maximon, and H. Wergeland, op. cit. 5, yields $\tau_P = \dfrac{1}{N}\,\dfrac{\prod_{s=1}^{N-1}\frac{2\pi}{\Delta\varphi_s}}{\sum_{s=1}^{N-1}\frac{\omega_s}{\Delta\varphi_s}} \propto \Delta\varphi^{2-N}$.
8 I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series and Products, Academic Press, New York, 1980, 8.4.11 and 8.4.51.


The displacement amplitude remains largest for the zeroth atom. In a finite chain, there would be echo effects. For periodic boundary conditions, the radiated oscillations come back again to the zeroth atom. The limit $N \to \infty$ prevents Poincaré recurrence. The displacement energy of the zeroth atom initially present is divided up among the infinitely many degrees of freedom. The decrease of the oscillation amplitude of the initially excited atom is due to energy transfer to its neighbors.
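The statements of this remark can be illustrated numerically by comparing the finite-chain sum (10.2.13) with the Bessel function $J_{2n}(2\Omega t)$ of (10.2.14). The following sketch uses only the standard library and evaluates the Bessel function from its power series; the function names, the truncation of the series, and the chosen parameters are illustrative assumptions. For times short compared with the transit time around the ring, the two expressions should agree closely, while for longer times the echo effects of the finite chain appear.

```python
import math


def chain_displacement(n, t, omega0, n_sites):
    """q_n(t) from Eq. (10.2.13) for a periodic chain of n_sites atoms."""
    total = 0.0
    for l in range(n_sites):
        s = 2.0 * math.pi * l / n_sites
        total += math.cos(s * n - 2.0 * omega0 * t * abs(math.sin(s / 2.0)))
    return total / n_sites


def bessel_j(order, x, terms=60):
    """Power series J_m(x) = sum_k (-1)^k (x/2)^(2k+m) / (k! (k+m)!)."""
    total = 0.0
    for k in range(terms):
        numerator = (-1.0) ** k * (x / 2.0) ** (2 * k + order)
        total += numerator / (math.factorial(k) * math.factorial(k + order))
    return total


if __name__ == "__main__":
    OMEGA, N_SITES, T = 1.0, 200, 5.0
    for n in (0, 1, 3):
        print(n, chain_displacement(n, T, OMEGA, N_SITES), bessel_j(2 * n, 2.0 * OMEGA * T))
```

The truncated power series is adequate only for moderate arguments; for large $2\Omega t$ a library routine or the asymptotic form in (10.2.14) would be the better choice.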

10.3 The Origin of Irreversible Macroscopic Equations of Motion

In this section, we investigate a microscopic model of Brownian motion. We will find the appearance of irreversibility in the limit of infinitely many degrees of freedom. The derivation of hydrodynamic equations of motion in analogy to the Brownian motion will be sketched at the end of this section and is given in more detail in Appendix H.

10.3.1 A Microscopic Model for Brownian Motion

As a microscopic model for Brownian motion, we consider a harmonic oscillator which is coupled to a harmonic lattice$^9$. Since the overall system is harmonic, the Hamiltonian function or the Hamiltonian operator as well as the equations of motion and their solutions have the same form classically and quantum mechanically. We start with the quantum-mechanical formulation. In contrast to the Langevin equation of Sect. 8.1, where a stochastic force was assumed to act on the Brownian particle, we now take explicit account of the many colliding particles of the lattice in the Hamiltonian operator and in the equations of motion. The Hamiltonian of this system is given by

$$H = H_O + H_F + H_I ,$$

$$H_O = \frac{1}{2M}P^2 + \frac{M\Omega^2}{2}Q^2 , \qquad H_F = \frac{1}{2m}\sum_{\mathbf{n}} p_{\mathbf{n}}^2 + \frac{1}{2}\sum_{\mathbf{n}\mathbf{n}'}\Phi_{\mathbf{n}\mathbf{n}'}\, q_{\mathbf{n}} q_{\mathbf{n}'} , \qquad H_I = \sum_{\mathbf{n}} c_{\mathbf{n}} q_{\mathbf{n}} Q , \tag{10.3.1}$$

where $H_O$ is the Hamiltonian of the oscillator of mass $M$ and frequency $\Omega$. Furthermore, $H_F$ is the Hamiltonian of the lattice$^{10}$ with masses $m$, momenta $p_{\mathbf n}$, and displacements $q_{\mathbf n}$ from the equilibrium positions, where we take $m \ll M$. The harmonic interaction coefficients of the lattice atoms are $\Phi_{\mathbf n \mathbf n'}$. The interaction of the oscillator with the lattice atoms is given by $H_I$; the coefficients $c_{\mathbf n}$ characterize the strength and the range of the interactions of the oscillator, which is located at the origin of the coordinate system. The vector $\mathbf n$ enumerates the atoms of the lattice. The equations of motion which follow from (10.3.1) are given by
\[ M \ddot Q = -M\Omega^2 Q - \sum_{\mathbf n} c_{\mathbf n} q_{\mathbf n} \quad \text{and} \quad m \ddot q_{\mathbf n} = -\sum_{\mathbf n'} \Phi_{\mathbf n \mathbf n'}\, q_{\mathbf n'} - c_{\mathbf n} Q . \tag{10.3.2} \]

$^{9}$ The coupling to a bath of oscillators as a mechanism for damping has been investigated frequently, e.g. by F. Schwabl and W. Thirring, Ergeb. exakt. Naturwiss. 36, 219 (1964); A. Lopez, Z. Phys. 192, 63 (1965); P. Ullersma, Physica 32, 27 (1966).
$^{10}$ We use the index $F$, since in the limit $N \to \infty$ the lattice becomes a field.

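We take periodic boundary conditions, $q_{\mathbf n} = q_{\mathbf n + \mathbf N_i}$, with $\mathbf N_1 = (N_1, 0, 0)$, $\mathbf N_2 = (0, N_2, 0)$, and $\mathbf N_3 = (0, 0, N_3)$, where $N_i$ is the number of atoms in the direction $\hat{\mathbf e}_i$. Due to the translational invariance of $H_F$, we introduce the following transformations to normal coordinates and momenta:
\[ q_{\mathbf n} = \frac{1}{\sqrt{mN}} \sum_{\mathbf k} e^{i \mathbf k \cdot \mathbf a_{\mathbf n}} Q_{\mathbf k} , \quad p_{\mathbf n} = \sqrt{\frac{m}{N}} \sum_{\mathbf k} e^{-i \mathbf k \cdot \mathbf a_{\mathbf n}} P_{\mathbf k} . \tag{10.3.3} \]
The inverse transformation is given by
\[ Q_{\mathbf k} = \sqrt{\frac{m}{N}} \sum_{\mathbf n} e^{-i \mathbf k \cdot \mathbf a_{\mathbf n}} q_{\mathbf n} , \quad P_{\mathbf k} = \frac{1}{\sqrt{mN}} \sum_{\mathbf n} e^{i \mathbf k \cdot \mathbf a_{\mathbf n}} p_{\mathbf n} . \tag{10.3.4} \]
The Fourier coefficients obey orthogonality and completeness relations
\[ \frac{1}{N} \sum_{\mathbf n} e^{i(\mathbf k - \mathbf k') \cdot \mathbf a_{\mathbf n}} = \Delta(\mathbf k - \mathbf k') , \quad \frac{1}{N} \sum_{\mathbf k} e^{i \mathbf k \cdot (\mathbf a_{\mathbf n} - \mathbf a_{\mathbf n'})} = \delta_{\mathbf n, \mathbf n'} \tag{10.3.5a,b} \]
with the generalized Kronecker delta
\[ \Delta(\mathbf k) = \begin{cases} 1 & \text{for } \mathbf k = \mathbf g \\ 0 & \text{otherwise.} \end{cases} \]
From the periodic boundary conditions we find the following values for the wavevector:
\[ \mathbf k = \mathbf g_1 \frac{r_1}{N_1} + \mathbf g_2 \frac{r_2}{N_2} + \mathbf g_3 \frac{r_3}{N_3} \quad \text{with } r_i = 0, \pm 1, \pm 2, \ldots . \]
Here, we have introduced the reciprocal lattice vectors which are familiar from solid-state physics:
\[ \mathbf g_1 = \left( \frac{2\pi}{a}, 0, 0 \right) , \quad \mathbf g_2 = \left( 0, \frac{2\pi}{a}, 0 \right) , \quad \mathbf g_3 = \left( 0, 0, \frac{2\pi}{a} \right) . \]
The transformation to normal coordinates (10.3.3) converts the Hamiltonian for the lattice into the Hamiltonian for $N$ decoupled oscillators, viz.
\[ H_F = \frac{1}{2} \sum_{\mathbf k} \left( P_{\mathbf k}^\dagger P_{\mathbf k} + \omega_{\mathbf k}^2\, Q_{\mathbf k}^\dagger Q_{\mathbf k} \right) , \tag{10.3.6} \]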

with the frequencies$^{11}$ (see Fig. 10.3)
\[ \omega_{\mathbf k}^2 = \frac{1}{m} \sum_{\mathbf n} \Phi(\mathbf n)\, e^{-i \mathbf k \cdot \mathbf a_{\mathbf n}} . \tag{10.3.7} \]

Fig. 10.3. The frequencies $\omega_{\mathbf k}$ along one of the coordinate axes, $\omega_{\max} = \omega_{\pi/a}$
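The content of (10.3.7) can be checked numerically in one dimension. The following is a minimal sketch, not part of the original derivation: it assumes a ring of $N$ atoms with nearest-neighbour springs of strength `f`, so that $\Phi(0) = 2f$ and $\Phi(\pm 1) = -f$ (an illustrative choice), and verifies that the plane waves $q_n = e^{ikan}$ with the allowed wavevectors satisfy $\sum_{n'} \Phi(n - n')\, q_{n'} = m\, \omega_k^2\, q_n$. The function names are hypothetical.

```python
import cmath
import math

def omega_squared(k, phi, m=1.0, a=1.0):
    """Eq. (10.3.7): omega_k^2 = (1/m) * sum_n Phi(n) * exp(-i k a n)."""
    total = sum(value * cmath.exp(-1j * k * a * n) for n, value in phi.items())
    return total.real / m

def max_residual(N=8, f=1.0, m=1.0, a=1.0):
    """Largest deviation of sum_n' Phi(n - n') q_n' from m * omega_k^2 * q_n
    for the plane waves q_n = exp(i k a n) on a ring of N atoms."""
    phi = {0: 2.0 * f, 1: -f, -1: -f}    # assumed nearest-neighbour springs
    worst = 0.0
    for r in range(N):
        k = 2.0 * math.pi * r / (N * a)  # wavevectors allowed by periodic b.c.
        w2 = omega_squared(k, phi, m, a)
        q = [cmath.exp(1j * k * a * n) for n in range(N)]
        for n in range(N):
            # d = n - n' runs over the neighbours with nonzero Phi(d)
            force = sum(value * q[(n - d) % N] for d, value in phi.items())
            worst = max(worst, abs(force - m * w2 * q[n]))
    return worst
```

For this choice of force constants, (10.3.7) reduces to $\omega_k^2 = (2f/m)(1 - \cos ka)$, which vanishes for $k \to 0$ (acoustic behavior) and takes its maximum $4f/m$ at $k = \pi/a$.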

From the invariance of the lattice with respect to infinitesimal translations, we obtain the condition $\sum_{\mathbf n'} \Phi(\mathbf n, \mathbf n') = 0$, and from translational invariance with respect to lattice vectors $\mathbf t$, it follows that $\Phi(\mathbf n + \mathbf t, \mathbf n' + \mathbf t) = \Phi(\mathbf n, \mathbf n') = \Phi(\mathbf n - \mathbf n')$. The latter relation was already used in (10.3.7). From the first of the two relations, we find $\lim_{\mathbf k \to 0} \omega_{\mathbf k}^2 = 0$, i.e. the oscillations of the lattice are acoustic phonons. Expressed in terms of the normal coordinates, the equations of motion (10.3.2) are
\[ M \ddot Q = -M\Omega^2 Q - \frac{1}{\sqrt{mN}} \sum_{\mathbf k} c(\mathbf k)^* Q_{\mathbf k} \tag{10.3.8a} \]
\[ m \ddot Q_{\mathbf k} = -m\, \omega_{\mathbf k}^2\, Q_{\mathbf k} - \sqrt{\frac{m}{N}}\, c(\mathbf k)\, Q \tag{10.3.8b} \]
with
\[ c(\mathbf k) = \sum_{\mathbf n} c_{\mathbf n}\, e^{-i \mathbf k \cdot \mathbf a_{\mathbf n}} . \tag{10.3.9} \]
For the further treatment of the equations of motion (10.3.8a,b) and the solution of the initial-value problem, we introduce the half-range Fourier transform (Laplace transform) of $Q(t)$:
\[ \tilde Q(\omega) \equiv \int_0^\infty dt\, e^{i\omega t} Q(t) = \int_{-\infty}^{\infty} dt\, e^{i\omega t}\, \Theta(t) Q(t) . \tag{10.3.10a} \]

$^{11}$ We assume that the harmonic potential for the heavy oscillator is based on the same microscopic interaction as that for the lattice atoms, $\Phi(\mathbf n, \mathbf n')$. If we denote its strength by $g$, then we find $\Omega = \sqrt{g/M}$ and $\omega_{\max} = \sqrt{g/m}$, and therefore $\Omega \ll \omega_{\max}$. The order of magnitude of the velocity of sound is $c = a\, \omega_{\max}$.

The inverse of this equation is given by
\[ \Theta(t) Q(t) = \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, e^{-i\omega t}\, \tilde Q(\omega) . \tag{10.3.10b} \]
For free oscillatory motions, (10.3.10a) contains $\delta_+$ distributions. For their convenient treatment, it is expedient to consider
\[ \tilde Q(\omega + i\eta) = \int_0^\infty dt\, e^{i(\omega + i\eta)t}\, Q(t) , \tag{10.3.11a} \]
with $\eta > 0$. If (10.3.10a) exists, then so, with certainty, does (10.3.11a), owing to the factor $e^{-\eta t}$. The inverse of (10.3.11a) is given by
\[ e^{-\eta t}\, Q(t)\, \Theta(t) = \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, e^{-i\omega t}\, \tilde Q(\omega + i\eta) , \quad \text{i.e.} \]
\[ Q(t)\, \Theta(t) = \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, e^{-i(\omega + i\eta)t}\, \tilde Q(\omega + i\eta) . \tag{10.3.11b} \]
For the complex frequency appearing in (10.3.11a,b) we introduce $z \equiv \omega + i\eta$. The integral (10.3.11b) implies an integration path in the complex $z$-plane which lies $i\eta$ above the real axis:
\[ Q(t)\, \Theta(t) = \int_{-\infty + i\eta}^{\infty + i\eta} \frac{dz}{2\pi}\, e^{-izt}\, \tilde Q(z) . \tag{10.3.11b$'$} \]

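The half-range Fourier transformation of the equation of motion (10.3.8a) yields for the first term
\[ \int_0^\infty dt\, e^{izt}\, \frac{d^2}{dt^2} Q(t) = e^{izt}\, \dot Q(t) \Big|_0^\infty - iz \int_0^\infty dt\, e^{izt}\, \dot Q(t) = -\dot Q(0) + iz\, Q(0) - z^2\, \tilde Q(z) . \]
All together, for the half-range Fourier transform of the equations of motion (10.3.8a,b) we obtain
\[ M \left( -z^2 + \Omega^2 \right) \tilde Q(z) = -\frac{1}{\sqrt{mN}} \sum_{\mathbf k} c(\mathbf k)^*\, \tilde Q_{\mathbf k}(z) + M \left( \dot Q(0) - iz\, Q(0) \right) \tag{10.3.12} \]
\[ m \left( -z^2 + \omega_{\mathbf k}^2 \right) \tilde Q_{\mathbf k}(z) = -\sqrt{\frac{m}{N}}\, c(\mathbf k)\, \tilde Q(z) + m \left( \dot Q_{\mathbf k}(0) - iz\, Q_{\mathbf k}(0) \right) . \tag{10.3.13} \]
The elimination of $\tilde Q_{\mathbf k}(z)$ and replacement of the initial values $Q_{\mathbf k}(0)$, $\dot Q_{\mathbf k}(0)$ by $q_{\mathbf n}(0)$, $\dot q_{\mathbf n}(0)$ yields
\[ D(z)\, \tilde Q(z) = M \left( \dot Q(0) - iz\, Q(0) \right) - \sum_{\mathbf k} c(\mathbf k)^*\, \frac{m}{N} \sum_{\mathbf n} \frac{e^{-i \mathbf k \cdot \mathbf a_{\mathbf n}}}{m \left( -z^2 + \omega_{\mathbf k}^2 \right)} \left( \dot q_{\mathbf n}(0) - iz\, q_{\mathbf n}(0) \right) \tag{10.3.14} \]
with
\[ D(z) \equiv M \left( -z^2 + \Omega^2 \right) + \frac{1}{N} \sum_{\mathbf k} \frac{|c(\mathbf k)|^2}{m \left( z^2 - \omega_{\mathbf k}^2 \right)} . \tag{10.3.15} \]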

Now we restrict ourselves to the classical case and insert the particular initial values $q_{\mathbf n}(0) = 0$, $\dot q_{\mathbf n}(0) = 0$ for all the lattice atoms$^{12}$; we then find
\[ \tilde Q(z) = \frac{M \left( \dot Q(0) - iz\, Q(0) \right)}{-M z^2 + M \Omega^2 - \sum_{\mathbf k} \frac{|c(\mathbf k)|^2}{mN} \Big/ \left( -z^2 + \omega_{\mathbf k}^2 \right)} . \tag{10.3.16} \]
From this, in the time representation, we obtain
\[ \Theta(t)\, Q(t) = \int \frac{d\omega}{2\pi}\, e^{-izt}\, \tilde Q(z) = -i \sum_\nu g(\omega_\nu)\, e^{-i\omega_\nu t} , \tag{10.3.17} \]
where $\omega_\nu$ are the poles of $\tilde Q(z)$ and $g(\omega_\nu)$ are the residues$^{13}$. The solution is thus quasiperiodic. One could use this to estimate the Poincaré time in analogy to the previous section. In the limit of a large particle number $N$, the sums over $\mathbf k$ can be replaced by integrals and a different analytic behavior may result:$^{14}$
\[ D(z) = -M z^2 + M \Omega^2 + \frac{a^3}{m} \int \frac{d^3k}{(2\pi)^3}\, \frac{|c(\mathbf k)|^2}{z^2 - \omega_{\mathbf k}^2} . \tag{10.3.18} \]
The integral over $\mathbf k$ spans the first Brillouin zone: $-\frac{\pi}{a} \le k_i \le \frac{\pi}{a}$. For a simple evaluation of the integral over $\mathbf k$, we replace the region of integration by a sphere of the same volume having a radius $\Lambda = \left( \frac{3}{4\pi} \right)^{1/3} \frac{2\pi}{a}$ and substitute the dispersion relation by $\omega_{\mathbf k} = c|\mathbf k|$, where $c$ is the velocity of sound.

$^{12}$ In the quantum-mechanical treatment, we would have to use the expectation value of (10.3.14) instead and insert $\langle q_{\mathbf n}(0) \rangle = \langle \dot q_{\mathbf n}(0) \rangle = 0$. In problem 10.6, the force on the oscillator due to the lattice particles is investigated when the latter are in thermal equilibrium.
$^{13}$ The poles of $\tilde Q(z)$, $z \equiv \omega + i\eta$, are real, i.e. they lie in the complex $\omega$-plane below the real axis. (10.3.17) follows with the residue theorem by closure of the integration in the lower half-plane.
$^{14}$ In order to determine what the ratio of $t$ and $N$ must be to permit the use of the limit $N \to \infty$ even for finite $N$, the $N$-dependence of the poles $\omega_\nu$ must be found from $D(z) = 0$. The distance between the poles $\omega_\nu$ is $\Delta\omega_\nu \sim \frac{1}{N}$, and the values of the residues are of $O\!\left( \frac{1}{N} \right)$. The frequencies $\omega_\nu$ obey $\omega_{\nu+1} - \omega_\nu \sim \frac{\omega_{\max}}{N}$. For $t \ll \frac{N}{\omega_{\max}}$, the phase factors $e^{i\omega_\nu t}$ vary only weakly as a function of $\nu$, and the sum over $\nu$ in (10.3.17) can be replaced by an integral.

It then follows that
\[ \frac{a^3}{m}\, \frac{1}{2\pi^2 c^3} \int_0^{\Lambda c} d\nu\, \frac{\nu^2\, |c(\nu)|^2}{z^2 - \nu^2} = \frac{a^3}{m}\, \frac{1}{2\pi^2 c^3} \left[ -\int_0^{\Lambda c} d\nu\, |c(\nu)|^2 + z^2 \int_0^\infty d\nu\, \frac{|c(\nu)|^2}{z^2 - \nu^2} - z^2 \int_{\Lambda c}^\infty d\nu\, \frac{|c(\nu)|^2}{z^2 - \nu^2} \right] \tag{10.3.19} \]
with $\nu = c|\mathbf k|$. We now discuss the last equation term by term, making use of the simplification $|c(\nu)|^2 = g^2$ corresponding to $c_{\mathbf n} = g\, \delta_{\mathbf n, 0}$.

1st term of (10.3.19):
\[ -\frac{a^3}{m}\, \frac{1}{2\pi^2 c^3} \int_0^{\Lambda c} d\nu\, |c(\nu)|^2 = -g^2 \Lambda c\, \frac{a^3}{m\, 2\pi^2 c^3} . \tag{10.3.20} \]
This yields a renormalization of the oscillator frequency
\[ \bar\omega^2 = \Omega^2 - g^2 \Lambda c\, \frac{a^3}{m\, 2\pi^2 c^3}\, \frac{1}{M} . \tag{10.3.21} \]

2nd term of (10.3.19) and evaluation using the theorem of residues:
\[ \frac{a^3}{m}\, \frac{1}{2\pi^2 c^3}\, g^2 z^2 \int_0^\infty \frac{d\nu}{z^2 - \nu^2} = -M \Gamma\, i z \tag{10.3.22} \]
\[ \Gamma = \frac{g^2 a^3}{4\pi m c^3}\, \frac{1}{M} = c \Lambda\, \frac{m}{M} . \tag{10.3.23} \]

The third term of (10.3.19) is due to the high frequencies and affects the behavior at very short times. This effect is treated in problem 10.5, where a continuous cutoff function is employed. If we neglect it, we obtain from (10.3.16)
\[ M \left( -z^2 + \bar\omega^2 - i \Gamma z \right) \tilde Q(z) = M \left( \dot Q(0) - iz\, Q(0) \right) , \tag{10.3.24} \]
and, after transformation into the time domain for $t > 0$, we have the following equation of motion for $Q(t)$:
\[ \left( \frac{d^2}{dt^2} + \bar\omega^2 + \Gamma \frac{d}{dt} \right) Q(t) = 0 . \tag{10.3.25} \]
The coupling to the bath of oscillators leads to a frictional term and to irreversible damped motion. For example, let the initial values be $Q(0) = 0$, $\dot Q(t = 0) = \dot Q(0)$ (for the lattice oscillators, we have already set $q_{\mathbf n}(0) = \dot q_{\mathbf n}(0) = 0$); then from Eq. (10.3.24) it follows that
\[ \Theta(t)\, Q(t) = \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, e^{-izt}\, \frac{\dot Q(0)}{-z^2 + \bar\omega^2 - i \Gamma z} \tag{10.3.26} \]

and, using the theorem of residues,
\[ Q(t) = e^{-\Gamma t/2}\, \frac{\sin \omega_0 t}{\omega_0}\, \dot Q(0) , \tag{10.3.27} \]
with $\omega_0 = \sqrt{\bar\omega^2 - \frac{\Gamma^2}{4}}$. The conditions for the derivation of the irreversible equation of motion (10.3.25) were:

a) A limitation to times $t \ll \frac{N}{\omega_{\max}}$.$^{15}$ This implies practically no limitation for large $N$, since the exponential decay is much more rapid.
b) The separation into macroscopic variables ≡ massive oscillator (of mass $M$) and microscopic variables ≡ lattice oscillators (of mass $m$) leads, owing to $\frac{m}{M} \ll 1$, to a separation of time scales
\[ \Omega \ll \omega_{\max} , \quad \Gamma \ll \omega_{\max} . \]
The time scales of the macroscopic variables are $1/\Omega$, $1/\Gamma$.

The irreversibility (exponential damping) arises in going to the limit $N \to \infty$. In order to obtain irreversibility even at arbitrarily long times, the limit $N \to \infty$ must first be taken.

10.3.2 Microscopic Time-Reversible and Macroscopic Irreversible Equations of Motion, Hydrodynamics

The derivation of hydrodynamic equations of motion (Appendix H.) directly from the microscopic equations is based on the following elements:

(i) The point of departure is represented by the equations of motion for the conserved quantities and the equations of motion for the infinitely many nonconserved quantities.
(ii) An important precondition is the separation of time scales $ck \ll \omega_{\text{n.c.}}$, i.e. the characteristic frequencies of the conserved quantities $ck$ are much slower than the typical frequencies of the nonconserved quantities $\omega_{\text{n.c.}}$, analogous to the $\omega_\lambda$ ($\lambda > 5$) in the Boltzmann equation, Sect. 9.4.4. This permits the elimination of the rapid variables.

In the analytic treatment in Appendix H., one starts from the equations of motion for the so-called Kubo relaxation function $\phi$ and obtains equations of motion for the relaxation functions of the conserved quantities. From the one-to-one correspondence of equations of motion for $\phi$ and the time-dependent expectation values of operators under the influence of a perturbation, the hydrodynamic equations for the conserved quantities are obtained. The remaining variables express themselves in the form of damping terms, which can be expressed by Kubo formulas.

$^{15}$ These times, albeit long, are much shorter than the Poincaré recurrence time.
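As a numerical cross-check of the damped motion obtained in Sect. 10.3.1, the closed-form solution (10.3.27) can be compared with a direct integration of the equation of motion (10.3.25) for $Q(0) = 0$. The following is a minimal sketch under stated assumptions: the parameter values in the usage note are arbitrary illustrative numbers, the function names are hypothetical, and a standard fourth-order Runge–Kutta scheme is used, which is a choice made here and not part of the text.

```python
import math

def closed_form(t, v0, wbar, gamma):
    """Eq. (10.3.27) for Q(0) = 0 and dQ/dt(0) = v0 (underdamped case)."""
    w0 = math.sqrt(wbar**2 - gamma**2 / 4.0)
    return math.exp(-gamma * t / 2.0) * math.sin(w0 * t) / w0 * v0

def integrate(t_end, v0, wbar, gamma, steps=20000):
    """Integrate Q'' + gamma*Q' + wbar^2*Q = 0, eq. (10.3.25), with RK4."""
    def deriv(q, v):
        # first-order form: dq/dt = v, dv/dt = -gamma*v - wbar^2*q
        return v, -gamma * v - wbar**2 * q
    h = t_end / steps
    q, v = 0.0, v0
    for _ in range(steps):
        k1q, k1v = deriv(q, v)
        k2q, k2v = deriv(q + 0.5 * h * k1q, v + 0.5 * h * k1v)
        k3q, k3v = deriv(q + 0.5 * h * k2q, v + 0.5 * h * k2v)
        k4q, k4v = deriv(q + h * k3q, v + h * k3v)
        q += h * (k1q + 2 * k2q + 2 * k3q + k4q) / 6.0
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return q
```

For example, with $\bar\omega = 2$, $\Gamma = 0.3$ and $\dot Q(0) = 1$, `integrate(10.0, 1.0, 2.0, 0.3)` and `closed_form(10.0, 1.0, 2.0, 0.3)` should agree to within the integration error.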

∗10.4 The Master Equation and Irreversibility in Quantum Mechanics$^{16}$

We consider an isolated system and its density matrix at the time $t$, with probabilities $w_i(t)$:
\[ \varrho(t) = \sum_i w_i(t)\, |i\rangle \langle i| . \tag{10.4.1} \]

The states $|i\rangle$ are eigenstates of the Hamiltonian $H_0$. We let the quantum numbers $i$ represent the energy $E_i$ and a series of additional quantum numbers $\nu_i$. A perturbation $V$ also acts on the system or within it and causes transitions between the states; thus the overall Hamiltonian is
\[ H = H_0 + V . \tag{10.4.2} \]
For example, in a nearly ideal gas, $H_0$ could be the kinetic energy and $V$ the interaction which results from collisions of the atoms. We next consider the time development of $\varrho$ on the basis of (10.4.1) and denote the time-development operator by $U(\tau)$. After the time $\tau$ the density matrix has the form
\[ \begin{aligned} \varrho(t + \tau) &= \sum_i w_i(t)\, U(\tau)\, |i\rangle \langle i|\, U^\dagger(\tau) \\ &= \sum_i \sum_{j,k} w_i(t)\, |j\rangle \langle j|\, U(\tau)\, |i\rangle \langle i|\, U^\dagger(\tau)\, |k\rangle \langle k| \\ &= \sum_i \sum_{j,k} w_i(t)\, |j\rangle \langle k|\, U_{ji}(\tau)\, U_{ki}^*(\tau) , \end{aligned} \tag{10.4.3} \]
where the matrix elements
\[ U_{ji}(\tau) \equiv \langle j|\, U(\tau)\, |i\rangle \tag{10.4.4} \]
have been introduced. We assume that the system, even though it is practically isolated, is in fact subject to a phase averaging at each instant as a result of weak contacts to other macroscopic systems. This corresponds to taking the trace over other, unobserved degrees of freedom which are coupled to the system$^{17}$.

$^{16}$ W. Pauli, Sommerfeld Festschrift, S. Hirzel, Leipzig, 1928, p. 30.
$^{17}$ If for example every state $|j\rangle$ of the system is connected with a state $|2, j\rangle$ of these other macroscopic degrees of freedom, so that the contributions to the total density matrix are of the form $|2, j\rangle\, |j\rangle \langle k|\, \langle 2, k|$, then taking the trace over 2 leads to the diagonal form $|j\rangle \langle j|$. This stochastic nature, which is introduced through contact to the system's surroundings, is the decisive and subtle step in the derivation of the master equation. Cf. N. G. van Kampen, Physica 20, 603 (1954), and Fortschritte der Physik 4, 405 (1956).

Then the density matrix (10.4.3) is transformed to

\[ \sum_i \sum_j w_i(t)\, |j\rangle \langle j|\, U_{ji}(\tau)\, U_{ji}^*(\tau) . \tag{10.4.5} \]
Comparison with (10.4.1) shows that the probability for the state $|j\rangle$ at the time $t + \tau$ is thus
\[ w_j(t + \tau) = \sum_i w_i(t)\, |U_{ji}(\tau)|^2 , \]
and the change in the probability is
\[ w_j(t + \tau) - w_j(t) = \sum_i \left( w_i(t) - w_j(t) \right) |U_{ji}(\tau)|^2 , \tag{10.4.6} \]

where we have used $\sum_i |U_{ji}(\tau)|^2 = 1$. On the right-hand side, the term $i = j$ vanishes. We thus require only the nondiagonal elements of $U_{ij}(\tau)$, for which we can use the Golden Rule$^{18}$:
\[ |U_{ji}(\tau)|^2 = \frac{1}{\hbar^2} \left( \frac{\sin \omega_{ij} \tau / 2}{\omega_{ij} / 2} \right)^2 |\langle j|\, V\, |i\rangle|^2 = \tau\, \frac{2\pi}{\hbar}\, \delta(E_i - E_j)\, |\langle j|\, V\, |i\rangle|^2 \tag{10.4.7} \]
with $\omega_{ij} = (E_i - E_j)/\hbar$. The limit of validity of the Golden Rule is $\frac{2\pi\hbar}{\Delta E} \ll \tau \ll \frac{2\pi\hbar}{\delta\varepsilon}$, where $\Delta E$ is the width of the energy distribution of the states and $\delta\varepsilon$ is the spacing of the energy levels. From (10.4.6) and (10.4.7), it follows that
\[ \frac{dw_j(t)}{dt} = \sum_i \left( w_i(t) - w_j(t) \right) \frac{2\pi}{\hbar}\, \delta(E_i - E_j)\, |\langle j|\, V\, |i\rangle|^2 . \]
As already mentioned at the beginning of this section, the index $i \equiv (E_i, \nu_i)$ includes the quantum numbers of the energy and the $\nu_i$, the large number of all remaining quantum numbers. The sum over the energy eigenvalues on the right-hand side can be replaced by an integral with the density of states $\varrho(E_i)$ according to
\[ \sum_{E_i} \cdots = \int dE_i\, \varrho(E_i) \cdots \]
so that, making use of the $\delta$-function, we obtain:
\[ \frac{dw_{E_j \nu_j}(t)}{dt} = \sum_{\nu_i} \left( w_{E_j, \nu_i} - w_{E_j, \nu_j} \right) \frac{2\pi}{\hbar}\, \varrho(E_j)\, |\langle E_j, \nu_j|\, V\, |E_j, \nu_i\rangle|^2 . \tag{10.4.8} \]

$^{18}$ QM I, Eq. (16.36)
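The replacement of the squared bracket in (10.4.7) by a $\delta$-function rests on the fact that, as a function of $\omega$, the kernel $\left[ \sin(\omega\tau/2)/(\omega/2) \right]^2$ has height $\tau^2$, width of order $2\pi/\tau$, and area $2\pi\tau$; together with $\delta(\omega_{ij}) = \hbar\, \delta(E_i - E_j)$ this gives the factor $\tau\, 2\pi/\hbar$. The following minimal sketch checks the area numerically. The cutoff and step size of the midpoint rule are illustrative assumptions; the truncated tails contribute about $4/\text{cutoff}$, so the agreement is only approximate.

```python
import math

def kernel(omega, tau):
    """[sin(omega*tau/2) / (omega/2)]^2, with the omega -> 0 limit tau^2."""
    if omega == 0.0:
        return tau * tau
    return (math.sin(0.5 * omega * tau) / (0.5 * omega)) ** 2

def kernel_area(tau, cutoff=2000.0, step=0.02):
    """Midpoint-rule integral of the kernel over [-cutoff, cutoff]."""
    n = int(round(2.0 * cutoff / step))
    return step * sum(kernel(-cutoff + (j + 0.5) * step, tau) for j in range(n))
```

Calling `kernel_area(tau)` for several values of `tau` should give values close to `2 * math.pi * tau`, i.e. an area that grows linearly with $\tau$, which is what makes the transition probability proportional to the elapsed time.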

With the coefficients
\[ \lambda_{E_j, \nu_j; \nu_i} = \frac{2\pi}{\hbar}\, \varrho(E_j)\, |\langle E_j, \nu_j|\, V\, |E_j, \nu_i\rangle|^2 , \tag{10.4.9} \]
Pauli's master equation follows:
\[ \frac{dw_{E_j \nu_j}(t)}{dt} = \sum_{\nu_i} \lambda_{E_j, \nu_j; \nu_i} \left( w_{E_j, \nu_i}(t) - w_{E_j, \nu_j}(t) \right) . \tag{10.4.10} \]

This equation has the general structure
\[ \dot p_n = \sum_{n'} \left( W_{n n'}\, p_{n'} - W_{n' n}\, p_n \right) , \tag{10.4.11} \]
where the transition rates $W_{n n'} = W_{n' n}$ obey the so-called detailed balance condition$^{19}$
\[ W_{n n'}\, p_{n'}^{\rm eq} = W_{n' n}\, p_n^{\rm eq} \tag{10.4.12} \]
for the microcanonical ensemble, $p_n^{\rm eq} = p_{n'}^{\rm eq}$ for all $n$ and $n'$. One can show in general that Eq. (10.4.11) is irreversible and that the entropy
\[ S = -\sum_n p_n \log p_n \tag{10.4.13} \]
increases. With (10.4.11), we have
\[ \dot S = -\sum_{n, n'} \left( p_n \log p_n \right)' W_{n n'} \left( p_{n'} - p_n \right) = \sum_{n, n'} W_{n n'}\, p_{n'} \left[ \left( p_{n'} \log p_{n'} \right)' - \left( p_n \log p_n \right)' \right] , \]
where the prime denotes the derivative with respect to the argument, $(x \log x)' = \log x + 1$. By permutation of the summation indices $n$ and $n'$ and using the symmetry relation $W_{n n'} = W_{n' n}$, we obtain
\[ \dot S = \frac{1}{2} \sum_{n, n'} W_{n n'} \left( p_{n'} - p_n \right) \left[ \left( p_{n'} \log p_{n'} \right)' - \left( p_n \log p_n \right)' \right] > 0 , \tag{10.4.14} \]
where the inequality follows from the convexity of $x \log x$ (Fig. 10.4). The entropy continues to increase until $p_n = p_{n'}$ for all $n$ and $n'$. Here we assume that all the $n$ and $n'$ are connected via a chain of matrix elements. The isolated system described by the master equation (10.4.10) approaches the microcanonical equilibrium.

$^{19}$ See QM II, following Eq. (4.2.17).
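The monotonic increase of the entropy under the master equation (10.4.11) with symmetric rates can be illustrated numerically. The following is a minimal sketch, not taken from the text: the four-state system and its rate matrix in the usage note are hypothetical, and the time integration uses simple explicit Euler steps, which requires a step `dt` small enough that `dt` times each row sum of the rates stays below one.

```python
import math

def entropy(p):
    """S = -sum_n p_n log p_n, eq. (10.4.13), with the convention 0 log 0 = 0."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def step(p, rates, dt):
    """One explicit Euler step of eq. (10.4.11) for symmetric rates W[n][n']."""
    n = len(p)
    return [p[i] + dt * sum(rates[i][j] * (p[j] - p[i]) for j in range(n))
            for i in range(n)]

def relax(p0, rates, dt=0.01, steps=3000):
    """Return the final distribution and the entropy recorded after every step."""
    p, history = list(p0), [entropy(p0)]
    for _ in range(steps):
        p = step(p, rates, dt)
        history.append(entropy(p))
    return p, history
```

As a usage example, one may take four states connected in a chain, `rates = [[0, 1, 0, 0], [1, 0, 0.5, 0], [0, 0.5, 0, 2], [0, 0, 2, 0]]`, and start from `p0 = [1.0, 0.0, 0.0, 0.0]`. Since all states are connected via a chain of nonzero rates, the distribution relaxes towards equal probabilities and the recorded entropy rises monotonically towards $\log 4$.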

Fig. 10.4. The function $f(x) = x \log x$ is convex, $(x' - x)(f'(x') - f'(x)) > 0$

10.5 Probability and Phase-Space Volume

∗10.5.1 Probabilities and the Time Interval of Large Fluctuations

In the framework of equilibrium statistical mechanics one can calculate the probability that the system spontaneously takes on a constraint. In the context of the Gay-Lussac experiment, we found the probability that a system with a fixed particle number $N$ with a total volume $V$ would be found only within the subvolume $V_1$ (Eq. (3.5.5)):
\[ W(E, V_1) = e^{-(S(E,V) - S(E,V_1))/k} . \tag{10.5.1} \]
For an ideal gas$^{20}$, the entropy is $S(E, V) = kN \left( \log \frac{V}{N \lambda_T^3} + \frac{5}{2} \right)$. Since $E = \frac{3}{2} N k T$, $\lambda_T$ remains unchanged on expansion, and it follows that $W(E, V_1) = e^{-N \log \frac{V}{V_1}}$. This gives $\log \frac{V}{V_1} = -\log \frac{V - (V - V_1)}{V} \approx \frac{V - V_1}{V}$ for low compressions. At higher compressions, $V_1$ [...]

[...] The phase space of these states is the same size as the phase space at the time $t = 0$; it is thus considerably smaller than that of all the states which represent the macrostate at times $t > 0$. The state $T_t X$ contains complex correlations. The typical microstates of $M(t)$ lack these correlations. They become apparent upon time reversal. In the forward direction of time, in contrast, the future of such atypical microstates is just the same as that of the typical states.


Fig. 10.6. The entropy as a function of time in the expansion of a computer gas consisting of 864 atoms. In the initial stages, all the curves lie on top of one another. (1) The unperturbed expansion of V1 to V (solid curve). (2) Time reversal at t = 94.4 (dashed curve), the system returns to its initial state and the entropy to its initial value. (3) A perturbation # at t = 18.88 and time reversal at t = 30.68. The system approaches its initial state closely (dotted curve). (4) A perturbation # at t = 59 and time reversal at t = 70.8 (chain curve). Only for a short time after the time reversal does the entropy decrease; it then increases towards its equilibrium value.32

together within the original subvolume$^{30}$. It is apparent that the initial state which we defined at the beginning leads in the course of time to a state which is not typical of a gas with the density shown in Fig. 10.1 c) and a Maxwell distribution. A typical microstate for such a gas would never compress itself into a subvolume after a time reversal. States which develop in such a correlated manner and which are not typical will be termed quasi-equilibrium states$^{31}$, also called local quasi-equilibrium states during the intermediate stages of the time development. Quasi-equilibrium states have the property that their macroscopic appearance is not invariant under time reversal. Although these quasi-equilibrium states of isolated systems doubtless exist and their time-reversed counterparts can be visualized in the computer experiment, the latter would seem to have no significance in reality. Thus, why was Boltzmann nevertheless correct in his statement that the entropy $S_B$ always increases monotonically apart from small fluctuations?

$^{30}$ The associated "coarse-grained" Boltzmann entropy (10.7.1) decreases following the time reversal, curve (2) in Fig. 10.6. A time dependence of this type is not described by the Boltzmann equation and is also never observed in Nature.
$^{31}$ J. M. Blatt, An Alternative Approach to the Ergodic Pr


A theory is all the more impressive the simpler its premises, the greater the variety of phenomena it describes, and the broader its area of application. This is the reason for the profound impression made on me by classical thermodynamics. It is the only general physical theory of which I am convinced that, within its regime of applicability, it will never be overturned (this is for the special attention of the skeptics in principle). Albert Einstein

To my daughter Birgitta

Preface to the Second Edition

In this new edition, supplements, additional explanations and cross references have been added in numerous places, including additional problems and revised formulations of the problems. Figures have been redrawn and the layout improved. In all these additions I have pursued the goal of not changing the compact character of the book. I wish to thank Prof. W. Brewer for integrating these changes into his competent translation of the ﬁrst edition. I am grateful to all the colleagues and students who have made suggestions to improve the book as well as to the publisher, Dr. Thorsten Schneider and Mrs. J. Lenz for their excellent cooperation.

Munich, December 2005

F. Schwabl

Preface to the First Edition

This book deals with statistical mechanics. Its goal is to give a deductive presentation of the statistical mechanics of equilibrium systems based on a single hypothesis – the form of the microcanonical density matrix – as well as to treat the most important aspects of non-equilibrium phenomena. Beyond the fundamentals, the attempt is made here to demonstrate the breadth and variety of the applications of statistical mechanics. Modern areas such as renormalization group theory, percolation, stochastic equations of motion and their applications in critical dynamics are treated. A compact presentation was preferred wherever possible; it however requires no additional aids except for a knowledge of quantum mechanics. The material is made as understandable as possible by the inclusion of all the mathematical steps and a complete and detailed presentation of all intermediate calculations. At the end of each chapter, a series of problems is provided. Subsections which can be skipped over in a ﬁrst reading are marked with an asterisk; subsidiary calculations and remarks which are not essential for comprehension of the material are shown in small print. Where it seems helpful, literature citations are given; these are by no means complete, but should be seen as an incentive to further reading. A list of relevant textbooks is given at the end of each of the more advanced chapters. In the ﬁrst chapter, the fundamental concepts of probability theory and the properties of distribution functions and density matrices are presented. In Chapter 2, the microcanonical ensemble and, building upon it, basic quantities such as entropy, pressure and temperature are introduced. Following this, the density matrices for the canonical and the grand canonical ensemble are derived. The third chapter is devoted to thermodynamics. Here, the usual material (thermodynamic potentials, the laws of thermodynamics, cyclic processes, etc.) 
is treated, with special attention given to the theory of phase transitions, to mixtures and to border areas related to physical chemistry. Chapter 4 deals with the statistical mechanics of ideal quantum systems, including the Bose–Einstein condensation, the radiation field, and superfluids. In Chapter 5, real gases and liquids are treated (internal degrees of freedom, the van der Waals equation, mixtures). Chapter 6 is devoted to the subject of magnetism, including magnetic phase transitions. Furthermore, related phenomena such as the elasticity of rubber are presented. Chapter 7 deals with the theory of phase transitions and critical phenomena; following a general overview, the fundamentals of renormalization group theory are given. In addition, the Ginzburg–Landau theory is introduced, and percolation is discussed (as a topic related to critical phenomena). The remaining three chapters deal with non-equilibrium processes: Brownian motion, the Langevin and Fokker–Planck equations and their applications as well as the theory of the Boltzmann equation and from it, the H-Theorem and hydrodynamic equations. In the final chapter, dealing with the topic of irreversibility, fundamental considerations of how it occurs and of the transition to equilibrium are developed. In appendices, among other topics the Third Law and a derivation of the classical distribution function starting from quantum statistics are presented, along with the microscopic derivation of the hydrodynamic equations.

The book is recommended for students of physics and related areas from the 5th or 6th semester on. Parts of it may also be of use to teachers. It is suggested that students at first skip over the sections marked with asterisks or shown in small print, and thereby concentrate their attention on the essential core material.

This book evolved out of lecture courses given numerous times by the author at the Johannes Kepler Universität in Linz (Austria) and at the Technische Universität in Munich (Germany). Many coworkers have contributed to the production and correction of the manuscript: I. Wefers, E. Jörg-Müller, M. Hummel, A. Vilfan, J. Wilhelm, K. Schenk, S. Clar, P. Maier, B. Kaufmann, M. Bulenda, H. Schinz, and A. Wonhas. W. Gasser read the whole manuscript several times and made suggestions for corrections. Advice and suggestions from my former coworkers E. Frey and U. C. Täuber were likewise quite valuable. I wish to thank Prof. W. D. Brewer for his faithful translation of the text. I would like to express my sincere gratitude to all of them, along with those of my other associates who offered valuable assistance, as well as to Dr. H. J. Kölsch, representing the Springer-Verlag.

Munich, October 2002

F. Schwabl

Table of Contents

1.

2.

Basic Principles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2 A Brief Excursion into Probability Theory . . . . . . . . . . . . . . . . . 1.2.1 Probability Density and Characteristic Functions . . . . . 1.2.2 The Central Limit Theorem . . . . . . . . . . . . . . . . . . . . . . . 1.3 Ensembles in Classical Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . 1.3.1 Phase Space and Distribution Functions . . . . . . . . . . . . . 1.3.2 The Liouville Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.4 Quantum Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.4.1 The Density Matrix for Pure and Mixed Ensembles . . . 1.4.2 The Von Neumann Equation . . . . . . . . . . . . . . . . . . . . . . . ∗ 1.5 Additional Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ∗ 1.5.1 The Binomial and the Poisson Distributions . . . . . . . . . ∗ 1.5.2 Mixed Ensembles and the Density Matrix of Subsystems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Equilibrium Ensembles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.1 Introductory Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.2 Microcanonical Ensembles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.2.1 Microcanonical Distribution Functions and Density Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.2.2 The Classical Ideal Gas . . . . . . . . . . . . . . . . . . . . . . . . . . . ∗ 2.2.3 Quantum-mechanical Harmonic Oscillators and Spin Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.3 Entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.3.1 General Deﬁnition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.3.2 An Extremal Property of the Entropy . . . . . . . . . . . . . . . 2.3.3 Entropy of the Microcanonical Ensemble . . . . . . . . . . . . 2.4 Temperature and Pressure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.4.1 Systems in Contact: the Energy Distribution Function, Deﬁnition of the Temperature . . . . . . . . . . . . . . . . . . . . . .

1 1 4 4 7 9 9 11 14 14 15 16 16 19 21

25 25 26 26 30 33 35 35 36 37 38 38

XII

Table of Contents

2.4.2 On the Widths of the Distribution Functions of Macroscopic Quantities
2.4.3 External Parameters: Pressure
2.5 Properties of Some Non-interacting Systems
2.5.1 The Ideal Gas
∗ 2.5.2 Non-interacting Quantum Mechanical Harmonic Oscillators and Spins
2.6 The Canonical Ensemble
2.6.1 The Density Matrix
2.6.2 Examples: the Maxwell Distribution and the Barometric Pressure Formula
2.6.3 The Entropy of the Canonical Ensemble and Its Extremal Values
2.6.4 The Virial Theorem and the Equipartition Theorem
2.6.5 Thermodynamic Quantities in the Canonical Ensemble
2.6.6 Additional Properties of the Entropy
2.7 The Grand Canonical Ensemble
2.7.1 Systems with Particle Exchange
2.7.2 The Grand Canonical Density Matrix
2.7.3 Thermodynamic Quantities
2.7.4 The Grand Partition Function for the Classical Ideal Gas
∗ 2.7.5 The Grand Canonical Density Matrix in Second Quantization
Problems

3. Thermodynamics
3.1 Potentials and Laws of Equilibrium Thermodynamics
3.1.1 Definitions
3.1.2 The Legendre Transformation
3.1.3 The Gibbs–Duhem Relation in Homogeneous Systems
3.2 Derivatives of Thermodynamic Quantities
3.2.1 Definitions
3.2.2 Integrability and the Maxwell Relations
3.2.3 Jacobians
3.2.4 Examples
3.3 Fluctuations and Thermodynamic Inequalities
3.3.1 Fluctuations
3.3.2 Inequalities
3.4 Absolute Temperature and Empirical Temperatures
3.5 Thermodynamic Processes
3.5.1 Thermodynamic Concepts
3.5.2 The Irreversible Expansion of a Gas; the Gay-Lussac Experiment
3.5.3 The Statistical Foundation of Irreversibility

3.5.4 Reversible Processes
3.5.5 The Adiabatic Equation
3.6 The First and Second Laws of Thermodynamics
3.6.1 The First and the Second Law for Reversible and Irreversible Processes
∗ 3.6.2 Historical Formulations of the Laws of Thermodynamics and other Remarks
3.6.3 Examples and Supplements to the Second Law
3.6.4 Extremal Properties
∗ 3.6.5 Thermodynamic Inequalities Derived from Maximization of the Entropy
3.7 Cyclic Processes
3.7.1 General Considerations
3.7.2 The Carnot Cycle
3.7.3 General Cyclic Processes
3.8 Phases of Single-Component Systems
3.8.1 Phase-Boundary Curves
3.8.2 The Clausius–Clapeyron Equation
3.8.3 The Convexity of the Free Energy and the Concavity of the Free Enthalpy (Gibbs’ Free Energy)
3.8.4 The Triple Point
3.9 Equilibrium in Multicomponent Systems
3.9.1 Generalization of the Thermodynamic Potentials
3.9.2 Gibbs’ Phase Rule and Phase Equilibrium
3.9.3 Chemical Reactions, Thermodynamic Equilibrium and the Law of Mass Action
∗ 3.9.4 Vapor-pressure Increase by Other Gases and by Surface Tension
Problems

4. Ideal Quantum Gases
4.1 The Grand Potential
4.2 The Classical Limit $z = e^{\mu/kT} \ll 1$
4.3 The Nearly-degenerate Ideal Fermi Gas
4.3.1 Ground State, T = 0 (Degeneracy)
4.3.2 The Limit of Complete Degeneracy
∗ 4.3.3 Real Fermions
4.4 The Bose–Einstein Condensation
4.5 The Photon Gas
4.5.1 Properties of Photons
4.5.2 The Canonical Partition Function
4.5.3 Planck’s Radiation Law
∗ 4.5.4 Supplemental Remarks
∗ 4.5.5 Fluctuations in the Particle Number of Fermions and Bosons

4.6 Phonons in Solids
4.6.1 The Harmonic Hamiltonian
4.6.2 Thermodynamic Properties
∗ 4.6.3 Anharmonic Effects, the Mie–Grüneisen Equation of State
4.7 Phonons and Rotons in He II
4.7.1 The Excitations (Quasiparticles) of He II
4.7.2 Thermal Properties
∗ 4.7.3 Superfluidity and the Two-Fluid Model
Problems

5. Real Gases, Liquids, and Solutions
5.1 The Ideal Molecular Gas
5.1.1 The Hamiltonian and the Partition Function
5.1.2 The Rotational Contribution
5.1.3 The Vibrational Contribution
∗ 5.1.4 The Influence of the Nuclear Spin
∗ 5.2 Mixtures of Ideal Molecular Gases
5.3 The Virial Expansion
5.3.1 Derivation
5.3.2 The Classical Approximation for the Second Virial Coefficient
5.3.3 Quantum Corrections to the Virial Coefficients
5.4 The Van der Waals Equation of State
5.4.1 Derivation
5.4.2 The Maxwell Construction
5.4.3 The Law of Corresponding States
5.4.4 The Vicinity of the Critical Point
5.5 Dilute Solutions
5.5.1 The Partition Function and the Chemical Potentials
5.5.2 Osmotic Pressure
∗ 5.5.3 Solutions of Hydrogen in Metals (Nb, Pd,...)
5.5.4 Freezing-Point Depression, Boiling-Point Elevation, and Vapor-Pressure Reduction
Problems

6. Magnetism
6.1 The Density Matrix and Thermodynamics
6.1.1 The Hamiltonian and the Canonical Density Matrix
6.1.2 Thermodynamic Relations
6.1.3 Supplementary Remarks
6.2 The Diamagnetism of Atoms
6.3 The Paramagnetism of Non-coupled Magnetic Moments
6.4 Pauli Spin Paramagnetism

6.5 Ferromagnetism
6.5.1 The Exchange Interaction
6.5.2 The Molecular Field Approximation for the Ising Model
6.5.3 Correlation Functions and Susceptibility
6.5.4 The Ornstein–Zernike Correlation Function
∗ 6.5.5 Continuum Representation
∗ 6.6 The Dipole Interaction, Shape Dependence, Internal and External Fields
6.6.1 The Hamiltonian
6.6.2 Thermodynamics and Magnetostatics
6.6.3 Statistical–Mechanical Justification
6.6.4 Domains
6.7 Applications to Related Phenomena
6.7.1 Polymers and Rubber-like Elasticity
6.7.2 Negative Temperatures
∗ 6.7.3 The Melting Curve of ³He
Problems

7. Phase Transitions, Renormalization Group Theory, and Percolation
7.1 Phase Transitions and Critical Phenomena
7.1.1 Symmetry Breaking, the Ehrenfest Classification
∗ 7.1.2 Examples of Phase Transitions and Analogies
7.1.3 Universality
7.2 The Static Scaling Hypothesis
7.2.1 Thermodynamic Quantities and Critical Exponents
7.2.2 The Scaling Hypothesis for the Correlation Function
7.3 The Renormalization Group
7.3.1 Introductory Remarks
7.3.2 The One-Dimensional Ising Model, Decimation Transformation
7.3.3 The Two-Dimensional Ising Model
7.3.4 Scaling Laws
∗ 7.3.5 General RG Transformations in Real Space
∗ 7.4 The Ginzburg–Landau Theory
7.4.1 Ginzburg–Landau Functionals
7.4.2 The Ginzburg–Landau Approximation
7.4.3 Fluctuations in the Gaussian Approximation
7.4.4 Continuous Symmetry and Phase Transitions of First Order
∗ 7.4.5 The Momentum-Shell Renormalization Group
∗ 7.5 Percolation
7.5.1 The Phenomenon of Percolation
7.5.2 Theoretical Description of Percolation

7.5.3 Percolation in One Dimension
7.5.4 The Bethe Lattice (Cayley Tree)
7.5.5 General Scaling Theory
7.5.6 Real-Space Renormalization Group Theory
Problems

8. Brownian Motion, Equations of Motion and the Fokker–Planck Equations
8.1 Langevin Equations
8.1.1 The Free Langevin Equation
8.1.2 The Langevin Equation in a Force Field
8.2 The Derivation of the Fokker–Planck Equation from the Langevin Equation
8.2.1 The Fokker–Planck Equation for the Langevin Equation (8.1.1)
8.2.2 Derivation of the Smoluchowski Equation for the Overdamped Langevin Equation, (8.1.23)
8.2.3 The Fokker–Planck Equation for the Langevin Equation (8.1.22b)
8.3 Examples and Applications
8.3.1 Integration of the Fokker–Planck Equation (8.2.6)
8.3.2 Chemical Reactions
8.3.3 Critical Dynamics
∗ 8.3.4 The Smoluchowski Equation and Supersymmetric Quantum Mechanics
Problems

9. The Boltzmann Equation
9.1 Introduction
9.2 Derivation of the Boltzmann Equation
9.3 Consequences of the Boltzmann Equation
9.3.1 The H-Theorem and Irreversibility
∗ 9.3.2 Behavior of the Boltzmann Equation under Time Reversal
9.3.3 Collision Invariants and the Local Maxwell Distribution
9.3.4 Conservation Laws
9.3.5 The Hydrodynamic Equations in Local Equilibrium
∗ 9.4 The Linearized Boltzmann Equation
9.4.1 Linearization
9.4.2 The Scalar Product
9.4.3 Eigenfunctions of L and the Expansion of the Solutions of the Boltzmann Equation

9.4.4 The Hydrodynamic Limit
9.4.5 Solutions of the Hydrodynamic Equations
∗ 9.5 Supplementary Remarks
9.5.1 Relaxation-Time Approximation
9.5.2 Calculation of W(v1, v2; v1′, v2′)
Problems

10. Irreversibility and the Approach to Equilibrium
10.1 Preliminary Remarks
10.2 Recurrence Time
10.3 The Origin of Irreversible Macroscopic Equations of Motion
10.3.1 A Microscopic Model for Brownian Motion
10.3.2 Microscopic Time-Reversible and Macroscopic Irreversible Equations of Motion, Hydrodynamics
∗ 10.4 The Master Equation and Irreversibility in Quantum Mechanics
10.5 Probability and Phase-Space Volume
∗ 10.5.1 Probabilities and the Time Interval of Large Fluctuations
10.5.2 The Ergodic Theorem
10.6 The Gibbs and the Boltzmann Entropies and their Time Dependences
10.6.1 The Time Derivative of Gibbs’ Entropy
10.6.2 Boltzmann’s Entropy
10.7 Irreversibility and Time Reversal
10.7.1 The Expansion of a Gas
10.7.2 Description of the Expansion Experiment in µ-Space
10.7.3 The Influence of External Perturbations on the Trajectories of the Particles
∗ 10.8 Entropy Death or Ordered Structures?
Problems

Appendix
A. Nernst’s Theorem (Third Law)
A.1 Preliminary Remarks on the Historical Development of Nernst’s Theorem
A.2 Nernst’s Theorem and its Thermodynamic Consequences
A.3 Residual Entropy, Metastability, etc.
B. The Classical Limit and Quantum Corrections
B.1 The Classical Limit

B.2 Calculation of the Quantum-Mechanical Corrections
B.3 Quantum Corrections to the Second Virial Coefficient B(T)
C. The Perturbation Expansion
D. The Riemann ζ-Function and the Bernoulli Numbers
E. Derivation of the Ginzburg–Landau Functional
F. The Transfer Matrix Method
G. Integrals Containing the Maxwell Distribution
H. Hydrodynamics
H.1 Hydrodynamic Equations, Phenomenological Discussion
H.2 The Kubo Relaxation Function
H.3 The Microscopic Derivation of the Hydrodynamic Equations
I. Units and Tables

Subject Index

1. Basic Principles

1.1 Introduction

Statistical mechanics deals with the physical properties of systems which consist of a large number of particles, i.e. many-body systems, and it is based on the microscopic laws of nature. Examples of such many-body systems are gases, liquids, solids in their various forms (crystalline, amorphous), liquid crystals, biological systems, stellar matter, the radiation field, etc. Among their physical properties which are of interest are equilibrium properties (specific heat, thermal expansion, modulus of elasticity, magnetic susceptibility, etc.) and transport properties (thermal conductivity, electrical conductivity, etc.).

Long before it was provided with a solid basis by statistical mechanics, thermodynamics had been developed; it yields general relations between the macroscopic parameters of a system. The First Law of Thermodynamics was formulated by Robert Mayer in 1842. It states that the change in the energy content of a body is the sum of the work performed on it and the heat which is put into it:

$$ dE = \delta Q + \delta W . \tag{1.1.1} $$

The fact that heat is a form of energy, or more precisely, that energy can be transferred to a body in the form of heat, was tested experimentally by Joule in the years 1843–1849 (experiments with friction). The Second Law was formulated by Clausius and by Lord Kelvin (W. Thomson¹) in 1850. It is based on the fact that a particular state of a thermodynamic system can be reached through different ways of dividing up the energy transferred to it into work and heat, i.e. heat is not a “state variable” (a state variable is a physical quantity which is determined by the state of the system; this concept will be given a mathematically precise definition later). The essential new information in the Second Law was that there exists a state variable S, the entropy, which for reversible changes is related to the quantity of heat transferred by the equation

$$ \delta Q = T\, dS , \tag{1.1.2} $$

while for irreversible processes, $\delta Q < T\, dS$ holds. The Second Law is identical with the statement that a perpetual motion machine of the second kind is impossible to construct (this would be a periodically operating machine which performs work by only extracting heat from a single heat bath).

¹ Born W. Thomson; the additional name was assumed later in connection with his knighthood, granted in recognition of his scientific achievements.

The atomistic basis of thermodynamics was first recognized in the kinetic theory of dilute gases. The velocity distribution derived by Maxwell (1831–1879) permits the derivation of the caloric and thermal equations of state of ideal gases. Boltzmann (1844–1906) wrote the basic transport equation which bears his name in the year 1872. From it, he derived the entropy increase (H theorem) on approaching equilibrium. Furthermore, Boltzmann realized that the entropy depends on the number of states $W(E, V, \ldots)$ which are compatible with the macroscopic values of the energy E, the volume V, ... as given by the relation

$$ S \propto \log W(E, V, \ldots) . \tag{1.1.3} $$

It is notable that the atomistic foundations of the theory of gases were laid at a time when the atomic structure of matter had not yet been demonstrated experimentally; it was even regarded with considerable scepticism by well-known physicists such as E. Mach (1838–1916), who favored continuum theories.

The description of macroscopic systems in terms of statistical ensembles was justified by Boltzmann on the basis of the ergodic hypothesis. Fundamental contributions to thermodynamics and to the statistical theory of macroscopic systems were made by J. Gibbs (1839–1903) in the years 1870–1900.

Only after the formulation of quantum mechanics (1925) did the correct theory for the atomic regime become available. To distinguish it from classical statistical mechanics, the statistical mechanics based on the quantum theory is called quantum statistics. Many phenomena such as the electronic properties of solids, superconductivity, superfluidity, or magnetism can be explained only by applying quantum statistics.

Even today, statistical mechanics still belongs among the most active areas of theoretical physics: the theory of phase transitions, the theory of liquids, disordered solids, polymers, membranes, biological systems, granular matter, surfaces, interfaces, the theory of irreversible processes, systems far from equilibrium, nonlinear processes, structure formation in open systems, biological processes, and at present still magnetism and superconductivity are fields of active interest.

Following these remarks about the problems treated in statistical mechanics and its historical development, we now indicate some characteristic problems which play a role in the theory of macroscopic systems. Conventional macroscopic systems such as gases, liquids and solids at room temperature consist of $10^{19}$–$10^{23}$ particles per cm³. The number of quantum-mechanical eigenstates naturally increases with the number of particles. As we shall see later, the separation of the energy levels is of the order of $e^{-N}$, i.e. the energy levels are so densely spaced that even the smallest perturbation can transfer the system from one state to another one which has practically the same energy.

Fig. 1.1. Spacing of the energy levels for a large number of particles N.

Should we now set ourselves the goal of calculating the motion of the 3N coordinates in classical physics, or the time dependence of the wavefunctions in quantum mechanics, in order to compute temporal averages from them? Both programs would be impossible to carry out and are furthermore unnecessary. One can solve neither Newton’s equations nor the Schrödinger equation for $10^{19}$–$10^{23}$ particles. And even if we had the solutions, we would not know all the coordinates and velocities or all the quantum numbers required to determine the initial values. Furthermore, the detailed time development plays no role for the macroscopic properties which are of interest. In addition, even the weakest interaction (external perturbation), which would always be present even with the best possible isolation of the system from its environment, would lead to a change in the microscopic state without affecting the macroscopic properties.

For the following discussion, we need to define two concepts.

The microstate: it is defined by the wavefunction of the system in quantum mechanics, or by all the coordinates and momenta of the system in classical physics.

The macrostate: this is characterized by a few macroscopic quantities (energy, volume, ...).

From the preceding considerations it follows that the state of a macroscopic system must be described statistically. The fact that the system passes through a distribution of microstates during a measurement requires that we characterize the macrostate by giving the probabilities for the occurrence of particular microstates. The collection of all the microstates which represent a macrostate, weighted by their frequency of occurrence, is referred to as a statistical ensemble.

Although the state of a macroscopic system is characterized by a statistical ensemble, the predictions of macroscopic quantities are precise. Their mean values and mean square deviations are both proportional to the number of particles N. The relative fluctuations, i.e. the ratio of the root-mean-square deviations to the mean values, therefore decrease as $1/\sqrt{N}$ and tend towards zero in the thermodynamic limit (see (1.2.21c)).
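The vanishing of the relative fluctuations for large N can be illustrated numerically. The following is a minimal sketch and not part of the original text: it assumes N independent particles whose energies are drawn from an exponential distribution with unit mean (an arbitrary illustrative choice), for which the ratio of the standard deviation to the mean of the total energy is exactly $1/\sqrt{N}$. The function name `relative_fluctuation` is hypothetical.

```python
import math
import random


def relative_fluctuation(n_particles, n_samples=1000, seed=0):
    """Estimate std(E) / mean(E) for the total energy E of n_particles
    independent particles (assumed exponential single-particle energies)."""
    rng = random.Random(seed)
    totals = [
        sum(rng.expovariate(1.0) for _ in range(n_particles))
        for _ in range(n_samples)
    ]
    mean = sum(totals) / n_samples
    var = sum((t - mean) ** 2 for t in totals) / n_samples
    return math.sqrt(var) / mean


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # Second column: sampled ratio; third column: exact value 1/sqrt(n).
        print(n, relative_fluctuation(n), 1 / math.sqrt(n))
```

The sampled ratio should shrink by roughly a factor of $\sqrt{10}$ each time N is increased tenfold, up to statistical noise.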


1.2 A Brief Excursion into Probability Theory

At this point, we wish to collect a few basic mathematical definitions from probability theory, in order to derive the central limit theorem.²

1.2.1 Probability Density and Characteristic Functions

We first have to consider the meaning of the concept of a random variable. This refers to a quantity X which takes on values x depending upon the elements e of a “set of events” E. In each individual observation, the value of X is uncertain; instead, one knows only the probability for the occurrence of one of the possible results (events) from the set E. For example, in the case of an ideal die, the random variable is the number of spots, which can take on values between 1 and 6; each of these events has the probability 1/6. If we had precise knowledge of the initial position of the die and the forces acting on it during the throw, we could calculate the result from classical mechanics. Lacking such detailed information, we can make only the probability statement given above. Let $e \in E$ be an event from the set E and $P_e$ be its corresponding probability; then for a large number of attempts, N, the number of times $N_e$ that the event e occurs is related to $P_e$ by $\lim_{N \to \infty} N_e / N = P_e$.

Let X be a random variable. If the values x which X can assume are continuously distributed, we define the probability density of the random variable to be w(x). This means that w(x)dx is the probability that X assumes a value in the interval [x, x + dx]. The total probability must be one, i.e. w(x) is normalized to one:

$$ \int_{-\infty}^{+\infty} dx\, w(x) = 1 . \tag{1.2.1} $$
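The frequency interpretation $N_e / N \to P_e$ for the ideal die can be tried out directly. The sketch below is only an illustration under the assumption that a pseudo-random generator stands in for the die; the function name `relative_frequencies` is hypothetical.

```python
import random
from collections import Counter


def relative_frequencies(n_throws, seed=0):
    """Throw an ideal die n_throws times and return N_e / N for each face e."""
    rng = random.Random(seed)
    counts = Counter(rng.randint(1, 6) for _ in range(n_throws))
    return {face: counts[face] / n_throws for face in range(1, 7)}


if __name__ == "__main__":
    for n in (60, 6000, 600000):
        freqs = relative_frequencies(n)
        # Largest deviation of any relative frequency from P_e = 1/6.
        print(n, max(abs(f - 1 / 6) for f in freqs.values()))
```

The relative frequencies sum to one for every N, which is the discrete counterpart of the normalization (1.2.1).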

Definition 1: The mean value of X is defined by

$$ \langle X \rangle = \int_{-\infty}^{+\infty} dx\, w(x)\, x . \tag{1.2.2} $$

Now let F(X) be a function of the random variable X; one then calls F(X) a random function. Its mean value is defined corresponding to (1.2.2) by³

$$ \langle F(X) \rangle = \int dx\, w(x)\, F(x) . \tag{1.2.2'} $$

The powers of X have a particular importance: their mean values will be used to introduce the moments of the probability density.

² See e.g.: W. Feller, An Introduction to Probability Theory and its Applications, Vol. I (Wiley, New York 1968).

³ In the case that the limits of integration are not given, the integral is to be taken from −∞ to +∞. An analogous simplified notation will also be used for integrals over several variables.

Definition 2: The nth moment of the probability density w(x) is defined as

$$ \mu_n = \langle X^n \rangle . \tag{1.2.3} $$

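(The first moment of w(x) is simply the mean value of X.)

Definition 3: The mean square deviation (or variance) is defined by

$$ (\Delta x)^2 = \langle X^2 \rangle - \langle X \rangle^2 = \left\langle \left( X - \langle X \rangle \right)^2 \right\rangle . \tag{1.2.4} $$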
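Definitions 1 to 3 can be evaluated numerically for a concrete density. The sketch below assumes the illustrative density $w(x) = e^{-x}$ for $x \ge 0$ (not taken from the text) and a simple midpoint rule; the helper names `moment` and `w_exp` are hypothetical.

```python
import math


def moment(w, n, a, b, steps=100_000):
    """Approximate the n-th moment, the integral of w(x) * x**n over [a, b],
    with the midpoint rule."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h
        total += w(x) * x ** n
    return total * h


def w_exp(x):
    """Assumed example density w(x) = exp(-x) for x >= 0, normalized to one."""
    return math.exp(-x)


# The upper limit 50 stands in for infinity; the neglected tail is of order exp(-50).
norm = moment(w_exp, 0, 0.0, 50.0)   # normalization, cf. (1.2.1)
mean = moment(w_exp, 1, 0.0, 50.0)   # mean value, cf. (1.2.2)
mu2 = moment(w_exp, 2, 0.0, 50.0)    # second moment, cf. (1.2.3)
variance = mu2 - mean ** 2           # mean square deviation, cf. (1.2.4)

if __name__ == "__main__":
    print(norm, mean, mu2, variance)
```

For this density the exact values are 1 for the normalization, 1 for the mean, 2 for the second moment, and hence 1 for the variance; the numerical results agree to within the discretization error.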

Its square root is called the root-mean-square deviation or standard deviation.

Definition 4: Finally, we define the characteristic function:

$$ \chi(k) = \int dx\, e^{-ikx}\, w(x) \equiv \left\langle e^{-ikX} \right\rangle . \tag{1.2.5} $$

By taking its inverse Fourier transform, w(x) can be expressed in terms of χ(k):

$$ w(x) = \int \frac{dk}{2\pi}\, e^{ikx}\, \chi(k) . \tag{1.2.6} $$

Under the assumption that all the moments of the probability density w(x) exist, it follows from Eq. (1.2.5) that the characteristic function is

$$ \chi(k) = \sum_n \frac{(-ik)^n}{n!}\, \langle X^n \rangle . \tag{1.2.7} $$
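The moment expansion (1.2.7) can be compared with the integral definition (1.2.5) in a case where both are known in closed form. The sketch below assumes, purely for illustration, the uniform density on [0, 1], for which $\langle X^n \rangle = 1/(n+1)$ and $\chi(k) = (1 - e^{-ik})/(ik)$; the function names are hypothetical.

```python
import cmath
import math


def chi_exact(k):
    """chi(k) = integral of exp(-ikx) over [0, 1] = (1 - exp(-ik)) / (ik)."""
    if k == 0:
        return 1.0 + 0.0j
    return (1 - cmath.exp(-1j * k)) / (1j * k)


def chi_from_moments(k, order=30):
    """Moment expansion (1.2.7), truncated at the given order,
    using <X^n> = 1 / (n + 1) for the uniform density on [0, 1]."""
    return sum(
        (-1j * k) ** n / math.factorial(n) / (n + 1) for n in range(order + 1)
    )


if __name__ == "__main__":
    for k in (0.5, 2.0, 5.0):
        # The absolute difference should be at the level of rounding errors.
        print(k, abs(chi_exact(k) - chi_from_moments(k)))
```

For this density the series converges for every k; the truncation order 30 is an arbitrary choice that is ample for the values of k shown.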

If X has a discrete spectrum of values, i.e. the values $\xi_1, \xi_2, \ldots$ can occur with probabilities $p_1, p_2, \ldots$, the probability density has the form

$$ w(x) = p_1\, \delta(x - \xi_1) + p_2\, \delta(x - \xi_2) + \ldots . \tag{1.2.8} $$

Often, the probability density will have discrete and continuous regions.

In the case of multidimensional systems (those with several components) $\mathbf{X} = (X_1, X_2, \ldots)$, let $\mathbf{x} = (x_1, x_2, \ldots)$ be the values taken on by $\mathbf{X}$. Then the probability density (also called the joint probability density) is $w(\mathbf{x})$ and it has the following significance: $w(\mathbf{x})\, d\mathbf{x} \equiv w(\mathbf{x})\, dx_1 dx_2 \ldots dx_N$ is the probability of finding $\mathbf{x}$ in the hypercubic element $\mathbf{x}, \mathbf{x} + d\mathbf{x}$. We will also use the term probability distribution or, for short, simply the distribution.

Definition 5: The mean value of a function $F(\mathbf{X})$ of the random variables $\mathbf{X}$ is defined by

$$ \langle F(\mathbf{X}) \rangle = \int d\mathbf{x}\, w(\mathbf{x})\, F(\mathbf{x}) . \tag{1.2.9} $$

Theorem: The probability density of $F(\mathbf{X})$

A function F of the random variables $\mathbf{X}$ is itself a random variable, which can take on the values f corresponding to a probability density $w_F(f)$. The probability density $w_F(f)$ can be calculated from the probability density $w(\mathbf{x})$. We assert that:

$$ w_F(f) = \left\langle \delta(F(\mathbf{X}) - f) \right\rangle . \tag{1.2.10} $$
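Relation (1.2.10) can be checked in a simple one-dimensional case. The sketch below assumes, for illustration only, that X is uniformly distributed on [0, 1] and that $F(X) = X^2$. Then $\int_0^1 dx\, \delta(f - x^2) = 1/(2\sqrt{f})$ for $0 < f \le 1$, so the probability of finding F in an interval $[f_1, f_2]$ is $\sqrt{f_2} - \sqrt{f_1}$. The function names are hypothetical.

```python
import math
import random


def w_F(f):
    """Density of F = X**2 for X uniform on [0, 1], obtained from (1.2.10):
    the integral of delta(f - x**2) over [0, 1] equals 1 / (2 sqrt(f))."""
    return 1.0 / (2.0 * math.sqrt(f)) if 0.0 < f <= 1.0 else 0.0


def exact_probability(f_lo, f_hi):
    """Integral of w_F(f) from f_lo to f_hi, i.e. sqrt(f_hi) - sqrt(f_lo)."""
    return math.sqrt(f_hi) - math.sqrt(f_lo)


def sampled_probability(f_lo, f_hi, n_samples=200_000, seed=0):
    """Fraction of sampled values x**2 that fall into [f_lo, f_hi)."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_samples) if f_lo <= rng.random() ** 2 < f_hi)
    return hits / n_samples


if __name__ == "__main__":
    print(exact_probability(0.25, 0.36), sampled_probability(0.25, 0.36))
```

The two printed numbers should agree up to the statistical error of the sampling.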

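Proof: We express the probability density $w_F(f)$ in terms of its characteristic function

$$ w_F(f) = \int \frac{dk}{2\pi}\, e^{ikf} \sum_n \frac{(-ik)^n}{n!}\, \langle F^n \rangle . $$

If we insert $\langle F^n \rangle = \int d\mathbf{x}\, w(\mathbf{x})\, F(\mathbf{x})^n$, we find

$$ w_F(f) = \int \frac{dk}{2\pi}\, e^{ikf} \int d\mathbf{x}\, w(\mathbf{x})\, e^{-ikF(\mathbf{x})} $$

and, after making use of the Fourier representation of the δ-function $\delta(y) = \int \frac{dk}{2\pi}\, e^{iky}$, we finally obtain

$$ w_F(f) = \int d\mathbf{x}\, w(\mathbf{x})\, \delta(f - F(\mathbf{x})) = \left\langle \delta(F(\mathbf{X}) - f) \right\rangle , $$

i.e. Eq. (1.2.10).

Definition 6: For multidimensional systems we define correlations

$$ K_{ij} = \left\langle (X_i - \langle X_i \rangle)(X_j - \langle X_j \rangle) \right\rangle \tag{1.2.11} $$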

of the random variables $X_i$ and $X_j$. These indicate to what extent fluctuations (deviations from the mean value) of $X_i$ and $X_j$ are correlated. If the probability density has the form

$$ w(\mathbf{x}) = w_i(x_i)\, w'(\{x_k, k \neq i\}) , $$

where $w'(\{x_k, k \neq i\})$ does not depend on $x_i$, then $K_{ij} = 0$ for $j \neq i$, i.e. $X_i$ and $X_j$ are not correlated. In the special case

$$ w(\mathbf{x}) = w_1(x_1) \cdots w_N(x_N) , $$

the stochastic variables $X_1, \ldots, X_N$ are completely uncorrelated.

Let $P_n(x_1, \ldots, x_{n-1}, x_n)$ be the probability density of the random variables $X_1, \ldots, X_{n-1}, X_n$. Then the probability density for a subset of these random variables is given by integration of $P_n$ over the range of values of the remaining random variables; e.g. the probability density $P_{n-1}(x_1, \ldots, x_{n-1})$ for the random variables $X_1, \ldots, X_{n-1}$ is

$$ P_{n-1}(x_1, \ldots, x_{n-1}) = \int dx_n\, P_n(x_1, \ldots, x_{n-1}, x_n) . $$

Finally, we introduce the concept of conditional probability and the conditional probability density.
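The statement that the correlation (1.2.11) vanishes when the probability density factorizes can be illustrated by sampling. The sketch below assumes Gaussian random numbers as an arbitrary example: $X_1$ and $X_2$ are drawn independently, while $X_3 = X_1 + \tfrac{1}{2}\xi$ (with independent Gaussian ξ) is constructed to depend on $X_1$, so that $K_{13} = 1$. The names `correlation` and `demo` are hypothetical.

```python
import random


def correlation(xs, ys):
    """Estimate K = <(X - <X>)(Y - <Y>)> from paired samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def demo(n_samples=100_000, seed=0):
    """Return estimates of (K_12, K_13) for the assumed example variables."""
    rng = random.Random(seed)
    x1 = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    x2 = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]  # independent of x1
    x3 = [a + 0.5 * rng.gauss(0.0, 1.0) for a in x1]      # depends on x1
    return correlation(x1, x2), correlation(x1, x3)


if __name__ == "__main__":
    k12, k13 = demo()
    print(k12, k13)  # k12 close to 0, k13 close to 1, up to sampling noise
```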

Definition 7: Let $P_n(x_1, \ldots, x_n)$ be the probability (density). The conditional probability (density) $P_{k|n-k}(x_1, \ldots, x_k | x_{k+1}, \ldots, x_n)$ is defined as the probability (density) of the random variables $x_1, \ldots, x_k$, if the remaining variables $x_{k+1}, \ldots, x_n$ have given values. We find

$$ P_{k|n-k}(x_1, \ldots, x_k | x_{k+1}, \ldots, x_n) = \frac{P_n(x_1, \ldots, x_n)}{P_{n-k}(x_{k+1}, \ldots, x_n)} , \tag{1.2.12} $$

where

$$ P_{n-k}(x_{k+1}, \ldots, x_n) = \int dx_1 \ldots dx_k\, P_n(x_1, \ldots, x_n) . $$

Note concerning conditional probability: formula (1.2.12) is usually introduced as a definition in the mathematical literature, but it can be deduced in the following way, if one identifies the probabilities with statistical frequencies: $P_n(x_1,\dots,x_k,x_{k+1},\dots,x_n)$ for fixed $x_{k+1},\dots,x_n$ determines the frequencies of the $x_1,\dots,x_k$ with given values of $x_{k+1},\dots,x_n$. The probability density which corresponds to these frequencies is therefore proportional to $P_n(x_1,\dots,x_k,x_{k+1},\dots,x_n)$. Since $\int dx_1\dots dx_k\, P_n(x_1,\dots,x_k,x_{k+1},\dots,x_n) = P_{n-k}(x_{k+1},\dots,x_n)$, the conditional probability density normalized to one is then

$$P_{k|n-k}(x_1,\dots,x_k|x_{k+1},\dots,x_n) = \frac{P_n(x_1,\dots,x_n)}{P_{n-k}(x_{k+1},\dots,x_n)}\,.$$
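The normalization argument in the note can be illustrated for discrete variables, where the integral over $x_1$ becomes a sum. The following is only a minimal sketch: the joint probabilities in `JOINT` and the function names are hypothetical and chosen purely for illustration.

```python
from fractions import Fraction

# Hypothetical joint probabilities P_2(x1, x2) of two discrete variables
# (illustrative numbers only; they sum to one).
JOINT = {
    (0, 0): Fraction(1, 8),
    (0, 1): Fraction(3, 8),
    (1, 0): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
}


def marginal_x2(joint, x2):
    """P_1(x2): sum the joint distribution over x1 (discrete analogue of the integral)."""
    return sum(p for (_, b), p in joint.items() if b == x2)


def conditional_x1_given_x2(joint, x1, x2):
    """P_{1|1}(x1 | x2) = P_2(x1, x2) / P_1(x2), the discrete form of (1.2.12)."""
    return joint[(x1, x2)] / marginal_x2(joint, x2)


# For each fixed x2, the conditional probabilities of x1 are normalized to one.
for x2 in (0, 1):
    row = [conditional_x1_given_x2(JOINT, x1, x2) for x1 in (0, 1)]
    print(x2, row, sum(row))
```

Dividing by the marginal is exactly the step that turns the frequencies at fixed $x_2$ into a distribution normalized to one.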

1.2.2 The Central Limit Theorem

Let there be mutually independent random variables $X_1, X_2, \dots, X_N$ which are characterized by common but independent probability distributions $w(x_1), w(x_2), \dots, w(x_N)$. Suppose that the mean value and the variance of $X_1, X_2, \dots, X_N$ exist. We require the probability density for the sum

$$Y = X_1 + X_2 + \dots + X_N \tag{1.2.13}$$

in the limit $N \to \infty$. As we shall see, the probability density for $Y$ is given by a Gaussian distribution. Examples of applications of this situation are

a) A system of non-interacting particles: $X_i$ = energy of the $i$-th particle, $Y$ = total energy of the system.
b) The random walk: $X_i$ = distance covered in the $i$-th step, $Y$ = location after $N$ steps.

In order to carry out the computation of the probability density of $Y$ in a convenient way, it is expedient to introduce the random variable $Z$:

$$Z = \sum_i \big(X_i - \langle X\rangle\big)/\sqrt{N} = \big(Y - N\langle X\rangle\big)/\sqrt{N}\,, \tag{1.2.14}$$

where $\langle X\rangle \equiv \langle X_1\rangle = \dots = \langle X_N\rangle$ by definition.


1. Basic Principles

From (1.2.10), the probability density $w_Z(z)$ of the random variable $Z$ is given by

$$\begin{aligned}
w_Z(z) &= \int dx_1\dots dx_N\, w(x_1)\dots w(x_N)\, \delta\Big(z - \frac{x_1+\dots+x_N}{\sqrt N} + \sqrt N\langle X\rangle\Big)\\
&= \int\frac{dk}{2\pi}\, e^{ikz}\int dx_1\dots dx_N\, w(x_1)\dots w(x_N)\, e^{-ik(x_1+\dots+x_N)/\sqrt N + ik\sqrt N\langle X\rangle}\\
&= \int\frac{dk}{2\pi}\, e^{ikz + ik\sqrt N\langle X\rangle}\Big[\chi\Big(\frac{k}{\sqrt N}\Big)\Big]^N ,
\end{aligned} \tag{1.2.15}$$

where $\chi(q)$ is the characteristic function of $w(x)$. The representation (1.2.7) of the characteristic function in terms of the moments of the probability density can be reformulated by taking the logarithm of the expansion in moments,

$$\chi(q) = \exp\Big[-iq\langle X\rangle - \frac12 q^2(\Delta x)^2 + \dots q^3 + \dots\Big]\,, \tag{1.2.16}$$

i.e. in general

$$\chi(q) = \exp\Big[\sum_{n=1}^{\infty}\frac{(-iq)^n}{n!}\,C_n\Big]\,. \tag{1.2.16'}$$

In contrast to (1.2.7), in (1.2.16') the logarithm of the characteristic function is expanded in a power series. The expansion coefficients $C_n$ which occur in this series are called cumulants of the $n$th order. They can be expressed in terms of the moments (1.2.3); the three lowest take on the forms:

$$\begin{aligned}
C_1 &= \langle X\rangle = \mu_1\\
C_2 &= (\Delta x)^2 = \langle X^2\rangle - \langle X\rangle^2 = \mu_2 - \mu_1^2\\
C_3 &= \langle X^3\rangle - 3\langle X^2\rangle\langle X\rangle + 2\langle X\rangle^3 = \mu_3 - 3\mu_1\mu_2 + 2\mu_1^3\,.
\end{aligned} \tag{1.2.17}$$

The relations (1.2.17) between the cumulants and the moments can be obtained by expanding the exponential function in (1.2.16) or in (1.2.16') and comparing the coefficients of the Taylor series with (1.2.7). Inserting (1.2.16) into (1.2.15) yields

$$w_Z(z) = \int\frac{dk}{2\pi}\, e^{ikz - \frac12 k^2(\Delta x)^2 + \dots k^3 N^{-1/2} + \dots}\,. \tag{1.2.18}$$

From this, neglecting the terms which vanish for large $N$ as $1/\sqrt N$ or more rapidly, we obtain

$$w_Z(z) = \big(2\pi(\Delta x)^2\big)^{-1/2}\, e^{-\frac{z^2}{2(\Delta x)^2}} \tag{1.2.19}$$

and finally, using $w_Y(y)\,dy = w_Z(z)\,dz$ for the probability density of the random variable $Y$,

$$w_Y(y) = \big(2\pi N(\Delta x)^2\big)^{-1/2}\, e^{-\frac{(y - \langle X\rangle N)^2}{2(\Delta x)^2 N}}\,. \tag{1.2.20}$$


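This is the central limit theorem: $w_Y(y)$ is a Gaussian distribution, although we did not in any way assume that $w(x)$ was such a distribution, with

mean value: $\langle Y\rangle = N\langle X\rangle$ (1.2.21a)

standard deviation: $\Delta y = \Delta x\,\sqrt N$ (1.2.21b)

relative deviation: $\dfrac{\Delta y}{\langle Y\rangle} = \dfrac{\Delta x\,\sqrt N}{N\langle X\rangle} = \dfrac{\Delta x}{\langle X\rangle\sqrt N}$ . (1.2.21c)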

The central limit theorem provides the mathematical basis for the fact that in the limiting case of large N , predictions about Y become sharp. From (1.2.21c), the relative deviation, i.e. the ratio of the standard deviation to the mean value, approaches zero in the limit of large N .
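The statements (1.2.20) and (1.2.21a–c) can be checked numerically without any sampling by convolving a single-variable distribution with itself $N$ times. The sketch below assumes a hypothetical, deliberately non-Gaussian distribution `w` on the integer values 0, 1, 2; the helper names and the choice $N = 200$ are illustrative only.

```python
import math


def convolve(p, q):
    """Distribution of the sum of two independent integer-valued variables."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out


def mean_and_std(p):
    """Mean value and standard deviation of a distribution on 0, 1, 2, ..."""
    mean = sum(k * p_k for k, p_k in enumerate(p))
    var = sum((k - mean) ** 2 * p_k for k, p_k in enumerate(p))
    return mean, math.sqrt(var)


def sum_distribution(w, n):
    """Exact distribution of Y = X_1 + ... + X_n for independent X_i with distribution w."""
    dist = [1.0]
    for _ in range(n):
        dist = convolve(dist, w)
    return dist


# Hypothetical single-variable distribution (clearly not Gaussian).
w = [0.5, 0.2, 0.3]
N = 200

mean_x, dx = mean_and_std(w)
w_y = sum_distribution(w, N)
mean_y, dy = mean_and_std(w_y)


def gaussian(y):
    """Right-hand side of (1.2.20)."""
    return math.exp(-(y - N * mean_x) ** 2 / (2 * N * dx ** 2)) / math.sqrt(2 * math.pi * N * dx ** 2)


# Largest pointwise deviation between the exact distribution and the Gaussian.
max_dev = max(abs(w_y[y] - gaussian(y)) for y in range(len(w_y)))

print("mean:", mean_y, "expected:", N * mean_x)
print("standard deviation:", dy, "expected:", dx * math.sqrt(N))
print("relative deviation:", dy / mean_y)
print("max deviation from Gaussian:", max_dev)
```

The mean value and the standard deviation obey (1.2.21a,b) for every $N$; only the Gaussian shape itself is an asymptotic statement, which is why the last number is small but not zero.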

1.3 Ensembles in Classical Statistics

Although the correct theory in the atomic regime is based on quantum mechanics, and classical statistics can be derived from quantum statistics, it is more intuitive to develop classical statistics from the beginning, in parallel to quantum statistics. Later, we shall derive the classical distribution function within its range of validity from quantum statistics.

1.3.1 Phase Space and Distribution Functions

We consider $N$ particles in three dimensions with coordinates $q_1,\dots,q_{3N}$ and momenta $p_1,\dots,p_{3N}$. Let us define phase space, also called Γ space, as the space which is spanned by the $6N$ coordinates and momenta. A microscopic state is represented by a point in the Γ space and the motion of the overall system by a curve in phase space (Fig. 1.2), which is also termed a phase-space orbit or phase-space trajectory. As an example, we consider the one-dimensional harmonic oscillator

$$q = q_0\cos\omega t\,, \qquad p = -m q_0\,\omega\sin\omega t\,, \tag{1.3.1}$$

whose orbit in phase space is shown in Fig. 1.3. For large N , the phase space is a space of many dimensions. As a rule, our knowledge of such a system is not suﬃcient to determine its position in phase space. As already mentioned in the introductory section 1.1, a macrostate characterized by macroscopic values such as that of its energy E, volume V , number of particles N etc., can be generated equally well by any one of a large number of microstates, i.e. by a large number of points in phase space. Instead of singling out just one of these microstates arbitrarily, we consider all of them, i.e. an ensemble of systems which all represent one and the same macrostate but which contains all of the corresponding possible microstates.


Fig. 1.2. A trajectory in phase space. Here, q and p represent the 6N coordinates and momenta q1 , . . . , q3N and p1 , . . . , p3N .

Fig. 1.3. The phase-space orbit of the one-dimensional harmonic oscillator.

The weight with which a point (q, p) ≡ (q1 , . . . , q3N , p1 , . . . , p3N ) occurs at the time t is given by the probability density ρ(q, p, t). The introduction of this probability density is now not at all just an expression of our lack of knowledge of the detailed form of the microstates, but rather it has the following physical basis: every realistic macroscopic system, even with the best insulation from its surroundings, experiences an interaction with its environment. This interaction is to be sure so weak that it does not aﬀect the macroscopic properties of the system, i.e. the macrostate remains unchanged, but it induces the system to change its microstate again and again and thus causes it for example to pass through a distribution of microstates during a measurement process. These states, which are occupied during a short time interval, are collected together in the distribution ρ(q, p). This distribution thus describes not only the statistical properties of a ﬁctitious ensemble of many copies of the system considered in its diverse microstates, but also each individual system. Instead of considering the sequential stochastic series of these microstates in terms of time-averaged values, we can observe the simultaneous time development of the whole ensemble. It will be a major task in the following chapter to determine the distribution functions which correspond to particular physical situations. To this end, knowledge of the equation of motion which we derive in the next section will prove to be very important. For large N , we know only the probability distribution ρ(q, p, t). Here,

$$\rho(q,p,t)\,dq\,dp \equiv \rho(q_1,\dots,q_{3N},p_1,\dots,p_{3N},t)\prod_{i=1}^{3N} dq_i\,dp_i \tag{1.3.2}$$

is the probability of finding a system of the ensemble (or the individual systems in the course of the observation) at time $t$ within the phase-space volume element $dq\,dp$ in the neighborhood of the point $q, p$ in Γ space. $\rho(q,p,t)$ is called the distribution function. It must be positive, $\rho(q,p,t) \ge 0$, and normalizable. Here, $q, p$ stand for the whole of the coordinates and momenta $q_1,\dots,q_{3N}, p_1,\dots,p_{3N}$.

1.3.2 The Liouville Equation

We now wish to determine the time dependence of $\rho(q,p,t)$, beginning with the initial distribution $W(q_0,p_0)$ at time $t = 0$ on the basis of the classical Hamiltonian $H$. We shall assume that the system is closed. The following results are however also valid when $H$ contains time-dependent external forces. We first consider a system whose coordinates in phase space at $t = 0$ are $q_0$ and $p_0$. The associated trajectory in phase space, which follows from the Hamiltonian equations of motion, is denoted by $q(t;q_0,p_0)$, $p(t;q_0,p_0)$, with the initial values of the trajectories given here explicitly. For a single trajectory, the probability density of the coordinates $q$ and the momenta $p$ has the form

$$\delta\big(q - q(t;q_0,p_0)\big)\,\delta\big(p - p(t;q_0,p_0)\big)\,. \tag{1.3.3}$$

Here, $\delta(k) \equiv \delta(k_1)\dots\delta(k_{3N})$. The initial values are however in general not precisely known; instead, there is a distribution of values, $W(q_0,p_0)$. In this case, the probability density in phase space at the time $t$ is found by multiplication of (1.3.3) by $W(q_0,p_0)$ and integration over the initial values:

$$\rho(q,p,t) = \int dq_0\int dp_0\, W(q_0,p_0)\,\delta\big(q - q(t;q_0,p_0)\big)\,\delta\big(p - p(t;q_0,p_0)\big)\,. \tag{1.3.3'}$$

We wish to derive an equation of motion for $\rho(q,p,t)$. To this end, we use the Hamiltonian equations of motion

$$\dot q_i = \frac{\partial H}{\partial p_i}\,, \qquad \dot p_i = -\frac{\partial H}{\partial q_i}\,. \tag{1.3.4}$$

The velocity in phase space

$$v = (\dot q, \dot p) = \Big(\frac{\partial H}{\partial p}, -\frac{\partial H}{\partial q}\Big) \tag{1.3.4'}$$

fulfills the equation

$$\operatorname{div} v \equiv \sum_i\Big(\frac{\partial\dot q_i}{\partial q_i} + \frac{\partial\dot p_i}{\partial p_i}\Big) = \sum_i\Big(\frac{\partial^2 H}{\partial q_i\,\partial p_i} - \frac{\partial^2 H}{\partial p_i\,\partial q_i}\Big) = 0\,. \tag{1.3.5}$$


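That is, the motion in phase space can be treated intuitively as the “flow” of an incompressible “fluid”. Taking the time derivative of (1.3.3'), we find

$$\frac{\partial\rho(q,p,t)}{\partial t} = -\sum_i\int dq_0\,dp_0\, W(q_0,p_0)\Big(\dot q_i(t;q_0,p_0)\frac{\partial}{\partial q_i} + \dot p_i(t;q_0,p_0)\frac{\partial}{\partial p_i}\Big)\,\delta\big(q - q(t;q_0,p_0)\big)\,\delta\big(p - p(t;q_0,p_0)\big)\,. \tag{1.3.6}$$

Expressing the velocity in phase space in terms of (1.3.4), employing the δ-functions in (1.3.6), and finally using (1.3.3') and (1.3.5), we obtain the following representations of the equation of motion for $\rho(q,p,t)$:

$$\begin{aligned}
\frac{\partial\rho}{\partial t} &= -\sum_i\Big(\frac{\partial}{\partial q_i}\rho\,\dot q_i + \frac{\partial}{\partial p_i}\rho\,\dot p_i\Big)\\
&= -\sum_i\Big(\frac{\partial\rho}{\partial q_i}\dot q_i + \frac{\partial\rho}{\partial p_i}\dot p_i\Big)\\
&= \sum_i\Big(-\frac{\partial\rho}{\partial q_i}\frac{\partial H}{\partial p_i} + \frac{\partial\rho}{\partial p_i}\frac{\partial H}{\partial q_i}\Big)\,.
\end{aligned} \tag{1.3.7}$$

Making use of the Poisson bracket notation$^4$, the last line of Eq. (1.3.7) can also be written in the form

$$\frac{\partial\rho}{\partial t} = -\{H,\rho\}\,. \tag{1.3.8}$$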

This is the Liouville equation, the fundamental equation of motion of the classical distribution function $\rho(q,p,t)$.

Additional remarks: We discuss some equivalent representations of the Liouville equation and their consequences.

(i) The first line of the series of equations (1.3.7) can be written in abbreviated form as an equation of continuity

$$\frac{\partial\rho}{\partial t} = -\operatorname{div}(v\rho)\,. \tag{1.3.9}$$

One can imagine the motion of the ensemble in phase space to be like the flow of a fluid. Then (1.3.9) is the equation of continuity for the density and Eq. (1.3.5) shows that the fluid is incompressible.

4 $\{u,v\} \equiv \sum_i\Big[\dfrac{\partial u}{\partial p_i}\dfrac{\partial v}{\partial q_i} - \dfrac{\partial u}{\partial q_i}\dfrac{\partial v}{\partial p_i}\Big]$

(ii) We once more take up the analogy of motion in phase space to fluid hydrodynamics: in our previous discussion, we considered the density at a fixed point $q, p$ in Γ space. However, we could also consider the motion from the point of view of an observer moving with the “flow”, i.e. we could ask for the time dependence of $\rho(q(t),p(t),t)$ (omitting the initial values of the coordinates, $q_0$ and $p_0$, for brevity). The second line of Eq. (1.3.7) can also be expressed in the form

$$\frac{d}{dt}\,\rho\big(q(t),p(t),t\big) = 0\,. \tag{1.3.10}$$

Hence, the distribution function is constant along a trajectory in phase space.

(iii) We now investigate the change of a volume element $d\Gamma$ in phase space. At $t = 0$, let a number $dN$ of representatives of the ensemble be uniformly distributed within a volume element $d\Gamma_0$. Owing to the motion in phase space, they occupy a volume $d\Gamma$ at the time $t$. This means that the density $\rho$ at $t = 0$ is given by $dN/d\Gamma_0$, while at time $t$, it is $dN/d\Gamma$. From (1.3.10), the equality of these two quantities follows, from which we find (Fig. 1.4) that their volumes are the same:

$$d\Gamma = d\Gamma_0\,. \tag{1.3.11}$$

Equation (1.3.11) is known in mechanics as the Liouville theorem.$^5$ There, it is calculated from the Jacobian with the aid of the theory of canonical transformations. Reversing this process, we can begin with Eq. (1.3.11) and derive Eq. (1.3.10) and the Liouville equation (1.3.8).

Fig. 1.4. The time dependence of an element in phase space; its volume remains constant.
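The conservation of phase-space volume can be made concrete for the one-dimensional harmonic oscillator, whose flow is known in closed form and reduces to (1.3.1) for $p_0 = 0$. The following is a minimal sketch, assuming hypothetical values for $m$, $\omega$ and the initial cell; because the flow is linear, a polygon of initial points is mapped onto a polygon, so its area can be computed exactly from the corners.

```python
import math


def flow(q0, p0, t, m=1.0, omega=2.0):
    """Exact harmonic-oscillator trajectory starting from (q0, p0) at t = 0."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    q = q0 * c + p0 / (m * omega) * s
    p = p0 * c - m * omega * q0 * s
    return q, p


def polygon_area(points):
    """Area of a polygon in the (q, p) plane from the shoelace formula."""
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        q1, p1 = points[i]
        q2, p2 = points[(i + 1) % n]
        twice_area += q1 * p2 - q2 * p1
    return abs(twice_area) / 2.0


# Hypothetical initial cell dGamma_0: a small rectangle in phase space.
cell0 = [(1.0, 0.0), (1.2, 0.0), (1.2, 0.3), (1.0, 0.3)]

# Transport the corners with the flow and record the area at several times.
times = (0.0, 0.4, 1.3, 5.0)
areas = [polygon_area([flow(q, p, t) for q, p in cell0]) for t in times]

for t, area in zip(times, areas):
    print(t, area)
```

The cell is sheared and rotated, but its area stays at the initial value, in accordance with (1.3.11).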

5 L. D. Landau and E. M. Lifshitz, Course of Theoretical Physics I: Mechanics, Eq. (46.5), Pergamon Press (Oxford, London, Paris 1960)

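1.4 Quantum Statistics

1.4.1 The Density Matrix for Pure and Mixed Ensembles$^6$

The density matrix is of special importance in the formulation of quantum statistics; it can also be denoted by the terms ‘statistical operator’ and ‘density operator’. Let a system be in the state $|\psi\rangle$. The observable $A$ in this state has the mean value or expectation value

$$\langle A\rangle = \langle\psi|A|\psi\rangle\,. \tag{1.4.1}$$

The structure of the mean value makes it convenient to define the density matrix by

$$\rho = |\psi\rangle\langle\psi|\,. \tag{1.4.2}$$

We then have:

$$\langle A\rangle = \operatorname{Tr}(\rho A) \tag{1.4.3a}$$

$$\operatorname{Tr}\rho = 1\,, \quad \rho^2 = \rho\,, \quad \rho^\dagger = \rho\,. \tag{1.4.3b,c,d}$$

Here, the definition of the trace (Tr) is

$$\operatorname{Tr} X = \sum_n\langle n|X|n\rangle\,, \tag{1.4.4}$$

where $\{|n\rangle\}$ is an arbitrary complete orthonormal basis system. Owing to

$$\operatorname{Tr} X = \sum_n\sum_m\langle n|m\rangle\langle m|X|n\rangle = \sum_m\sum_n\langle m|X|n\rangle\langle n|m\rangle = \sum_m\langle m|X|m\rangle\,,$$

the trace is independent of the basis used.

n.b. Proofs of (1.4.3a–c):

$$\operatorname{Tr}\rho A = \sum_n\langle n|\psi\rangle\langle\psi|A|n\rangle = \sum_n\langle\psi|A|n\rangle\langle n|\psi\rangle = \langle\psi|A|\psi\rangle\,,$$

$$\operatorname{Tr}\rho = \operatorname{Tr}\rho\,\mathbb{1} = \langle\psi|\mathbb{1}|\psi\rangle = 1\,, \qquad \rho^2 = |\psi\rangle\langle\psi|\psi\rangle\langle\psi| = |\psi\rangle\langle\psi| = \rho\,.$$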
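The relations (1.4.3a–c) can be verified directly for a two-level system. The sketch below uses plain Python lists as matrices; the state `psi` and the Hermitian matrix `A` are hypothetical examples, and the helper names are introduced only for this illustration.

```python
import math


def outer(psi):
    """Density matrix |psi><psi| for a state given as a list of complex amplitudes."""
    return [[a * b.conjugate() for b in psi] for a in psi]


def matmul(x, y):
    """Product of two square matrices stored as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def trace(x):
    """Sum of the diagonal elements, cf. (1.4.4)."""
    return sum(x[i][i] for i in range(len(x)))


def expectation(psi, a):
    """<psi|A|psi> evaluated directly from the amplitudes, cf. (1.4.1)."""
    n = len(psi)
    return sum(psi[i].conjugate() * a[i][j] * psi[j] for i in range(n) for j in range(n))


# Hypothetical normalized state and Hermitian observable.
psi = [complex(math.sqrt(0.3), 0.0), complex(0.0, math.sqrt(0.7))]
A = [[1, 2 - 1j], [2 + 1j, -1]]

rho = outer(psi)
rho_squared = matmul(rho, rho)

print("Tr rho      =", trace(rho))
print("Tr(rho A)   =", trace(matmul(rho, A)))
print("<psi|A|psi> =", expectation(psi, A))
```

For a pure state the matrix `rho_squared` coincides with `rho` up to rounding, which is the content of $\rho^2 = \rho$ in (1.4.3c).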

If the systems or objects under investigation are all in one and the same state $|\psi\rangle$, we speak of a pure ensemble, or else we say that the systems are in a pure state.

6 See e.g. F. Schwabl, Quantum Mechanics, 3rd edition, Springer, Heidelberg, Berlin, New York 2002 (corrected printing 2005), Chap. 20. In the following, this textbook will be abbreviated as ‘QM I’.

In addition to the statistical character which is inherent to quantum-mechanical systems, a statistical distribution of states can also be present in an ensemble. If an ensemble contains different states, we call it a mixed ensemble, a mixture, or we speak of a mixed state. We assume that the state $|\psi_1\rangle$ occurs with the probability $p_1$, the state $|\psi_i\rangle$ with the probability $p_i$, etc., with

$$\sum_i p_i = 1\,.$$

The mean value or expectation value of $A$ is then

$$\langle A\rangle = \sum_i p_i\,\langle\psi_i|A|\psi_i\rangle\,. \tag{1.4.5}$$

This mean value can also be represented in terms of the density matrix defined by

$$\rho = \sum_i p_i\,|\psi_i\rangle\langle\psi_i|\,. \tag{1.4.6}$$

We find:

$$\langle A\rangle = \operatorname{Tr}\rho A \tag{1.4.7a}$$

$$\operatorname{Tr}\rho = 1 \tag{1.4.7b}$$

$$\rho^2 \neq \rho \ \text{ and } \ \operatorname{Tr}\rho^2 < 1\,, \ \text{in the case that } p_i \neq 0 \text{ for more than one } i \tag{1.4.7c}$$

$$\rho^\dagger = \rho\,. \tag{1.4.7d}$$

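The derivations of these relations and further remarks about the density matrices of mixed ensembles will be given in Sect. 1.5.2.

1.4.2 The Von Neumann Equation

From the Schrödinger equation and its adjoint

$$i\hbar\frac{\partial}{\partial t}|\psi,t\rangle = H|\psi,t\rangle\,, \qquad -i\hbar\frac{\partial}{\partial t}\langle\psi,t| = \langle\psi,t|H\,,$$

it follows that

$$i\hbar\frac{\partial}{\partial t}\rho = i\hbar\sum_i p_i\big(|\dot\psi_i\rangle\langle\psi_i| + |\psi_i\rangle\langle\dot\psi_i|\big) = \sum_i p_i\big(H|\psi_i\rangle\langle\psi_i| - |\psi_i\rangle\langle\psi_i|H\big)\,.$$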


From this, we find the von Neumann equation,

$$\frac{\partial}{\partial t}\rho = -\frac{i}{\hbar}[H,\rho]\,; \tag{1.4.8}$$

it is the quantum-mechanical equivalent of the Liouville equation. It describes the time dependence of the density matrix in the Schrödinger representation. It holds also for a time-dependent $H$. It should not be confused with the equation of motion of Heisenberg operators, which has a positive sign on the right-hand side. The expectation value of an observable $A$ is given by

$$\langle A\rangle_t = \operatorname{Tr}\big(\rho(t)A\big)\,, \tag{1.4.9}$$

where $\rho(t)$ is found by solving the von Neumann equation (1.4.8). The time dependence of the expectation value is referred to by the index $t$. We shall meet up with the von Neumann equation in the next chapter where we set up the equilibrium density matrices, and it is naturally of fundamental importance for all time-dependent processes. We now treat the transformation to the Heisenberg representation. The formal solution of the Schrödinger equation has the form

$$|\psi(t)\rangle = U(t,t_0)\,|\psi(t_0)\rangle\,, \tag{1.4.10}$$

where $U(t,t_0)$ is a unitary operator and $|\psi(t_0)\rangle$ is the initial state at the time $t_0$. From this we find the time dependence of the density matrix:

$$\rho(t) = U(t,t_0)\,\rho(t_0)\,U(t,t_0)^\dagger\,. \tag{1.4.11}$$

(For a time-independent $H$, $U(t,t_0) = e^{-iH(t-t_0)/\hbar}$.) The expectation value of an observable $A$ can be computed both in the Schrödinger representation and in the Heisenberg representation

$$\langle A\rangle_t = \operatorname{Tr}\big(\rho(t)A\big) = \operatorname{Tr}\big(\rho(t_0)\,U(t,t_0)^\dagger A\,U(t,t_0)\big) = \operatorname{Tr}\big(\rho(t_0)A_H(t)\big)\,. \tag{1.4.12}$$

Here, $A_H(t) = U^\dagger(t,t_0)\,A\,U(t,t_0)$ is the operator in the Heisenberg representation. The density matrix $\rho(t_0)$ in the Heisenberg representation is time-independent.
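The equality of the two representations in (1.4.12) can be checked for a two-level system. The sketch below assumes a time-independent Hamiltonian that is diagonal in the chosen basis, so that $U(t,t_0)$ is a diagonal matrix of phase factors; the energies, the initial density matrix `rho0`, the observable `A`, and the unit choice $\hbar = 1$ are all hypothetical.

```python
import cmath

HBAR = 1.0              # units with hbar = 1 (assumption of this sketch)
ENERGIES = (0.5, 2.0)   # hypothetical eigenvalues of a diagonal Hamiltonian


def matmul(x, y):
    """Product of two square matrices stored as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def dagger(x):
    """Hermitian adjoint: transpose and complex conjugate."""
    n = len(x)
    return [[x[j][i].conjugate() for j in range(n)] for i in range(n)]


def trace(x):
    return sum(x[i][i] for i in range(len(x)))


def propagator(t, t0=0.0):
    """U(t, t0) = exp(-i H (t - t0) / hbar) for the diagonal Hamiltonian."""
    return [
        [cmath.exp(-1j * ENERGIES[0] * (t - t0) / HBAR), 0j],
        [0j, cmath.exp(-1j * ENERGIES[1] * (t - t0) / HBAR)],
    ]


# Hypothetical initial density matrix (Hermitian, trace one, positive) and observable.
rho0 = [[0.6 + 0j, 0.2 + 0.1j], [0.2 - 0.1j, 0.4 + 0j]]
A = [[0j, 1 + 0j], [1 + 0j, 0j]]


def schroedinger_value(t):
    """Tr(rho(t) A) with rho(t) = U rho0 U^dagger, cf. (1.4.11)."""
    u = propagator(t)
    rho_t = matmul(matmul(u, rho0), dagger(u))
    return trace(matmul(rho_t, A))


def heisenberg_value(t):
    """Tr(rho0 A_H(t)) with A_H(t) = U^dagger A U, cf. (1.4.12)."""
    u = propagator(t)
    a_h = matmul(matmul(dagger(u), A), u)
    return trace(matmul(rho0, a_h))


for t in (0.0, 0.7, 3.1):
    print(t, schroedinger_value(t), heisenberg_value(t))
```

Both columns agree because the trace is invariant under cyclic permutation of its factors, which is the only property used in (1.4.12).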

∗1.5 Additional Remarks

∗1.5.1 The Binomial and the Poisson Distributions

We now discuss two probability distributions which occur frequently. Let us consider an interval of length $L$ which is divided into two subintervals $[0,a]$ and $[a,L]$. We now distribute $N$ distinguishable objects (‘particles’) in a completely random way over the two subintervals, so that the probability that a particle is found in the first or the second subinterval is given by $a/L$ or $1 - a/L$, respectively. The probability that $n$ particles are in the interval $[0,a]$ is then given by the binomial distribution$^7$

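$$w_n = \Big(\frac{a}{L}\Big)^n\Big(1 - \frac{a}{L}\Big)^{N-n}\binom{N}{n}\,, \tag{1.5.1}$$

where the combinatorial factor $\binom{N}{n}$ gives the number of ways of choosing $n$ objects from a set of $N$. The mean value of $n$ is

$$\langle n\rangle = \sum_{n=0}^{N} n\,w_n = \frac{a}{L}\,N \tag{1.5.2a}$$

and its mean square deviation is

$$(\Delta n)^2 = \frac{a}{L}\Big(1 - \frac{a}{L}\Big)N\,. \tag{1.5.2b}$$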

We now consider the limiting case $L \gg a$. Initially, using $\binom{N}{n} = \frac{N\cdot(N-1)\cdots(N-n+1)}{n!}$, $w_n$ can be written in the form

$$\begin{aligned}
w_n &= \Big(\frac{aN}{L}\Big)^n\Big(1 - \frac{a}{L}\Big)^{N-n}\frac{1}{n!}\cdot 1\cdot\Big(1 - \frac{1}{N}\Big)\cdots\Big(1 - \frac{n-1}{N}\Big)\\
&= \bar n^{\,n}\,\frac{1}{n!}\Big(1 - \frac{\bar n}{N}\Big)^N\,\frac{1\cdot(1 - \frac1N)\cdots(1 - \frac{n-1}{N})}{(1 - \frac{a}{L})^n}\,,
\end{aligned} \tag{1.5.3a}$$

where for the mean value (1.5.2a), we have introduced the abbreviation $\bar n = \frac{aN}{L}$. In the limit $\frac{a}{L} \to 0$, $N \to \infty$ for finite $\bar n$, the third factor in (1.5.3a) becomes $e^{-\bar n}$ and the last factor becomes equal to one, so that for the probability distribution, we find:

$$w_n = \frac{\bar n^{\,n}}{n!}\,e^{-\bar n}\,. \tag{1.5.3b}$$

This is the Poisson distribution, which is shown schematically in Fig. 1.5. The Poisson distribution has the following properties:

$$\sum_n w_n = 1\,, \quad \langle n\rangle = \bar n\,, \quad (\Delta n)^2 = \bar n\,. \tag{1.5.4a,b,c}$$

The first two relations follow immediately from the derivation of the Poisson distribution starting from the binomial distribution. They are obtained in problem 1.5 together with (1.5.4c) directly from (1.5.3b). The relative deviation is therefore

$$\frac{\Delta n}{\bar n} = \frac{1}{\bar n^{1/2}}\,. \tag{1.5.5}$$

Fig. 1.5. The Poisson distribution

7 A particular arrangement with $n$ particles in the interval $a$ and $N - n$ in $L - a$, e.g. the first particle in $a$, the second in $L - a$, the third in $L - a$, etc., has the probability $\big(\frac{a}{L}\big)^n\big(1 - \frac{a}{L}\big)^{N-n}$. From this we obtain $w_n$ through multiplication by the number of combinations, i.e. the binomial coefficient $\binom{N}{n}$.

For numbers $\bar n$ which are not too large, e.g. $\bar n = 100$, $\Delta n = 10$ and $\frac{\Delta n}{\bar n} = \frac{1}{10}$. For macroscopic systems, e.g. $\bar n = 10^{20}$, we have $\Delta n = 10^{10}$ and $\frac{\Delta n}{\bar n} = 10^{-10}$. The relative deviation becomes extremely small. For large $\bar n$, the distribution $w_n$ is highly concentrated around $\bar n$. The probability that no particles at all are within the subsystem, i.e. $w_0 = e^{-10^{20}}$, is vanishingly small. The number of particles in the subsystem $[0,a]$ is not fixed; its relative deviation is, however, very small for macroscopic subsystems. In the figure below (Fig. 1.6a), the binomial distribution for $N = 5$ and $\frac{a}{L} = \frac{3}{10}$ (and thus $\bar n = 1.5$) is shown and compared to the Poisson distribution for $\bar n = 1.5$; in b) the same is shown for $N = 10$, $\frac{a}{L} = \frac{3}{20}$ (i.e. again $\bar n = 1.5$). Even with these small values of $N$, the Poisson distribution already approximates the binomial distribution rather well. With $N = 100$, the curves representing the binomial and the Poisson distributions would overlap completely.
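The comparison described for Fig. 1.6 can be reproduced numerically. The following minimal sketch evaluates (1.5.1) and (1.5.3b) for fixed $\bar n = 1.5$ and reports the largest pointwise difference between the two distributions; the function names are chosen here for illustration only.

```python
import math


def binomial(n, N, x):
    """Binomial distribution (1.5.1) with x = a/L."""
    return math.comb(N, n) * x ** n * (1.0 - x) ** (N - n)


def poisson(n, nbar):
    """Poisson distribution (1.5.3b)."""
    return nbar ** n * math.exp(-nbar) / math.factorial(n)


def max_difference(N, nbar):
    """Largest pointwise difference between the two distributions for a/L = nbar/N."""
    x = nbar / N
    return max(abs(binomial(n, N, x) - poisson(n, nbar)) for n in range(N + 1))


# The cases of Fig. 1.6 (N = 5 and N = 10) and the case N = 100 mentioned in the text.
for N in (5, 10, 100):
    print(N, max_difference(N, 1.5))
```

The difference shrinks steadily as $N$ grows at fixed $\bar n$, i.e. as $a/L \to 0$, which is the limit in which (1.5.3b) was derived.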

Fig. 1.6. Comparison of the Poisson distribution and the binomial distribution


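∗1.5.2 Mixed Ensembles and the Density Matrix of Subsystems

(i) Proofs of (1.4.7a–d)

$$\operatorname{Tr}\rho A = \sum_n\sum_i p_i\,\langle\psi_i|A|n\rangle\langle n|\psi_i\rangle = \sum_i p_i\,\langle\psi_i|A|\psi_i\rangle = \langle A\rangle\,.$$

From this, (1.4.7b) also follows using $A = 1$.

$$\rho^2 = \sum_i\sum_j p_i\,p_j\,|\psi_i\rangle\langle\psi_i|\psi_j\rangle\langle\psi_j| \neq \rho\,.$$

For arbitrary $|\psi\rangle$, the expectation value of $\rho$

$$\langle\psi|\rho|\psi\rangle = \sum_i p_i\,|\langle\psi|\psi_i\rangle|^2 \ge 0$$

is positive definite. Since $\rho$ is Hermitian, the eigenvalues $P_m$ of $\rho$ are positive and real:

$$\rho\,|m\rangle = P_m\,|m\rangle\,, \qquad \rho = \sum_{m=1}^{\infty} P_m\,|m\rangle\langle m|\,, \tag{1.5.6}$$

$$P_m \ge 0\,, \qquad \sum_{m=1}^{\infty} P_m = 1\,, \qquad \langle m|m'\rangle = \delta_{mm'}\,.$$

In this basis, $\rho^2 = \sum_m P_m^2\,|m\rangle\langle m|$ and, clearly, $\operatorname{Tr}\rho^2 = \sum_m P_m^2 < 1$, if more than only one state occurs. One can also derive (1.4.7c) directly from (1.4.6), with the condition that at least two different but not necessarily orthogonal states must occur in (1.4.6):

$$\operatorname{Tr}\rho^2 = \sum_n\sum_{i,j} p_i\,p_j\,\langle\psi_i|\psi_j\rangle\langle\psi_j|n\rangle\langle n|\psi_i\rangle = \sum_{i,j} p_i\,p_j\,|\langle\psi_i|\psi_j\rangle|^2$$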

, ∆x, < X 4 >, and < X− < X 3 >>.

1.15 The log-normal distribution: Let the statistical variables $X$ have the property that $\log X$ obeys a Gaussian distribution with $\langle\log X\rangle = \log x_0$.

(a) Show by transforming the Gaussian distribution that the probability density for $X$ has the form

$$P(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\,\frac{1}{x}\,e^{-\frac{(\log(x/x_0))^2}{2\sigma^2}}\,, \qquad 0 < x < \infty\,.$$

(b) Show that $\langle X\rangle = x_0\,e^{\sigma^2/2}$ and $\langle\log X\rangle = \log x_0$.

(c) Show that the log-normal distribution can be rewritten in the form

$$P(x) = \frac{1}{x_0\sqrt{2\pi\sigma^2}}\,(x/x_0)^{-1-\mu(x)} \quad\text{with}\quad \mu(x) = \frac{1}{2\sigma^2}\log\frac{x}{x_0}\,;$$

it can thus be easily confused with a power law when analyzing data.

2. Equilibrium Ensembles

2.1 Introductory Remarks

As emphasized in the Introduction, a macroscopic system consists of $10^{19}$–$10^{23}$ particles and correspondingly has an energy spectrum with spacings of $\Delta E \sim e^{-N}$. The attempt to find a detailed solution to the microscopic equations of motion of such a system is hopeless; furthermore, the required initial conditions or quantum numbers cannot even be specified. Fortunately, knowledge of the time development of such a microstate is also superfluous, since in each observation of the system (both of macroscopic quantities and of microscopic properties, e.g. the density correlation function, particle diffusion, etc.), one averages over a finite time interval. No system can be strictly isolated from its environment, and as a result it will undergo transitions into many different microstates during the measurement process. Figure 2.1 illustrates schematically how the system moves between various phase-space trajectories. Thus, a many-body system cannot be characterized by a single microstate, but rather by an ensemble of microstates. This statistical ensemble of microstates represents the macrostate which is specified by the macroscopic state variables $E, V, N, \dots$$^1$ (see Fig. 2.1).

Fig. 2.1. A trajectory in phase space (schematic)

1 A different justification of the statistical description is based on the ergodic theorem: nearly every microstate approaches arbitrarily closely to all the states of the corresponding ensemble in the course of time. This led Boltzmann to postulate that the time average for an isolated system is equal to the average over the states in the microcanonical ensemble (see Sect. 10.5.2).


Experience shows that every macroscopic system tends with the passage of time towards an equilibrium state, in which

$$\dot\rho = 0 = -\frac{i}{\hbar}[H,\rho] \tag{2.1.1}$$

must hold. Since, according to Eq. (2.1.1), in equilibrium the density matrix ρ commutes with the Hamiltonian H, it follows that in an equilibrium ensemble ρ can depend only on the conserved quantities. (The system changes its microscopic state continually even in equilibrium, but the distribution of microstates within the ensemble becomes time-independent.) Classically, the right-hand side of (2.1.1) is to be replaced by the Poisson bracket.

2.2 Microcanonical Ensembles

2.2.1 Microcanonical Distribution Functions and Density Matrices

We consider an isolated system with a fixed number of particles, a fixed volume $V$, and an energy lying within the interval $[E, E + \Delta]$ with a small $\Delta$, whose Hamiltonian is $H(q,p)$ (Fig. 2.2). Its total momentum and total angular momentum may be taken to be zero.

Fig. 2.2. Energy shell in phase space

We now wish to ﬁnd the distribution function (density matrix) for this physical situation. It is clear from the outset that only those points in phase space which lie between the two hypersurfaces H(q, p) = E and H(q, p) = E + ∆ can have a ﬁnite statistical weight. The region of phase space between the hypersurfaces H(q, p) = E and H(q, p) = E + ∆ is called the energy shell. It is intuitively plausible that in equilibrium, no particular region of the energy shell should play a special role, i.e. that all points within the energy shell should have the same statistical weight. We can indeed derive this fact by making use of the conclusion following (2.1.1). If regions within the energy shell had diﬀerent statistical weights, then the distribution function (density matrix) would depend on other quantities besides H(q, p), and ρ would not commute with H (classically, the Poisson bracket would not vanish). Since


for a given $E$, $\Delta$, $V$, and $N$, the equilibrium distribution function depends only upon $H(q,p)$, it follows that all states within the energy shell, i.e. all of the points in Γ space with $E \le H(q,p) \le E + \Delta$, are equally probable. An ensemble with these properties is called a microcanonical ensemble. The associated microcanonical distribution function can be postulated to have the form

$$\rho_{MC} = \begin{cases}\dfrac{1}{\Omega(E)\Delta} & E \le H(q,p) \le E + \Delta\\[2mm] 0 & \text{otherwise}\,,\end{cases} \tag{2.2.1}$$

where, as postulated, the normalization constant $\Omega(E)$ depends only on $E$, but not on $q$ and $p$. $\Omega(E)\Delta$ is the volume of the energy shell.$^2$ In the limit $\Delta \to 0$, (2.2.1) becomes

$$\rho_{MC} = \frac{1}{\Omega(E)}\,\delta\big(E - H(q,p)\big)\,. \tag{2.2.1'}$$

The normalization of the probability density determines $\Omega(E)$:

$$\int\frac{dq\,dp}{h^{3N}N!}\,\rho_{MC} = 1\,. \tag{2.2.2}$$

The mean value of a quantity $A$ is given by

$$\langle A\rangle = \int\frac{dq\,dp}{h^{3N}N!}\,\rho_{MC}\,A\,. \tag{2.2.3}$$

The choice of the fundamental integration variables (whether q or q/const) is arbitrary at the present stage of our considerations and was made in (2.2.2) and (2.2.3) by reference to the limit which is found from quantum statistics. If the factor (h3N N !)−1 were not present in the normalization condition (2.2.2) and in the mean value (2.2.3), then ρM C would be replaced by (h3N N !)−1 ρM C . All mean values would remain unchanged in this case; the diﬀerence however would appear in the entropy (Sect. 2.3). The factor 1/N ! results from the indistinguishability of the particles. The necessity of including the factor 1/N ! was discovered by Gibbs even before the development of quantum mechanics. Without this factor, an entropy of mixing of identical gases would erroneously appear (Gibbs’ paradox). That is, the sum of the entropies of two identical ideal gases each consisting of N particles, 2SN , would be smaller than the entropy of one gas consisting of 2N particles. Mixing of ideal gases will be treated in Chap. 3, Sect. 3.6.3.4. We also refer to the calculation of the entropy of mixtures of ideal gases in Chap. 5 and the last paragraph of Appendix B.1. 2

The surface area of the energy shell Ω (E) depends not only on the energy E but also on the spatial volume V and the number of particles N . For our present considerations, only its dependence on E is of interest; therefore, for clarity and brevity, we omit the other variables. We use a similar abbreviated notation for the partition functions which will be introduced in later sections, also. The complete dependences are collected in Table 2.1.


For the $6N$-dimensional volume element in phase space, we will also use the abbreviated notation

$$d\Gamma \equiv \frac{dq\,dp}{h^{3N}N!}\,.$$

From the normalization condition, (2.2.2), and the limiting form given in (2.2.1'), it follows that

$$\Omega(E) = \int\frac{dq\,dp}{h^{3N}N!}\,\delta\big(E - H(q,p)\big)\,. \tag{2.2.4}$$

After introducing coordinates on the energy shell and an integration variable along the normal $k_\perp$, (2.2.4) can also be given in terms of the surface integral:

$$\Omega(E) = \int_{S_E}\frac{dS}{h^{3N}N!}\int dk_\perp\,\delta\big(E - H(S_E) - |\nabla H|\,k_\perp\big) = \int_{S_E}\frac{dS}{h^{3N}N!}\,\frac{1}{|\nabla H(q,p)|}\,. \tag{2.2.4'}$$

Here, $dS$ is the differential element of surface area in the $(6N-1)$-dimensional hypersurface at energy $E$, and $\nabla$ is the $6N$-dimensional gradient in phase space. In Eq. (2.2.4') we have used $H(S_E) = E$ and performed the integration over $k_\perp$. According to Eq. (1.3.4'), it holds that $|\nabla H(q,p)| = |v|$ and the velocity in phase space is perpendicular to the gradient, i.e. $v \perp \nabla H(q,p)$. This implies that the velocity is always tangential to the surface of the energy shell; cf. problem 1.8.

Notes: (i) Alternatively, the expression (2.2.4') can be readily proven by starting with an energy shell of finite width $\Delta$ and dividing it into segments $dS\,\Delta k_\perp$. Here, $dS$ is a surface element and $\Delta k_\perp$ is the perpendicular distance between the two hypersurfaces (Fig. 2.3). Since the gradient yields the variation perpendicular to an equipotential surface, we find $|\nabla H(q,p)|\,\Delta k_\perp = \Delta$, where $\nabla H(q,p)$ is to be computed on the hypersurface $H(q,p) = E$.

Fig. 2.3. Calculation of the volume of the energy shell


From this it follows that

$$\Omega(E)\,\Delta = \int\frac{dS}{h^{3N}N!}\,\Delta k_\perp = \int\frac{dS}{h^{3N}N!\,|\nabla H(q,p)|}\cdot\Delta\,,$$

i.e. we again obtain (2.2.4'). (ii) Equation (2.2.4') has an intuitively very clear significance. $\Omega(E)$ is given by the sum of the surface elements, each divided by the velocity in phase space. Regions with high velocity thus contribute less to $\Omega(E)$. In view of the ergodic hypothesis (see 10.5.2), this result is very plausible. See problem 1.8: $|v| = |\nabla H|$ and $v \perp \nabla H$.

As already mentioned, $\Omega(E)\Delta$ is the volume of the energy shell in classical statistical mechanics. We will occasionally also refer to $\Omega(E)$ as the “phase surface”. We also define the volume inside the energy shell:

$$\bar\Omega(E) = \int\frac{dq\,dp}{h^{3N}N!}\,\Theta\big(E - H(q,p)\big)\,. \tag{2.2.5}$$

Clearly, the following relation holds:

$$\Omega(E) = \frac{d\bar\Omega(E)}{dE}\,. \tag{2.2.6}$$

Quantum mechanically, the definition of the microcanonical ensemble for an isolated system with the Hamiltonian $H$ and associated energy eigenvalues $E_n$ is:

$$\rho_{MC} = \sum_n p(E_n)\,|n\rangle\langle n|\,, \tag{2.2.7}$$

where, analogously to (2.2.1),

$$p(E_n) = \begin{cases}\dfrac{1}{\Omega(E)\Delta} & E \le E_n \le E + \Delta\\[2mm] 0 & \text{otherwise}\,.\end{cases} \tag{2.2.8}$$

In the microcanonical density matrix $\rho_{MC}$, all the energy eigenstates $|n\rangle$ whose energy $E_n$ lies in the interval $[E, E + \Delta]$ contribute with equal weights. The normalization

$$\operatorname{Tr}\rho_{MC} = 1 \tag{2.2.9a}$$

yields

$$\Omega(E) = \frac{1}{\Delta}\,{\sum_n}'\,1\,, \tag{2.2.9b}$$

where the summation is restricted to energy eigenstates within the energy shell. Thus $\Omega(E)\Delta$ is equal to the number of energy eigenstates within the energy shell $[E, E + \Delta]$. For the density matrix of the microcanonical ensemble, an abbreviated notation is also used:

$$\rho_{MC} = \Omega(E)^{-1}\,\delta(H - E) \tag{2.2.7'}$$

and

$$\Omega(E) = \operatorname{Tr}\delta(H - E)\,. \tag{2.2.9b'}$$

Equation (2.2.8) (and its classical analogue (2.2.1)) represent the fundamental hypothesis of equilibrium statistical mechanics. All the equilibrium properties of matter (whether isolated or in contact with its surroundings) can be deduced from them. The microcanonical density matrix describes an isolated system with given values of $E$, $V$, and $N$. The equilibrium density matrices corresponding to other typical physical situations, such as those of the canonical and the grand canonical ensembles, can be derived from it. As we shall see in the following examples, in fact essentially the whole volume within the hypersurface $H(q,p) = E$ lies at its surface. More precisely, comparison of $\bar\Omega(E)$ and $\Omega(E)\Delta$ shows that

$$\log\big(\Omega(E)\Delta\big) = \log\bar\Omega(E) + O\Big(\log\frac{E}{N\Delta}\Big)\,.$$

Since $\log(\Omega(E)\Delta)$ and $\log\bar\Omega(E)$ are both proportional to $N$, the remaining terms can be neglected for large $N$; in this spirit, we can write

$$\Omega(E)\Delta = \bar\Omega(E)\,.$$
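The statement that the correction term is negligible for large $N$ can be checked for the classical ideal gas, using the result (2.2.14) for $\bar\Omega(E)$ derived in Sect. 2.2.2 together with (2.2.6), which gives $\Omega(E) = \frac{3N}{2E}\bar\Omega(E)$. The following is a minimal sketch in hypothetical dimensionless units ($h = m = 1$, unit volume and unit energy per particle, shell width $\Delta = 10^{-3}$); these numerical values are assumptions made only for illustration.

```python
import math


def log_omega_bar(N, E, V, m=1.0, h=1.0):
    """Logarithm of (2.2.14): V^N (2 pi m E)^(3N/2) / (h^(3N) N! (3N/2)!)."""
    return (
        N * math.log(V)
        + 1.5 * N * math.log(2.0 * math.pi * m * E)
        - 3.0 * N * math.log(h)
        - math.lgamma(N + 1.0)        # log N!
        - math.lgamma(1.5 * N + 1.0)  # log (3N/2)!
    )


def log_omega_delta(N, E, V, delta, m=1.0, h=1.0):
    """Logarithm of Omega(E) * Delta, with Omega = d(Omega_bar)/dE = (3N / 2E) Omega_bar."""
    return log_omega_bar(N, E, V, m, h) + math.log(1.5 * N * delta / E)


def relative_difference(N, energy_per_particle=1.0, volume_per_particle=1.0, delta=1e-3):
    """|log(Omega Delta) - log(Omega_bar)| / |log(Omega_bar)| at fixed E/N and V/N."""
    E = N * energy_per_particle
    V = N * volume_per_particle
    lb = log_omega_bar(N, E, V)
    ld = log_omega_delta(N, E, V, delta)
    return abs(ld - lb) / abs(lb)


for N in (10, 1000, 10 ** 6):
    print(N, relative_difference(N))
```

At fixed $E/N$ and $V/N$ the difference of the two logarithms does not grow with $N$, while $\log\bar\Omega(E)$ grows proportionally to $N$, so the relative difference falls off as $1/N$.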

2.2.2 The Classical Ideal Gas

In this and in the next section, we present three simple examples for which $\Omega(E)$ can be calculated, and from which we can directly read off the characteristic dependences on the energy and the particle number. We shall now investigate the classical ideal gas, i.e. a classical system of $N$ atoms between which there are no interactions at all; and we shall see from it how $\Omega(E)$ depends on the energy $E$ and on the particle number $N$. Furthermore, we will make use of the results of this section later to derive the thermodynamics of the ideal gas. The Hamiltonian of the three-dimensional ideal gas is

$$H = \sum_{i=1}^{N}\frac{\mathbf{p}_i^2}{2m} + V_{\rm wall}\,. \tag{2.2.10}$$

Here, the $\mathbf{p}_i$ are the cartesian momenta of the particles and $V_{\rm wall}$ is the potential representing the wall of the container. The surface area of the energy shell is in this case

$$\Omega(E) = \frac{1}{h^{3N}N!}\int_V d^3x_1\dots\int_V d^3x_N\int d^3p_1\dots\int d^3p_N\,\delta\Big(E - \sum_{i=1}^{N}\frac{\mathbf{p}_i^2}{2m}\Big)\,, \tag{2.2.11}$$

2.2 Microcanonical Ensembles

where the integrations over x are restricted to the spatial volume V defined by the walls. It would be straightforward to calculate Ω(E) directly. We shall carry out this calculation here via $\bar{\Omega}(E)$, the volume inside the energy shell, which in this case is a hypersphere, in order to have both quantities available:

$$\bar{\Omega}(E) = \frac{1}{h^{3N} N!} \int_V d^3x_1 \ldots \int_V d^3x_N \int d^3p_1 \ldots \int d^3p_N \; \Theta\!\left(E - \sum_{i} \frac{\mathbf{p}_i^2}{2m}\right) . \tag{2.2.12}$$

Introducing the surface area of the d-dimensional unit sphere,³

$$(2\pi)^d K_d \equiv \int d\Omega_d = \frac{2\pi^{d/2}}{\Gamma(d/2)} , \tag{2.2.13}$$

we find, representing the momenta in spherical polar coordinates,

$$\bar{\Omega}(E) = \frac{V^N}{h^{3N} N!} \int d\Omega_{3N} \int_0^{\sqrt{2mE}} dp\, p^{3N-1} .$$

From this, we immediately obtain

$$\bar{\Omega}(E) = \frac{V^N (2\pi m E)^{3N/2}}{h^{3N}\, N!\, \left(\frac{3N}{2}\right)!} , \tag{2.2.14}$$

where $\Gamma\!\left(\frac{3N}{2}\right) = \left(\frac{3N}{2} - 1\right)!$ was used, under the assumption, without loss of generality, of an even number of particles. For large N, Eq. (2.2.14) can be simplified by applying the Stirling formula (see problem 1.1),

$$N! \sim N^N e^{-N} (2\pi N)^{1/2} , \tag{2.2.15}$$

whereby it suffices to retain only the first two factors, which dominate the expression. Then

$$\bar{\Omega}(E) \approx \left(\frac{V}{N}\right)^{N} \left(\frac{4\pi m E}{3 h^2 N}\right)^{3N/2} e^{5N/2} . \tag{2.2.16}$$

Making use of Eq. (2.2.6), we obtain from (2.2.14) and (2.2.16) the exact result for Ω(E):

$$\Omega(E) = \frac{V^N\, 2\pi m\, (2\pi m E)^{\frac{3N}{2} - 1}}{h^{3N}\, N!\, \left(\frac{3N}{2} - 1\right)!} \tag{2.2.17}$$

as well as an asymptotic expression which is valid in the limit of large N:

$$\Omega(E) \approx \left(\frac{V}{N}\right)^{N} \left(\frac{4\pi m E}{3 h^2 N}\right)^{3N/2} e^{5N/2}\, \frac{1}{E}\, \frac{3N}{2} . \tag{2.2.18}$$

³ The derivation of (2.2.13) will be given at the end of this section.

In (2.2.16) and (2.2.18), the specific volume V/N and the specific energy E/N occur to the power N. We now compare $\bar{\Omega}(E)$, the volume inside the energy shell, with $\Omega(E)\Delta$, the volume of a spherical shell of thickness Δ, by considering the logarithms of these two quantities (due to the occurrence of the Nth powers):

$$\log \Omega(E)\Delta = \log \bar{\Omega}(E) + O\!\left(\log\frac{E}{N\Delta}\right) . \tag{2.2.19}$$

Since $\log \Omega(E)\Delta$ and $\log \bar{\Omega}(E)$ are both proportional to N, the remaining terms can be neglected in the case that N is large. In this approximation, we find

$$\Omega(E)\Delta \approx \bar{\Omega}(E) , \tag{2.2.20}$$
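The size of the neglected term in (2.2.19) can be checked numerically. The following is a minimal sketch, assuming units with V = m = h = 1 and an arbitrarily chosen shell thickness Δ; the function names `log_volume` and `log_shell` are illustrative only. It uses the exact result (2.2.14) and the relation Ω(E) = dΩ̄/dE = (3N/2E) Ω̄(E) that follows from it.

```python
import math

def log_volume(N, E, V=1.0, m=1.0, h=1.0):
    """log of Omega-bar(E) from Eq. (2.2.14); lgamma(n + 1) = log(n!)."""
    return (N * math.log(V)
            + 1.5 * N * math.log(2.0 * math.pi * m * E)
            - 3.0 * N * math.log(h)
            - math.lgamma(N + 1.0)
            - math.lgamma(1.5 * N + 1.0))

def log_shell(N, E, delta, V=1.0, m=1.0, h=1.0):
    """log of Omega(E)*Delta, using Omega(E) = (3N / 2E) * Omega-bar(E)."""
    return log_volume(N, E, V, m, h) + math.log(1.5 * N * delta / E)

N = 10**6
E = 1.5 * N           # energy per particle of order one (assumed units)
delta = 1e-3 * E      # shell thickness, chosen arbitrarily for illustration
diff = log_shell(N, E, delta) - log_volume(N, E)
print("log(Omega*Delta) - log(Omega-bar):", diff)
print("difference per particle:          ", diff / N)
```

The difference between the two logarithms is of order log N, whereas each logarithm separately is of order N, which is the content of (2.2.20).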

i.e. nearly the whole volume of the hypersphere H(q, p) ≤ E lies at its surface. This fact is due to the high dimensionality of the phase space, and it is to be expected that (2.2.20) remains valid even for systems with interactions.

We now prove the expression (2.2.13) for the surface area of the d-dimensional unit sphere. To this end, we compute the d-dimensional Gaussian integral

$$I = \int_{-\infty}^{\infty} dp_1 \ldots \int_{-\infty}^{\infty} dp_d \; e^{-(p_1^2 + \cdots + p_d^2)} = (\sqrt{\pi})^d . \tag{2.2.21}$$

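This integral can also be written in spherical polar coordinates:⁴

$$I = \int_0^\infty dp\, p^{d-1} \int d\Omega_d\; e^{-p^2} = \frac{1}{2} \int_0^\infty dt\, t^{\frac{d}{2} - 1} e^{-t} \int d\Omega_d = \frac{1}{2}\, \Gamma\!\left(\frac{d}{2}\right) \int d\Omega_d , \tag{2.2.22}$$

where

$$\Gamma(z) = \int_0^\infty dt\, t^{z-1} e^{-t} \tag{2.2.23}$$

is the gamma function. Comparison of the two expressions (2.2.21) and (2.2.22) yields

$$\int d\Omega_d = \frac{2\pi^{d/2}}{\Gamma(d/2)} . \tag{2.2.13'}$$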
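As a quick plausibility check of (2.2.13), the formula can be evaluated for the familiar cases d = 2 (circumference of the unit circle) and d = 3 (surface of the unit sphere). This is a minimal sketch; the function name `unit_sphere_area` is chosen here for illustration.

```python
import math

def unit_sphere_area(d):
    """Surface area of the unit sphere in d dimensions, Eq. (2.2.13)."""
    return 2.0 * math.pi ** (d / 2.0) / math.gamma(d / 2.0)

for d in (2, 3):
    print(d, unit_sphere_area(d))

# Reference values for comparison: 2*pi for d = 2 and 4*pi for d = 3.
print(2.0 * math.pi, 4.0 * math.pi)
```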

In order to gain further insights into how the volume of the energy shell depends upon the parameters of the microcanonical ensemble, we will calculate Ω(E) for two other simple examples, this time quantum-mechanical systems; these are: (i) harmonic oscillators which are not coupled, and (ii) paramagnetic (not coupled) spins. Simple problems of this type can be solved for all ensembles with a variety of methods. Instead of the usual combinatorial method, we employ purely analytical techniques for the two examples which follow.

⁴ We denote an element of surface area on the d-dimensional unit sphere by $d\Omega_d$. For the calculation of the surface integral $\int d\Omega_d$, it is not necessary to use the detailed expression for $d\Omega_d$. The latter may be found in E. Madelung, Die Mathematischen Hilfsmittel des Physikers, Springer, Berlin, 7th edition (1964), p. 244.

∗2.2.3 Quantum-mechanical Harmonic Oscillators and Spin Systems

∗2.2.3.1 Quantum-mechanical Harmonic Oscillators

We consider a system of N identical harmonic oscillators, which are either not coupled to each other at all, or else are so weakly coupled that their interactions may be neglected. Then the Hamiltonian for the system is given by:

$$H = \sum_{j=1}^{N} \hbar\omega \left(a_j^\dagger a_j + \frac{1}{2}\right) , \tag{2.2.24}$$

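where $a_j^\dagger$ ($a_j$) are creation (annihilation) operators for the jth oscillator. Thus we have

$$\Omega(E) = \sum_{n_1=0}^{\infty} \cdots \sum_{n_N=0}^{\infty} \delta\Bigl(E - \hbar\omega \sum_j \bigl(n_j + \tfrac{1}{2}\bigr)\Bigr) = \sum_{n_1=0}^{\infty} \cdots \sum_{n_N=0}^{\infty} \int \frac{dk}{2\pi}\, e^{ik\left(E - \sum_j \hbar\omega (n_j + \frac{1}{2})\right)} = \int \frac{dk}{2\pi}\, e^{ikE} \prod_{i=1}^{N} \frac{e^{-ik\hbar\omega/2}}{1 - e^{-ik\hbar\omega}} , \tag{2.2.25}$$

and finally

$$\Omega(E) = \int \frac{dk}{2\pi}\, e^{N\left(ik(E/N) - \log(2i \sin(k\hbar\omega/2))\right)} . \tag{2.2.26}$$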

The computation of this integral can be carried out for large N using the saddle-point method.⁵ The function

$$f(k) = ike - \log\bigl(2i \sin(k\hbar\omega/2)\bigr) \tag{2.2.27}$$

with e = E/N has a maximum at the point

$$k_0 = \frac{1}{\hbar\omega i} \log \frac{e + \frac{\hbar\omega}{2}}{e - \frac{\hbar\omega}{2}} . \tag{2.2.28}$$

This maximum can be determined by setting the first derivative of (2.2.27) equal to zero:

$$f'(k_0) = ie - \frac{\hbar\omega}{2} \cot\frac{k_0 \hbar\omega}{2} = 0 .$$

⁵ N.G. de Bruijn, Asymptotic Methods in Analysis (North Holland, 1970); P. M. Morse and H. Feshbach, Methods of Theoretical Physics, p. 434 (McGraw Hill, New York, 1953).

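Therefore, with

$$f(k_0) = ik_0 e - \log\!\left(\frac{2i}{\sqrt{1 - (2e/\hbar\omega)^2}}\right) = \frac{e}{\hbar\omega} \log\frac{e + \frac{\hbar\omega}{2}}{e - \frac{\hbar\omega}{2}} + \frac{1}{2} \log\frac{\left(e + \frac{\hbar\omega}{2}\right)\left(e - \frac{\hbar\omega}{2}\right)}{(\hbar\omega)^2} \tag{2.2.29}$$

and $f''(k_0) = \left(\frac{\hbar\omega}{2}\right)^2 \big/ \sin^2(k_0 \hbar\omega/2)$, we find for Ω(E):

$$\Omega(E) = \frac{1}{2\pi}\, e^{N f(k_0)} \int dk\; e^{N \frac{1}{2} f''(k_0)(k - k_0)^2} . \tag{2.2.30}$$

The Gaussian integral in this expression yields only a factor proportional to $N^{-1/2}$, i.e. a logarithmic correction in the exponent; thus, the number of states is given by

$$\Omega(E) = \exp\left\{ N \left[ \frac{e + \frac{1}{2}\hbar\omega}{\hbar\omega} \log\frac{e + \frac{1}{2}\hbar\omega}{\hbar\omega} - \frac{e - \frac{1}{2}\hbar\omega}{\hbar\omega} \log\frac{e - \frac{1}{2}\hbar\omega}{\hbar\omega} \right] \right\} . \tag{2.2.31}$$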
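The saddle-point result (2.2.31) can be compared with the usual combinatorial count mentioned above. The sketch below assumes that the total energy corresponds to M quanta distributed over the N oscillators, so that e/ħω = M/N + 1/2 and the exact number of states is the binomial coefficient (M+N−1 choose M); the function names are illustrative.

```python
import math

def log_oscillator_states_exact(N, M):
    """log of the number of ways to distribute M quanta over N oscillators,
    i.e. log of the binomial coefficient C(M + N - 1, M)."""
    return math.lgamma(M + N) - math.lgamma(M + 1) - math.lgamma(N)

def log_oscillator_states_saddle(N, M):
    """Exponent of Eq. (2.2.31), with x = e / (hbar*omega) = M/N + 1/2 (requires M > 0)."""
    x = M / N + 0.5
    return N * ((x + 0.5) * math.log(x + 0.5) - (x - 0.5) * math.log(x - 0.5))

N, M = 10**5, 2 * 10**5
exact = log_oscillator_states_exact(N, M)
saddle = log_oscillator_states_saddle(N, M)
print(exact, saddle, (saddle - exact) / saddle)
```

The two logarithms differ only by terms of order log N, which are negligible compared with the leading term proportional to N.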

∗2.2.3.2 Two-level Systems: the Spin-1/2 Paramagnet

As our third example, we consider a system of N particles, each of which can occupy one of two states. The most important physical realization of such a system is a paramagnet in a magnetic field H ($h = -\mu_B H$), which has the Hamiltonian⁶

$$\mathcal{H} = -h \sum_{i=1}^{N} \sigma_i , \quad \text{with} \quad \sigma_i = \pm 1 . \tag{2.2.32}$$

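The number of states of energy E is, from (2.2.1), given by

$$\Omega(E) = \sum_{\{\sigma_i = \pm 1\}} \delta\Bigl(E + h \sum_{i=1}^{N} \sigma_i\Bigr) = \int \frac{dk}{2\pi} \sum_{\{\sigma_i = \pm 1\}} e^{ik\left(E + h \sum_i \sigma_i\right)} = \int \frac{dk}{2\pi}\, e^{ikE} (2 \cos kh)^N = 2^N \int \frac{dk}{2\pi}\, e^{f(k)} \tag{2.2.33}$$

with

$$f(k) = ikE + N \log \cos kh . \tag{2.2.34}$$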

The computation of the integral can again be accomplished by applying the saddle-point method. Using $f'(k) = iE - Nh \tan kh$ and $f''(k) = -Nh^2 / \cos^2 kh$, we obtain from the condition $f'(k_0) = 0$

$$k_0 h = \arctan\frac{iE}{Nh} = \frac{i}{2} \log\frac{1 + E/Nh}{1 - E/Nh} .$$

For the second derivative, we find

$$f''(k_0) = -\bigl(1 - (E/Nh)^2\bigr) N h^2 \le 0 \quad \text{for} \quad -Nh \le E \le Nh .$$

Thus, using the abbreviation e = E/Nh, we have

$$\Omega(E) = 2^N \exp\left(-\frac{Ne}{2} \log\frac{1+e}{1-e} + N \log\frac{1}{\sqrt{1 - e^2}}\right) \int \frac{dk}{2\pi}\, e^{-\frac{1}{2}\left(-f''(k_0)\right)(k - k_0)^2}$$

$$= \frac{2^N}{\sqrt{2\pi}} \exp\left(-\frac{Ne}{2} \log\frac{1+e}{1-e} + \frac{N}{2} \log\frac{1}{1 - e^2} - \frac{1}{2} \log\bigl((1 - e^2) N h^2\bigr)\right)$$

$$= \frac{1}{\sqrt{2\pi}} \exp\left\{-\frac{N}{2}(1+e) \log\frac{1+e}{2} - \frac{N}{2}(1-e) \log\frac{1-e}{2} - \frac{1}{2} \log(1 - e^2) - \frac{1}{2} \log N h^2\right\} ,$$

$$\Omega(E) = \exp\left\{-\frac{N}{2} \left[(1+e) \log\frac{1+e}{2} + (1-e) \log\frac{1-e}{2}\right] + O(1, \log N)\right\} . \tag{2.2.35}$$

We have now calculated the number of states Ω(E) for three examples. The physical consequences of the characteristic energy dependences will be discussed after we have introduced additional concepts such as those of entropy and temperature.

⁶ In the literature of magnetism, it is usual to denote the magnetic field by H or **H**. To distinguish it from the Hamiltonian in the case of magnetic phenomena, we use the symbol $\mathcal{H}$ for the latter.
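The leading term of (2.2.35) can likewise be checked against a direct count. The sketch below assumes that `n_up` of the N spins have σ = +1, so that E = −h(2 n_up − N), e = 1 − 2 n_up/N, and the exact number of states is the binomial coefficient (N choose n_up); the function names are illustrative.

```python
import math

def log_spin_states_exact(N, n_up):
    """log of the binomial coefficient C(N, n_up)."""
    return math.lgamma(N + 1) - math.lgamma(n_up + 1) - math.lgamma(N - n_up + 1)

def log_spin_states_asymptotic(N, e):
    """Leading term of the exponent in Eq. (2.2.35); requires -1 < e < 1."""
    return -0.5 * N * ((1 + e) * math.log((1 + e) / 2) + (1 - e) * math.log((1 - e) / 2))

N, n_up = 10**5, 75_000
e = 1.0 - 2.0 * n_up / N          # here e = -0.5
exact = log_spin_states_exact(N, n_up)
approx = log_spin_states_asymptotic(N, e)
print(exact, approx, (approx - exact) / approx)
```

As in the oscillator case, the discrepancy is of order log N and corresponds to the terms denoted by O(1, log N) in (2.2.35).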

2.3 Entropy

2.3.1 General Definition

Let an arbitrary density matrix ρ be given; then the entropy S is defined by

$$S = -k\, \mathrm{Tr}(\rho \log \rho) \equiv -k \langle \log \rho \rangle . \tag{2.3.1}$$

Here, we give the formulas only in their quantum-mechanical form, as we shall often do in this book. For classical statistics, the trace operation Tr is to be read as an integration over phase space. The physical meaning of S will become clear in the following sections. At this point, we can consider the entropy to be a measure of the size of the accessible part of phase space, and thus also of the uncertainty of the microscopic state of the system: the more states that occur in the density matrix, the greater the entropy S. For example, for M states which occur with equal probabilities 1/M, the entropy is given by

$$S = -k \sum_{1}^{M} \frac{1}{M} \log\frac{1}{M} = k \log M .$$

For a pure state, M = 1 and the entropy is therefore S = 0. In the diagonal representation of ρ (Eq. 1.4.8), one can immediately see that the entropy is positive semidefinite:

$$S = -k \sum_n P_n \log P_n \ge 0 \tag{2.3.2}$$

since x log x ≤ 0 in the interval 0 < x ≤ 1 (see Fig. 2.4). The factor k in (2.3.1) is at this stage completely arbitrary. Only later, by identifying the temperature scale with the absolute temperature, do we find that it is given by the Boltzmann constant $k = 1.38 \times 10^{-16}$ erg/K $= 1.38 \times 10^{-23}$ J/K; see Sect. 3.4. The value of the Boltzmann constant was determined by Planck in 1900. The entropy is also a measure of the disorder and of the lack of information content in the density matrix. The more states contained in the density matrix, the smaller the weight of each individual state, and the less information about the system one has. Lower entropy means a higher information content. If for example a volume V is available, but the particles remain within a subvolume, then the entropy is smaller than if they occupied the whole of V. Correspondingly, the information content ($\propto \mathrm{Tr}\, \rho \log \rho$) of the density matrix is greater, since one knows that the particles are not anywhere within V, but rather only in the subvolume.
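The two limiting cases just mentioned, M equally probable states and a pure state, can be evaluated directly from (2.3.2). This is a minimal sketch in units with k = 1; the function name `entropy` and the choice M = 8 are illustrative.

```python
import math

def entropy(probabilities, k=1.0):
    """S = -k * sum_n P_n log P_n, Eq. (2.3.2); terms with P_n = 0 contribute nothing."""
    return -k * sum(p * math.log(p) for p in probabilities if p > 0)

M = 8
uniform = [1.0 / M] * M            # M states with equal probabilities 1/M
pure = [1.0] + [0.0] * (M - 1)     # a single state with probability 1

print(entropy(uniform), math.log(M))   # both values should agree: S = k log M
print(entropy(pure))                   # vanishes for a pure state
```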

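2.3.2 An Extremal Property of the Entropy

Let two density matrices, ρ and ρ1, be given. The important inequality

$$\mathrm{Tr}\, \rho (\log \rho_1 - \log \rho) \le 0 \tag{2.3.3}$$

then holds. To prove (2.3.3), we use the diagonal representations $\rho = \sum_n P_n |n\rangle\langle n|$ and $\rho_1 = \sum_\nu P_{1\nu} |\nu\rangle\langle\nu|$:

$$\mathrm{Tr}\, \rho (\log \rho_1 - \log \rho) = \sum_n P_n \langle n| (\log \rho_1 - \log P_n) |n\rangle = \sum_n P_n \langle n| \log\frac{\rho_1}{P_n} |n\rangle = \sum_n \sum_\nu P_n \langle n|\nu\rangle \langle\nu| \log\frac{P_{1\nu}}{P_n} |\nu\rangle \langle\nu|n\rangle$$

$$\le \sum_n \sum_\nu P_n \langle n|\nu\rangle \langle\nu| \left(\frac{P_{1\nu}}{P_n} - 1\right) |\nu\rangle \langle\nu|n\rangle = \sum_n P_n \langle n| \left(\frac{\rho_1}{P_n} - 1\right) |n\rangle = \mathrm{Tr}\, \rho_1 - \mathrm{Tr}\, \rho = 0 .$$

In an intermediate step, we used the basis |ν⟩ of ρ1 as well as the inequality log x ≤ x − 1. This inequality is clear from Fig. 2.4. Formally, it follows from properties of the function f(x) = log x − x + 1:

$$f(1) = 0, \qquad f'(1) = 0, \qquad f''(x) = -\frac{1}{x^2} < 0 \quad \text{(i.e. } f(x) \text{ is concave)} .$$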

Fig. 2.4. Illustrating the inequality log x ≤ x−1
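The inequality (2.3.3) can be tested numerically in the special case in which ρ and ρ1 commute, so that both are diagonal in the same basis and the trace reduces to a sum over probabilities. The following sketch draws random probability distributions; the helper names and the fixed random seed are assumptions made for illustration, and the check is not a substitute for the proof above.

```python
import math
import random

def gibbs_expression(p, p1):
    """sum_n P_n (log P1_n - log P_n): left-hand side of (2.3.3) for commuting rho, rho1."""
    return sum(a * (math.log(b) - math.log(a)) for a, b in zip(p, p1) if a > 0)

def random_distribution(size, rng):
    """Random normalized probabilities with strictly positive entries."""
    weights = [rng.random() + 1e-12 for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]

rng = random.Random(0)
values = [gibbs_expression(random_distribution(5, rng), random_distribution(5, rng))
          for _ in range(1000)]
print("largest value found:", max(values))   # never positive, up to rounding
```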

2.3.3 Entropy of the Microcanonical Ensemble

For the entropy of the microcanonical ensemble, we obtain by referring to (2.3.1) and (2.2.7)

$$S_{MC} = -k\, \mathrm{Tr}\, \rho_{MC} \log \rho_{MC} = -k\, \mathrm{Tr}\, \rho_{MC} \log\frac{1}{\Omega(E)\Delta} ,$$

and, since the density matrix is normalized to 1, Eq. (2.2.9a), the final result:

$$S_{MC} = k \log \Omega(E)\Delta . \tag{2.3.4}$$

The entropy is thus proportional to the logarithm of the accessible phase space volume, or, quantum mechanically, to the logarithm of the number of accessible states. We shall now demonstrate an interesting extremal property of the entropy. Of all the ensembles whose energy lies in the interval [E, E + Δ], the entropy of the microcanonical ensemble is greatest. To prove this statement, we set ρ1 = ρ_MC in (2.3.3) and use the fact that ρ, like ρ_MC, differs from zero only on the energy shell:

$$S[\rho] \le -k\, \mathrm{Tr}\, \rho \log \rho_{MC} = -k\, \mathrm{Tr}\, \rho \log\frac{1}{\Omega(E)\Delta} = S_{MC} . \tag{2.3.5}$$

Thus, we have demonstrated that the entropy is maximal for the microcanonical ensemble. We note also that for large N, the following representations of the entropy are all equivalent:

$$S_{MC} = k \log \Omega(E)\Delta = k \log \Omega(E)E = k \log \bar{\Omega}(E) . \tag{2.3.6}$$

This follows from the neglect of logarithmic terms in (2.2.19) and an analogous relation for Ω(E)E. We can now estimate the density of states. The spacing ΔE of the energy levels is given by

$$\Delta E = \frac{\Delta}{\Omega(E)\Delta} = \Delta \cdot e^{-S_{MC}/k} \sim \Delta \cdot e^{-N} . \tag{2.3.7}$$

The levels indeed lie enormously close together, i.e. at a high density, as already presumed in the Introduction. For this estimate, we used S = k log Ω(E)Δ ∝ N; this can be seen from the classical result (2.2.18) as well as from (2.2.31) and (2.2.35).

2.4 Temperature and Pressure

The results for the microcanonical ensemble obtained thus far permit us to calculate the mean values of arbitrary operators. These mean values depend on the natural parameters of the microcanonical ensemble, E, V, and N. The temperature and pressure have so far not made an appearance. In this section, we want to define these quantities in terms of the energy and volume derivatives of the entropy.

2.4.1 Systems in Contact: the Energy Distribution Function, Definition of the Temperature

We now consider the following physical situation: let a system be divided into two subsystems, which interact with each other, i.e. exchange of energy between the two subsystems is possible. The overall system is isolated. The division into two subsystems 1 and 2 is not necessarily spatial. Let the Hamiltonian of the system be H = H1 + H2 + W. Let further the interaction W be small in comparison to H1 and H2. For example, in the case of a spatial separation, the surface energy can be supposed to be small compared to the volume energy. The interaction is of fundamental importance, in that it allows the two subsystems to exchange energy. Let the overall system have the energy E, so that it is described by a microcanonical density matrix:

$$\rho_{MC} = \Omega_{1,2}(E)^{-1}\, \delta(H_1 + H_2 + W - E) \approx \Omega_{1,2}(E)^{-1}\, \delta(H_1 + H_2 - E) . \tag{2.4.1}$$

Here, W was neglected relative to H1 and H2, and $\Omega_{1,2}(E)$ is the phase-space surface of the overall system with a dividing wall (see remarks at the end of this section).

Fig. 2.5. An isolated system divided into subsystems 1 and 2 separated by a ﬁxed diathermal wall (which permits the exchange of thermal energy)

ω(E1) denotes the probability density for subsystem 1 to have the energy E1. According to Eq. (1.2.10), ω(E1) is given by

$$\omega(E_1) = \langle \delta(H_1 - E_1) \rangle = \int d\Gamma_1\, d\Gamma_2\; \Omega_{1,2}(E)^{-1}\, \delta(H_1 + H_2 - E)\, \delta(H_1 - E_1) = \frac{\Omega_2(E - E_1)\, \Omega_1(E_1)}{\Omega_{1,2}(E)} . \tag{2.4.2a}$$

Here, (2.4.1) was used and we have introduced the phase-space surfaces of subsystem 1, $\Omega_1(E_1) = \int d\Gamma_1\, \delta(H_1 - E_1)$, and subsystem 2, $\Omega_2(E - E_1) = \int d\Gamma_2\, \delta(H_2 - E + E_1)$. The most probable value of E1, denoted as $\tilde{E}_1$, can be found from $d\omega(E_1)/dE_1 = 0$:

$$\Bigl[-\Omega_2'(E - E_1)\, \Omega_1(E_1) + \Omega_2(E - E_1)\, \Omega_1'(E_1)\Bigr]_{\tilde{E}_1} = 0 .$$

Using formula (2.3.4) for the microcanonical entropy, we obtain

$$\left.\frac{\partial}{\partial E_2} S_2(E_2)\right|_{E - \tilde{E}_1} = \left.\frac{\partial}{\partial E_1} S_1(E_1)\right|_{\tilde{E}_1} . \tag{2.4.3}$$

We now introduce the following definition of the temperature:

$$T^{-1} = \frac{\partial}{\partial E} S(E) . \tag{2.4.4}$$

Then it follows from (2.4.3) that

$$T_1 = T_2 . \tag{2.4.5}$$

In the most probable configuration, the temperatures of the two subsystems are equal. We are already using partial derivatives here, since later, several variables will occur. For the ideal gas, we can see immediately that the temperature increases proportionally to the energy per particle, T ∝ E/N. This property, as well as (2.4.5), the equality of the temperatures of two systems which are in contact and in equilibrium, correspond to the usual concept of temperature.

Remarks: The Hamiltonian has a lower bound and possesses a finite smallest eigenvalue E0. In general, the Hamiltonian does not have an upper bound, and the density of the energy eigenvalues increases with increasing energy. As a result, the temperature cannot in general be negative (T ≥ 0), and it increases with increasing energy. For spin systems there is also an upper limit to the energy. The density of states then again decreases as the upper limit is approached, so that in this energy range, Ω′/Ω < 0 holds. Thus in such systems there can be states with a negative absolute temperature (see Sect. 6.7.2). Due to the various possibilities for representing the entropy as given in (2.3.6), the temperature can also be written as $T = \left(k \frac{d}{dE} \log \bar{\Omega}(E)\right)^{-1}$.

Notes concerning $\Omega_{1,2}(E)$ in Eq. (2.4.1); may be skipped over in a first reading:

(i) In (2.4.1) and (2.4.2a), it must be taken into account that subsystems 1 and 2 are separated from each other. The normalization factor $\Omega_{1,2}(E)$ which occurs in (2.4.1) and (2.4.2a) is not given by

$$\int d\Gamma\, \delta(H - E) \equiv \int \frac{dq\, dp}{h^{3N} N!}\, \delta(H - E) \equiv \Omega(E) ,$$

but instead by

$$\Omega_{1,2}(E) = \int d\Gamma_1\, d\Gamma_2\, \delta(H - E) \equiv \int \frac{dq_1\, dp_1}{N_1!\, h^{3N_1}} \frac{dq_2\, dp_2}{N_2!\, h^{3N_2}}\, \delta(H - E)$$

$$= \int dE_1 \int d\Gamma_1\, d\Gamma_2\, \delta(H - E)\, \delta(H_1 - E_1) = \int dE_1 \int d\Gamma_1\, d\Gamma_2\, \delta(H_2 - E + E_1)\, \delta(H_1 - E_1) = \int dE_1\, \Omega_1(E_1)\, \Omega_2(E - E_1) . \tag{2.4.2b}$$

(ii) Quantum mechanically, one obtains the same result for (2.4.2a):

$$\omega(E_1) = \langle \delta(H_1 - E_1) \rangle \equiv \mathrm{Tr}\left(\frac{1}{\Omega_{1,2}(E)}\, \delta(H_1 + H_2 - E)\, \delta(H_1 - E_1)\right) = \mathrm{Tr}_1 \mathrm{Tr}_2 \left(\frac{1}{\Omega_{1,2}(E)}\, \delta\bigl(H_2 - (E - E_1)\bigr)\, \delta(H_1 - E_1)\right) = \frac{\Omega_1(E_1)\, \Omega_2(E - E_1)}{\Omega_{1,2}(E)}$$

and

$$\Omega_{1,2}(E) = \mathrm{Tr}\, \delta(H_1 + H_2 - E) \equiv \int dE_1\, \mathrm{Tr}\bigl(\delta(H_1 + H_2 - E)\, \delta(H_1 - E_1)\bigr) = \int dE_1\, \mathrm{Tr}\bigl(\delta(H_2 - E + E_1)\, \delta(H_1 - E_1)\bigr) = \int dE_1\, \Omega_1(E_1)\, \Omega_2(E - E_1) .$$

Here, we have used the fact that for the non-overlapping subsystems 1 and 2, the traces Tr1 and Tr2 taken over parts 1 and 2 are independent, and the states must be symmetrized (or antisymmetrized) only within the subsystems.

(iii) We recall that for quantum-mechanical particles which are in non-overlapping states (wavefunctions), the symmetrization (or antisymmetrization) has no effect on expectation values, and that therefore, in this situation, the symmetrization does not need to be carried out at all.⁷ More precisely: if one considers the matrix elements of operators which act only on subsystem 1, their values are the same independently of whether one takes the existence of subsystem 2 into account, or bases the calculation on the (anti-)symmetrized state of the overall system.

⁷ See e.g. G. Baym, Lectures on Quantum Mechanics (W.A. Benjamin, New York, Amsterdam 1969), p. 393.

2.4.2 On the Widths of the Distribution Functions of Macroscopic Quantities

2.4.2.1 The Ideal Gas

For the ideal gas, from (2.2.18) one finds the following expression for the probability density of the energy E1, Eq. (2.4.2a), with E2 = E − E1:

$$\omega(E_1) \propto (E_1/N_1)^{3N_1/2}\, (E_2/N_2)^{3N_2/2} . \tag{2.4.6}$$

In equilibrium, from the equality of the temperatures [Eq. (2.4.3)], i.e. from $\frac{\partial S(\tilde{E}_1)}{\partial E_1} = \frac{\partial S(\tilde{E}_2)}{\partial E_2}$, we obtain the condition $\frac{N_1}{\tilde{E}_1} = \frac{N_2}{E - \tilde{E}_1}$ and thus

$$\tilde{E}_1 = E\, \frac{N_1}{N_1 + N_2} . \tag{2.4.7}$$

If we expand the distribution function ω(E1) around the most probable energy value $\tilde{E}_1$, using $\left.\frac{d\omega(E_1)}{dE_1}\right|_{\tilde{E}_1} = 0$ and terminating the expansion after the quadratic term, we find

$$\log \omega(E_1) = \log \omega(\tilde{E}_1) + \frac{1}{2} \left(-\frac{3}{2} \frac{N_1}{\tilde{E}_1^2} - \frac{3}{2} \frac{N_2}{\tilde{E}_2^2}\right) \bigl(E_1 - \tilde{E}_1\bigr)^2 ,$$

and therefore

$$\omega(E_1) = \omega(\tilde{E}_1)\, e^{-\frac{3}{4} \frac{N_1 + N_2}{\tilde{E}_1 \tilde{E}_2} (E_1 - \tilde{E}_1)^2} = \omega(\tilde{E}_1)\, e^{-\frac{3}{4} \frac{N}{N_1 N_2 \bar{e}^2} (E_1 - \tilde{E}_1)^2} , \tag{2.4.8}$$

where

$$\frac{N_1}{\tilde{E}_1^2} + \frac{N_2}{\tilde{E}_2^2} = \frac{N_2}{\tilde{E}_1 \tilde{E}_2} + \frac{N_1}{\tilde{E}_1 \tilde{E}_2} = \frac{N}{\tilde{E}_1 \tilde{E}_2}$$

and $\bar{e} = E/N$ were used. Here, log ω(E1) rather than ω(E1) was expanded, because of the occurrence of the powers of the particle numbers N1 and N2 in Eq. (2.4.6). This is also preferable since it permits the coefficients of the Taylor expansion to be expressed in terms of derivatives of the entropy. From (2.4.8), we obtain the relative mean square deviation:

$$\frac{\langle (E_1 - \tilde{E}_1)^2 \rangle}{\tilde{E}_1^2} = \frac{2}{3} \frac{\tilde{E}_1 \tilde{E}_2}{N \tilde{E}_1^2} = \frac{2}{3} \frac{N_2}{N_1 (N_1 + N_2)} \approx 10^{-20} \tag{2.4.9}$$

and the relative width of the distribution, with N2 ≈ N1,

$$\frac{\Delta E_1}{\tilde{E}_1} \sim \frac{1}{\sqrt{N}} . \tag{2.4.10}$$

For macroscopic systems, the distribution is very sharp. The most probable state occurs with a stupendously high probability. The sharpness of the distribution function becomes even more apparent if one expresses it in terms of the energy per particle, e1 = E1/N1, including the normalization factor:

$$\omega_{e_1}(e_1) = \sqrt{\frac{3 N N_1}{4\pi N_2}}\; \bar{e}^{-1}\, e^{-\frac{3}{4} \frac{N N_1}{N_2} \frac{(e_1 - \tilde{e}_1)^2}{\bar{e}^2}} .$$
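The quality of the Gaussian approximation (2.4.8) can be examined numerically. The sketch below evaluates the logarithm of the unnormalized distribution (2.4.6) at the most probable energy (2.4.7) and at one standard deviation from it, where the standard deviation is taken from (2.4.9); for a Gaussian, the logarithm should drop by 1/2. The particle numbers, the total energy, and the function names are assumptions chosen for illustration.

```python
import math

def log_omega(E1, E, N1, N2):
    """log of the unnormalized energy distribution (2.4.6), with E2 = E - E1."""
    return 1.5 * N1 * math.log(E1 / N1) + 1.5 * N2 * math.log((E - E1) / N2)

def most_probable_energy(E, N1, N2):
    """Eq. (2.4.7)."""
    return E * N1 / (N1 + N2)

def relative_width(N1, N2):
    """Square root of the relative mean square deviation, Eq. (2.4.9)."""
    return math.sqrt(2.0 * N2 / (3.0 * N1 * (N1 + N2)))

N1 = N2 = 10**6
E = 3.0e6                                   # total energy in arbitrary units
E1_tilde = most_probable_energy(E, N1, N2)
sigma = relative_width(N1, N2) * E1_tilde   # absolute width of the distribution
drop = log_omega(E1_tilde, E, N1, N2) - log_omega(E1_tilde + sigma, E, N1, N2)
print("relative width:", relative_width(N1, N2))
print("drop of log(omega) at one sigma:", drop)   # close to 1/2 for a Gaussian
```

Already for 10⁶ particles per subsystem the relative width is below 10⁻³; for macroscopic particle numbers it becomes correspondingly smaller, in line with (2.4.10).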

2.4.2.2 A General Interacting System

For interacting systems, the following holds quite generally. Consider an arbitrary quantity A which can be written as a volume integral over a density A(x),

$$A = \int_V d^3x\, A(\mathbf{x}) . \tag{2.4.11}$$

Its average value depends on the volume as

$$\langle A \rangle = \int_V d^3x\, \langle A(\mathbf{x}) \rangle \sim V . \tag{2.4.12}$$

The mean square deviation is given by

$$(\Delta A)^2 = \bigl\langle (A - \langle A \rangle)(A - \langle A \rangle) \bigr\rangle = \int_V d^3x \int_V d^3x'\, \Bigl\langle \bigl(A(\mathbf{x}) - \langle A(\mathbf{x}) \rangle\bigr) \bigl(A(\mathbf{x}') - \langle A(\mathbf{x}') \rangle\bigr) \Bigr\rangle \propto V l^3 . \tag{2.4.13}$$

Both the integrals in (2.4.13) are to be taken over the volume V. The correlation function in the integral however vanishes for |x − x′| > l, where l is the range of the interactions (the correlation length). The latter is finite and thus the mean square deviation is likewise only of the order of V and not, as one might perhaps naively expect, quadratic in V. The relative deviation of A is therefore given by

$$\frac{\Delta A}{\langle A \rangle} \sim \frac{1}{V^{1/2}} . \tag{2.4.14}$$

2.4.3 External Parameters: Pressure

Let the Hamiltonian of a system depend upon an external parameter a: H = H(a). This external parameter can for example be the volume V of the system. Using the volume in phase space, $\bar{\Omega}$, we can derive an expression for the total differential of the entropy dS. Starting from the phase-space volume

$$\bar{\Omega}(E, a) = \int d\Gamma\, \Theta\bigl(E - H(a)\bigr) , \tag{2.4.15}$$

we take its total differential

$$d\bar{\Omega}(E, a) = \int d\Gamma\, \delta\bigl(E - H(a)\bigr) \left(dE - \frac{\partial H}{\partial a}\, da\right) = \Omega(E, a) \left(dE - \left\langle \frac{\partial H}{\partial a} \right\rangle da\right) , \tag{2.4.16}$$

or

$$d \log \bar{\Omega} = \frac{\Omega}{\bar{\Omega}} \left(dE - \left\langle \frac{\partial H}{\partial a} \right\rangle da\right) . \tag{2.4.17}$$

We now insert $S(E, a) = k \log \bar{\Omega}(E, a)$ and (2.4.4), obtaining

$$dS = \frac{1}{T} \left(dE - \left\langle \frac{\partial H}{\partial a} \right\rangle da\right) . \tag{2.4.18}$$

From (2.4.18), we can read off the partial derivatives of the entropy in terms of E and a:⁸

$$\left(\frac{\partial S}{\partial E}\right)_a = \frac{1}{T} ; \qquad \left(\frac{\partial S}{\partial a}\right)_E = -\frac{1}{T} \left\langle \frac{\partial H}{\partial a} \right\rangle . \tag{2.4.19}$$

Introduction of the pressure (special case: a = V): After the preceding considerations, we can turn to the derivation of pressure within the framework of statistical mechanics. We refer to Fig. 2.6 as a guide to this procedure. A movable piston at a distance L from the origin of the coordinate system permits variations in the volume V = LA, where A is the cross-sectional area of the piston. The influence of the walls of the container is represented by a wall potential. Let the spatial coordinate of the ith particle in the direction perpendicular to the piston be $x_i$. Then the total wall potential is given by

$$V_{\mathrm{wall}} = \sum_{i=1}^{N} v(x_i - L) . \tag{2.4.20}$$

Fig. 2.6. The deﬁnition of pressure

Here, $v(x_i - L)$ is equal to zero for $x_i < L$ and is very large for $x_i \ge L$, so that penetration of the wall by the gas particles is prevented. We then obtain for the force on the molecules

$$F = \sum_i F_i = \sum_i \left(-\frac{\partial v}{\partial x_i}\right) = \frac{\partial}{\partial L} \sum_i v(x_i - L) = \frac{\partial H}{\partial L} . \tag{2.4.21}$$

⁸ The symbol $\left(\frac{\partial S}{\partial E}\right)_a$ denotes the partial derivative of S with respect to the energy E, holding a constant, etc.

The pressure is defined as the average force per unit area which the molecules exert upon the wall, from which we find using (2.4.21) that

$$P \equiv -\frac{\langle F \rangle}{A} = -\left\langle \frac{\partial H}{\partial V} \right\rangle . \tag{2.4.22}$$

In this case, the general relations (2.4.18) and (2.4.19) become

$$dS = \frac{1}{T} (dE + P\, dV) \tag{2.4.23}$$

and

$$\frac{1}{T} = \left(\frac{\partial S}{\partial E}\right)_V , \qquad \frac{P}{T} = \left(\frac{\partial S}{\partial V}\right)_E . \tag{2.4.24}$$

Solving (2.4.23) for dE, we obtain

$$dE = T\, dS - P\, dV , \tag{2.4.25}$$

a relation which we will later identify as the First Law of Thermodynamics [for a constant particle number; see Eqs. (3.1.3) and (3.1.3′)]. Comparison with phenomenological thermodynamics gives an additional justification for the identification of T with the temperature. As a result of

$$-P\, dV = \frac{\langle F \rangle}{A}\, dV = \langle F \rangle\, dL \equiv \delta W ,$$

the last term in (2.4.25) denotes the work δW which is performed on the system causing the change in volume.

We are now interested in the pressure distribution in two subsystems, which are separated from each other by a movable partition, keeping the particle numbers in each subsystem constant (Fig. 2.6′). The energies and volumes are additive:

$$E = E_1 + E_2 , \qquad V = V_1 + V_2 . \tag{2.4.26}$$

The probability that subsystem 1 has the energy E1 and the volume V1 is given by

$$\omega(E_1, V_1) = \int d\Gamma_1\, d\Gamma_2\, \frac{\delta(H_1 + H_2 - E)}{\Omega_{1,2}(E, V)}\, \delta(H_1 - E_1)\, \Theta(q_1 \in V_1)\, \Theta(q_2 \in V_2) = \frac{\Omega_1(E_1, V_1)\, \Omega_2(E_2, V_2)}{\Omega_{1,2}(E, V)} . \tag{2.4.27a}$$

Fig. 2.6′. Two systems which are isolated from the external environment, separated by a movable wall which permits the exchange of energy.

In (2.4.27a), the function Θ(q1 ∈ V1) means that all the spatial coordinates of the sub-phase space 1 are limited to the volume V1, and correspondingly for Θ(q2 ∈ V2). Here, both E1 and V1 are statistical variables, while in (2.4.2b), V1 was a fixed parameter. Therefore, the normalization factor is given here by

$$\Omega_{1,2}(E, V) = \int dE_1 \int dV_1\, \Omega_1(E_1, V_1)\, \Omega_2(E - E_1, V - V_1) . \tag{2.4.27b}$$

In analogy to (2.4.3), the most probable state of the two systems is found by the condition of vanishing derivatives of (2.4.27a):

$$\frac{\partial \omega(E_1, V_1)}{\partial E_1} = 0 \qquad \text{and} \qquad \frac{\partial \omega(E_1, V_1)}{\partial V_1} = 0 .$$

From this, it follows that

$$\frac{\partial}{\partial E_1} \log \Omega_1(E_1, V_1) = \frac{\partial}{\partial E_2} \log \Omega_2(E_2, V_2) \;\Rightarrow\; T_1 = T_2$$

and

$$\frac{\partial}{\partial V_1} \log \Omega_1(E_1, V_1) = \frac{\partial}{\partial V_2} \log \Omega_2(E_2, V_2) \;\Rightarrow\; P_1 = P_2 . \tag{2.4.28}$$

In systems which are separated by a movable wall and can exchange energy, the equilibrium temperatures and pressures are equal. The microcanonical density matrix evidently depends on the energy E and on the volume V, as well as on the particle number N. If we regard these parameters likewise as variables, then the overall variation of S must be replaced by

$$dS = \frac{1}{T}\, dE + \frac{P}{T}\, dV - \frac{\mu}{T}\, dN . \tag{2.4.29}$$

Here, we have defined the chemical potential μ by

$$\mu = -kT\, \frac{\partial}{\partial N} \log \Omega(E, V, N) . \tag{2.4.30}$$

The chemical potential is related to the fractional change in the number of accessible states with respect to the change in the number of particles. Physically, its meaning is the change in energy per particle added to the system, as can be seen from (2.4.29) by solving that expression for dE.

2.5 Thermodynamic Properties of Some Non-interacting Systems

Now that we have introduced the thermodynamic concepts of temperature and pressure, we are in a position to discuss further the examples of a classical ideal gas, quantum-mechanical oscillators, and non-interacting spins treated in Sect. 2.2.2. In the following, we will derive the thermodynamic consequences of the phase-space surface or number of states Ω(E) which we calculated there for those examples.

2.5.1 The Ideal Gas

We first calculate the thermodynamic quantities introduced in the preceding sections for the case of an ideal gas. In (2.2.16), we found the phase-space volume in the limit of a large number of particles:

$$\bar{\Omega}(E) \equiv \int d\Gamma\, \Theta\bigl(E - H(q, p)\bigr) = \left(\frac{V}{N}\right)^{N} \left(\frac{4\pi m E}{3 N h^2}\right)^{3N/2} e^{5N/2} . \tag{2.2.16}$$

If we insert (2.2.16) into (2.3.6), we obtain the entropy as a function of the energy and the volume:

$$S(E, V) = kN \log\left[\frac{V}{N} \left(\frac{4\pi m E}{3 N h^2}\right)^{3/2} e^{5/2}\right] . \tag{2.5.1}$$

Eq. (2.5.1) is called the Sackur–Tetrode equation. It represents the starting point for the calculation of the temperature and the pressure. The temperature is, from (2.4.4), defined as the reciprocal of the partial energy derivative of the entropy, $T^{-1} = \left(\frac{\partial S}{\partial E}\right)_V = kN\, \frac{3}{2}\, E^{-1}$, from which the caloric equation of state of the ideal gas follows immediately:

$$E = \frac{3}{2} N k T . \tag{2.5.2}$$

With (2.5.2), we can also find the entropy (2.5.1) as a function of T and V:

$$S(T, V) = kN \log\left[\frac{V}{N} \left(\frac{2\pi m k T}{h^2}\right)^{3/2} e^{5/2}\right] . \tag{2.5.3}$$

The pressure is obtained from (2.4.24) by taking the volume derivative of (2.5.1):

$$P = T \left(\frac{\partial S}{\partial V}\right)_E = \frac{kTN}{V} . \tag{2.5.4}$$

This is the thermal equation of state of the ideal gas, which is often written in the form

$$PV = NkT . \tag{2.5.4'}$$
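The step from the Sackur–Tetrode equation to the two equations of state can be mirrored numerically by differentiating (2.5.1) with central finite differences. The sketch below assumes units with m = h = k = 1 and arbitrarily chosen values of E, V, and N; the function names and step sizes are illustrative.

```python
import math

def sackur_tetrode(E, V, N, m=1.0, h=1.0, k=1.0):
    """Entropy of the classical ideal gas, Eq. (2.5.1)."""
    return k * N * math.log((V / N) * (4.0 * math.pi * m * E / (3.0 * N * h**2)) ** 1.5
                            * math.exp(2.5))

def temperature(E, V, N, dE=1e-4):
    """T = (dS/dE)^(-1) at fixed V, Eq. (2.4.24), by a central difference."""
    dS = sackur_tetrode(E + dE, V, N) - sackur_tetrode(E - dE, V, N)
    return 2.0 * dE / dS

def pressure(E, V, N, dV=1e-4):
    """P = T * dS/dV at fixed E, Eq. (2.4.24), by a central difference."""
    dS = sackur_tetrode(E, V + dV, N) - sackur_tetrode(E, V - dV, N)
    return temperature(E, V, N) * dS / (2.0 * dV)

E, V, N = 150.0, 50.0, 100
T = temperature(E, V, N)
P = pressure(E, V, N)
print("T =", T, " compare 2E/(3N) =", 2.0 * E / (3.0 * N))   # caloric equation (2.5.2)
print("P =", P, " compare N*T/V  =", N * T / V)              # thermal equation (2.5.4)
```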

The implications of the thermal equation of state are summarized in the diagrams of Fig. 2.7: Fig. 2.7a shows the P V T surface or surface of the equation of state, i.e. the pressure as a function of V and T . Figs. 2.7b,c,d are projections onto the P V -, the T V - and the P T -planes. In these diagrams, the isotherms (T = const), the isobars (P = const), and the isochores (V = const) are illustrated. These curves are also drawn in on the P V T surface (Fig. 2.7a). Remarks: (i) It can be seen from (2.5.2) that the temperature increases with the energy content of the ideal gas, in accord with the usual concept of temperature. (ii) The equation of state (2.5.4) also provides us with the possibility of measuring the temperature. The determination of the temperature of an ideal gas can be achieved by measuring its volume and its pressure.

Fig. 2.7. The equation of state of the ideal gas: (a) surface of the equation of state, (b) P -V diagram, (c) T -V diagram, (d) P -T diagram

The temperature of any given body can be determined by bringing it into thermal contact with an ideal gas and making use of the fact that the two temperatures will equalize [Eq. (2.4.5)]. The relative sizes of the two systems (body and thermometer) must of course be chosen so that contact with the ideal gas changes the temperature of the body being investigated by only a negligible amount.

∗2.5.2 Non-interacting Quantum Mechanical Harmonic Oscillators and Spins

2.5.2.1 Harmonic Oscillators

From (2.2.31) and (2.3.6), it follows for the entropy of non-coupled harmonic oscillators with e = E/N that

$$S(E) = kN \left[\frac{e + \frac{1}{2}\hbar\omega}{\hbar\omega} \log\frac{e + \frac{1}{2}\hbar\omega}{\hbar\omega} - \frac{e - \frac{1}{2}\hbar\omega}{\hbar\omega} \log\frac{e - \frac{1}{2}\hbar\omega}{\hbar\omega}\right] , \tag{2.5.5}$$

where a logarithmic term has been neglected. From Eq. (2.4.4), we obtain for the temperature

$$T = \left(\frac{\partial S}{\partial E}\right)^{-1} = \frac{\hbar\omega}{k} \left(\log\frac{e + \frac{1}{2}\hbar\omega}{e - \frac{1}{2}\hbar\omega}\right)^{-1} . \tag{2.5.6}$$

From this, it follows via $\frac{E + \frac{1}{2} N\hbar\omega}{E - \frac{1}{2} N\hbar\omega} = e^{\hbar\omega/kT}$ that the energy as a function of the temperature is given by

$$E = N\hbar\omega \left\{\frac{1}{2} + \frac{1}{e^{\hbar\omega/kT} - 1}\right\} . \tag{2.5.7}$$

The energy increases monotonically with the temperature (Fig. 2.8). Limiting cases: For $E \to N\hbar\omega/2$ (the minimal energy), we find

$$T \to \frac{1}{\log \infty} = 0 , \tag{2.5.8a}$$

and for E → ∞

$$T \to \frac{1}{\log 1} = \infty . \tag{2.5.8b}$$

We can also see that for T → 0, the heat capacity tends to zero: $C_V = \left(\frac{\partial E}{\partial T}\right)_V \to 0$; this is in agreement with the Third Law of Thermodynamics.
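The limiting behaviour of (2.5.7) can be illustrated numerically. The sketch below uses units with ħω = k = 1 and one oscillator (N = 1); the expression used for the heat capacity is obtained here by differentiating (2.5.7) with respect to T and is not quoted from the text. For very small T the exponential overflows in floating-point arithmetic, so the sample temperatures are kept moderate.

```python
import math

def energy(T, N=1.0, hbar_omega=1.0, k=1.0):
    """Energy of N uncoupled oscillators, Eq. (2.5.7); expm1(x) = exp(x) - 1."""
    return N * hbar_omega * (0.5 + 1.0 / math.expm1(hbar_omega / (k * T)))

def heat_capacity(T, N=1.0, hbar_omega=1.0, k=1.0):
    """C = dE/dT from (2.5.7): N k x^2 e^x / (e^x - 1)^2 with x = hbar*omega / kT."""
    x = hbar_omega / (k * T)
    return N * k * x**2 * math.exp(x) / math.expm1(x) ** 2

for T in (0.05, 1.0, 20.0):
    print(T, energy(T), heat_capacity(T))
```

At low temperatures the energy approaches the zero-point value Nħω/2 and the heat capacity vanishes, while at high temperatures the heat capacity approaches Nk.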

2.5.2.2 A Paramagnetic Spin-1/2 System

Finally, we consider a system of N magnetic moments with spin 1/2 which do not interact with each other; or, more generally, a system of non-interacting two-level systems. We refer here to Sect. 2.2.3.2. From (2.2.35), the entropy of such a system is given by

Fig. 2.8. Non-coupled harmonic oscillators: the energy as a function of the temperature.

$$S(E) = \frac{kN}{2} \left\{-(1 + e) \log\frac{1 + e}{2} - (1 - e) \log\frac{1 - e}{2}\right\} \tag{2.5.9}$$

with e = E/Nh. From this, we find for the temperature:

$$T = \left(\frac{\partial S}{\partial E}\right)^{-1} = \frac{2h}{k} \left(\log\frac{1 - e}{1 + e}\right)^{-1} . \tag{2.5.10}$$

The entropy is shown as a function of the energy in Fig. 2.9, and the temperature as a function of the energy in Fig. 2.10. The ground-state energy is E0 = −Nh. For E → −Nh, we find from (2.5.10)

$$\lim_{E \to -Nh} T = 0 . \tag{2.5.11}$$

The temperature increases monotonically with increasing energy, beginning at E0 = −Nh, until E = 0 is reached; this is the state in which the magnetic moments are completely disordered, i.e. there are just as many oriented parallel as antiparallel to the applied magnetic field h. The region E > 0, in which the temperature is negative (!), will be discussed later in Sect. 6.7.2.
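The sign change of the temperature at E = 0 can be read off directly from (2.5.10). The following minimal sketch uses units with h = k = 1; the function name and the sample values of e are illustrative.

```python
import math

def spin_temperature(e, h=1.0, k=1.0):
    """Temperature of the spin-1/2 paramagnet, Eq. (2.5.10).
    e = E/(N h) must satisfy -1 < e < 1 and e != 0 (T diverges at e = 0)."""
    return (2.0 * h / k) / math.log((1.0 - e) / (1.0 + e))

for e in (-0.99, -0.5, -0.01, 0.01, 0.5, 0.99):
    print(e, spin_temperature(e))
```

For e < 0 the temperature is positive and grows without bound as e approaches 0 from below; for e > 0 it is negative, with the same magnitude at ±e.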

Fig. 2.9. The entropy as a function of the energy for a two-level system (spin-1/2 paramagnet)

Fig. 2.10. The temperature as a function of the energy for a two-level system (spin-1/2 paramagnet)

2.6 The Canonical Ensemble

In this section, the properties of a small subsystem 1 which is embedded in a large system 2, the heat bath,⁹ will be investigated (Fig. 2.11). We first need to construct the density matrix, which we will derive from quantum mechanics in the following section. The overall system is taken to be isolated, so that it is described by a microcanonical ensemble.

Fig. 2.11. A canonical ensemble. Subsystem 1 is in contact with the heat bath 2. The overall system is isolated.

2.6.1 The Density Matrix

The Hamiltonian of the total system

$$H = H_1 + H_2 + W \approx H_1 + H_2 \tag{2.6.1}$$

is the sum of the Hamiltonians H1 and H2 for systems 1 and 2 and the interaction term W. The latter is in fact necessary so that the two subsystems can come to equilibrium with each other; however, W is negligibly small compared to H1 and H2. Our goal is the derivation of the density matrix for subsystem 1 alone. We will give two derivations here, of which the second is shorter, but the first is more useful for the introduction of the grand canonical ensemble in the next section.

(i) Let $P_{E_{1n}}$ be the probability that subsystem 1 is in state n with an energy eigenvalue $E_{1n}$. Then for $P_{E_{1n}}$, using the microcanonical distribution for the total system, we find

$$P_{E_{1n}} = {\sum}' \frac{1}{\Omega_{1,2}(E)\Delta} = \frac{\Omega_2(E - E_{1n})}{\Omega_{1,2}(E)} . \tag{2.6.2}$$

The sum runs over all the states of subsystem 2 whose energy $E_{2n}$ lies in the interval $E - E_{1n} \le E_{2n} \le E + \Delta - E_{1n}$. In the case that subsystem 1 is very much smaller than subsystem 2, we can expand the logarithm of $\Omega_2(E - E_{1n})$ in $E_{1n}$:

⁹ A heat bath (or thermal reservoir) is a system which is so large that adding or subtracting a finite amount of energy to it does not change its temperature.

2.6 The Canonical Ensemble

˜1 + E ˜1 − E1n ) Ω2 (E − E Ω1,2 (E) ˜1 ) ˜ Ω2 (E − E e(E1 −E1n )/kT = Z −1 e−E1n /kT . ≈ Ω1,2 (E)

51

PE1n =

(2.6.3)

−1 ∂ This expression contains T = k ∂E log Ω2 (E − E˜1 ) , the temperature of the heat bath. The normalization factor Z, from (2.6.3), is given by Z=

Ω1,2 (E) −E˜1 /kT . e ˜2 ) Ω2 (E

(2.6.4)

However, it is important that $Z$ can be calculated directly from the properties of subsystem 1. The condition that the sum over all the $P_{E_{1n}}$ must be equal to 1 implies that

$$Z = \sum_n e^{-E_{1n}/kT} = \mathrm{Tr}_1\, e^{-H_1/kT} . \tag{2.6.5}$$

$Z$ is termed the partition function. The canonical density matrix is then given by the following equivalent representations:

$$\rho_C = \sum_n P_{E_{1n}}\, |n\rangle\langle n| = Z^{-1} \sum_n e^{-E_{1n}/kT}\, |n\rangle\langle n| = Z^{-1} e^{-H_1/kT} . \tag{2.6.6}$$

(ii) The second derivation starts with the fact that the density matrix $\rho$ for subsystem 1 can be obtained from the microcanonical density matrix by taking the trace over the degrees of freedom of system 2:

$$\rho_C = \mathrm{Tr}_2\, \rho_{MC} = \mathrm{Tr}_2\, \frac{\delta(H_1 + H_2 - E)}{\Omega_{1,2}(E)} = \frac{\Omega_2(E - H_1)}{\Omega_{1,2}(E)} \equiv \frac{\Omega_2(E - \tilde E_1 + \tilde E_1 - H_1)}{\Omega_{1,2}(E)} \approx \frac{\Omega_2(E - \tilde E_1)}{\Omega_{1,2}(E)}\, e^{(\tilde E_1 - H_1)/kT} . \tag{2.6.7}$$

This derivation is valid both in classical physics and in quantum mechanics, as is shown specifically in (2.6.9). Thus we have also demonstrated the validity of (2.6.6) with the definition (2.6.5) by this second route. Expectation values of observables $A$ which act only on the states of subsystem 1 are given by

$$\langle A \rangle = \mathrm{Tr}_1 \mathrm{Tr}_2\, \rho_{MC} A = \mathrm{Tr}_1\, \rho_C A . \tag{2.6.8}$$
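As a numerical illustration of (2.6.5), (2.6.6) and (2.6.8), the following minimal sketch evaluates the canonical probabilities and an expectation value for a small discrete spectrum. The three energy levels, the value of kT, and the function names are assumptions chosen only for illustration (units with k = 1); the observable is taken to be diagonal in the energy basis.

```python
import math

def canonical_probabilities(energies, kT):
    """Return (Z, [P_n]) with P_n = exp(-E_n/kT) / Z for discrete levels E_n."""
    # Shift by the lowest level for numerical stability; the shift cancels in P_n.
    e0 = min(energies)
    weights = [math.exp(-(e - e0) / kT) for e in energies]
    z_shifted = sum(weights)
    probs = [w / z_shifted for w in weights]
    # Undo the shift so that Z = sum_n exp(-E_n/kT), as in (2.6.5).
    z = z_shifted * math.exp(-e0 / kT)
    return z, probs

def expectation(values, probs):
    """<A> = sum_n P_n A_n for an observable that is diagonal in the energy basis."""
    return sum(a * p for a, p in zip(values, probs))

# Hypothetical three-level subsystem, energies in units where k = 1.
levels = [0.0, 1.0, 2.5]
Z, P = canonical_probabilities(levels, kT=0.8)
mean_energy = expectation(levels, P)
```

The probabilities sum to one by construction, and the ratio of two of them is the Boltzmann factor of the energy difference; the heat bath enters only through kT.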

Remarks:

(i) The classical distribution function: The classical distribution function of subsystem 1 is obtained by integration of $\rho_{MC}$ over $\Gamma_2$:

$$\rho_C(q_1, p_1) = \int d\Gamma_2\, \rho_{MC} = \frac{1}{\Omega_{1,2}(E)} \int d\Gamma_2\, \delta\big(E - H_1(q_1, p_1) - H_2(q_2, p_2)\big) = \frac{\Omega_2\big(E - H_1(q_1, p_1)\big)}{\Omega_{1,2}(E)} . \tag{2.6.9}$$

If we expand the logarithm of this expression with respect to $H_1$, we obtain

$$\rho_C(q_1, p_1) = Z^{-1} e^{-H_1(q_1, p_1)/kT} \tag{2.6.10a}$$

$$Z = \int d\Gamma_1\, e^{-H_1(q_1, p_1)/kT} . \tag{2.6.10b}$$

Here, $Z$ is called the partition function. Mean values of observables $A(q_1, p_1)$ which refer only to subsystem 1 are calculated in the classical case by means of

$$\langle A \rangle = \int d\Gamma_1\, \rho_C(q_1, p_1)\, A(q_1, p_1) , \tag{2.6.10c}$$

as one finds analogously to (2.6.8).

(ii) The energy distribution: The energy distribution $\omega(E_1)$ introduced in Sect. 2.4.1 can also be calculated classically and quantum mechanically within the framework of the canonical ensemble (see problem 2.7):

$$\omega(E_1) = \frac{1}{\Delta_1} \int_{E_1}^{E_1 + \Delta_1} dE_1' \sum_n \delta(E_1' - E_{1n})\, P_{E_{1n}} \approx \frac{\Omega_2(E - E_1)}{\Omega_{1,2}(E)}\, \frac{1}{\Delta_1} \sum_n 1 = \frac{\Omega_2(E - E_1)\, \Omega_1(E_1)}{\Omega_{1,2}(E)} . \tag{2.6.11}$$

This expression agrees with (2.4.2a).

(iii) The partition function (2.6.5) can also be written as follows:

$$Z = \int dE_1\, \mathrm{Tr}_1\, e^{-H_1/kT}\, \delta(H_1 - E_1) = \int dE_1\, \mathrm{Tr}_1\, e^{-E_1/kT}\, \delta(H_1 - E_1) = \int dE_1\, e^{-E_1/kT}\, \Omega_1(E_1) . \tag{2.6.12}$$

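(iv) In the derivation of the canonical density matrix, Eq. (2.6.7), we expanded the logarithm of $\Omega_2(E - H_1)$. We show that it was justified to terminate this expansion after the first term of the Taylor series:

$$\begin{aligned}
\Omega_2(E - H_1) &= \Omega_2\big(E - \tilde E_1 - (H_1 - \tilde E_1)\big) \\
&= \Omega_2(E - \tilde E_1)\, \exp\left[ -\frac{1}{kT}(H_1 - \tilde E_1) + \frac{1}{2k} \frac{\partial (1/T)}{\partial \tilde E_2} (H_1 - \tilde E_1)^2 + \dots \right] \\
&= \Omega_2(E - \tilde E_1)\, \exp\left[ -\frac{1}{kT}(H_1 - \tilde E_1) - \frac{1}{2kT^2} \frac{\partial T}{\partial \tilde E_2} (H_1 - \tilde E_1)^2 + \dots \right] \\
&= \Omega_2(E - \tilde E_1)\, \exp\left[ -\frac{1}{kT}(H_1 - \tilde E_1) \left( 1 + \frac{1}{2TC}(H_1 - \tilde E_1) + \dots \right) \right] ,
\end{aligned}$$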

where $C$ is the heat capacity of the thermal bath. Since, owing to the large size of the thermal bath, $(H_1 - \tilde E_1) \ll TC$ holds (to be regarded as an inequality for the eigenvalues), it is in fact justified to ignore the higher-order corrections in the Taylor expansion.

(v) In later sections, we will be interested only in the (canonical) subsystem 1. The heat bath 2 enters merely through its temperature. We shall then leave off the index '1' from the relations derived in this section.

2.6.2 Examples: the Maxwell Distribution and the Barometric Pressure Formula

Suppose the subsystem to consist of one particle. The probability that its position and its momentum take on the values $\mathbf{x}$ and $\mathbf{p}$ is given by:

$$w(\mathbf{x}, \mathbf{p})\, d^3x\, d^3p = C\, e^{-\beta \left( \frac{p^2}{2m} + V(\mathbf{x}) \right)}\, d^3x\, d^3p . \tag{2.6.13}$$

Here, $\beta = \frac{1}{kT}$ and $V(\mathbf{x})$ refers to the potential energy, while $C = C' C''$ is a normalization factor.¹⁰ Integration over spatial coordinates gives the momentum distribution

$$w(\mathbf{p})\, d^3p = C'\, e^{-\beta \frac{p^2}{2m}}\, d^3p . \tag{2.6.14}$$

If we do not require the direction of the momentum, i.e. integrating over all angles, we obtain

$$w(p)\, dp = 4\pi C'\, e^{-\beta \frac{p^2}{2m}}\, p^2\, dp ; \tag{2.6.15}$$

this is the Maxwell velocity distribution. Integration of (2.6.13) over the momentum gives the spatial distribution:

$$w(\mathbf{x})\, d^3x = C''\, e^{-\beta V(\mathbf{x})}\, d^3x . \tag{2.6.16}$$

If we now set the potential $V(\mathbf{x})$ equal to the gravitational field $V(\mathbf{x}) = mgz$ and use the fact that the particle-number density is proportional to $w(\mathbf{x})$, we obtain [employing the equation of state for the ideal gas, (2.5.4′), which relates the pressure to the particle-number density] an expression for the altitude dependence of the pressure, the barometric pressure formula:

$$P(z) = P_0\, e^{-mgz/kT} \tag{2.6.17}$$

(cf. also problem 2.15).

¹⁰ $C' = \left( \frac{\beta}{2\pi m} \right)^{3/2}$ and $C'' = \left( \int d^3x\, e^{-\beta V(\mathbf{x})} \right)^{-1}$
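The distribution (2.6.15) and the barometric formula (2.6.17) can be checked numerically. The sketch below is only an illustration under stated assumptions: reduced units with m = kT = 1, a simple trapezoidal rule, and a momentum cutoff beyond which the Gaussian tail is taken to be negligible. The function names are hypothetical.

```python
import math

def maxwell_density(p, m, kT):
    """w(p) from (2.6.15): 4*pi*C' * exp(-p^2/(2 m kT)) * p^2, C' = (beta/(2 pi m))^(3/2)."""
    beta = 1.0 / kT
    c_prime = (beta / (2.0 * math.pi * m)) ** 1.5
    return 4.0 * math.pi * c_prime * math.exp(-beta * p * p / (2.0 * m)) * p * p

def integrate(f, a, b, n=20000):
    """Composite trapezoidal rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

m, kT = 1.0, 1.0                       # assumed reduced units
p_max = 12.0 * math.sqrt(m * kT)       # cutoff; the tail beyond it is negligible
norm = integrate(lambda p: maxwell_density(p, m, kT), 0.0, p_max)
mean_ekin = integrate(lambda p: p * p / (2.0 * m) * maxwell_density(p, m, kT), 0.0, p_max)

def barometric_pressure(z, p0, m_particle, g, kT_value):
    """P(z) = P0 * exp(-m g z / kT), Eq. (2.6.17)."""
    return p0 * math.exp(-m_particle * g * z / kT_value)
```

With these choices `norm` comes out close to 1 and `mean_ekin` close to 3kT/2, the value expected for three translational degrees of freedom.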

2.6.3 The Entropy of the Canonical Ensemble and Its Extremal Values

From Eq. (2.6.6), we find for the entropy of the canonical ensemble

$$S_C = -k \langle \log \rho_C \rangle = \frac{1}{T} \bar E + k \log Z \tag{2.6.18}$$

with

$$\bar E = \langle H \rangle . \tag{2.6.18′}$$

Now let $\rho$ correspond to a different distribution with the same average energy $\langle H \rangle = \bar E$; then the inequality

$$S[\rho] = -k\, \mathrm{Tr}\,(\rho \log \rho) \le -k\, \mathrm{Tr}\,(\rho \log \rho_C) = -k\, \mathrm{Tr} \left[ \rho \left( -\frac{H}{kT} - \log Z \right) \right] = \frac{1}{T} \langle H \rangle + k \log Z = S_C \tag{2.6.19}$$

results. Here, the inequality in (2.3.3) was used along with $\rho_1 = \rho_C$. The canonical ensemble has the greatest entropy of all ensembles with the same average energy.

2.6.4 The Virial Theorem and the Equipartition Theorem

2.6.4.1 The Classical Virial Theorem and the Equipartition Theorem

Now, we consider a classical system and combine its momenta and spatial coordinates into $x_i = p_i, q_i$. For the average value of the quantity $x_i \frac{\partial H}{\partial x_j}$ we find the following relation:

$$\left\langle x_i \frac{\partial H}{\partial x_j} \right\rangle = Z^{-1} \int d\Gamma\, x_i \frac{\partial H}{\partial x_j}\, e^{-H/kT} = Z^{-1} \int d\Gamma\, x_i\, (-kT)\, \frac{\partial e^{-H/kT}}{\partial x_j} = kT\, \delta_{ij} , \tag{2.6.20}$$

where we have carried out an integration by parts. We have assumed that $\exp(-H(p, q)/kT)$ drops off rapidly enough for large $p$ and $q$ so that no boundary terms occur. This is the case for the kinetic energy and potentials such as those of harmonic oscillators. In the general case, one would have to take the wall potential into account. Eq. (2.6.20) contains the classical virial theorem as a special case, as well as the equipartition theorem. Applying (2.6.20) to the spatial coordinates $q_i$, we obtain the classical virial theorem

$$\left\langle q_i \frac{\partial V}{\partial q_j} \right\rangle = kT\, \delta_{ij} . \tag{2.6.21}$$
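The virial theorem (2.6.21) can be verified numerically for a single coordinate. The following sketch is an illustration only: the harmonic and quartic potentials, the value of kT, the integration cutoff, and the function name are all assumptions, and the Boltzmann-weighted averages are evaluated with a plain trapezoidal rule.

```python
import math

def thermal_average(f, potential, kT, q_max=6.0, n=40000):
    """<f> = ∫ f(q) e^{-V(q)/kT} dq / ∫ e^{-V(q)/kT} dq on [-q_max, q_max]."""
    h = 2.0 * q_max / n
    num = 0.0
    den = 0.0
    for i in range(n + 1):
        q = -q_max + i * h
        w = math.exp(-potential(q) / kT)
        if i == 0 or i == n:
            w *= 0.5                  # trapezoidal end-point weights
        num += f(q) * w
        den += w
    return num / den

kT = 0.7                              # assumed temperature in units where k = 1

# Harmonic potential V = q^2/2, so q dV/dq = q^2.
harmonic = lambda q: 0.5 * q * q
virial_harmonic = thermal_average(lambda q: q * q, harmonic, kT)
mean_v_harmonic = thermal_average(harmonic, harmonic, kT)

# Quartic potential V = q^4, so q dV/dq = 4 q^4.
quartic = lambda q: q ** 4
virial_quartic = thermal_average(lambda q: 4.0 * q ** 4, quartic, kT)
mean_v_quartic = thermal_average(quartic, quartic, kT)
```

In both cases the virial average equals kT, but the mean potential energy is kT/2 only in the harmonic case; for the quartic potential it is kT/4, so the value kT/2 per degree of freedom is specific to quadratic terms.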

We now specialize to the case of harmonic oscillators, i.e.

$$V = \sum_i V_i \equiv \frac{m\omega^2}{2} \sum_i q_i^2 . \tag{2.6.22}$$

For this case, it follows from (2.6.21) that

$$\langle V_i \rangle = \frac{kT}{2} . \tag{2.6.23}$$

The potential energy of each degree of freedom has the average value $kT/2$. Applying (2.6.20) to the momenta, we find the equipartition theorem. We take as the kinetic energy the generalized quadratic form

$$E_{kin} = \sum_{i,k} a_{ik}\, p_i p_k , \quad \text{with } a_{ik} = a_{ki} . \tag{2.6.24}$$

For this form, we find $\frac{\partial E_{kin}}{\partial p_i} = \sum_k (a_{ik} p_k + a_{ki} p_k) = \sum_k 2 a_{ik} p_k$ and therewith, after multiplication by $p_i$ and summation over all $i$,

$$\sum_i p_i \frac{\partial E_{kin}}{\partial p_i} = \sum_{i,k} 2 a_{ik}\, p_i p_k = 2 E_{kin} . \tag{2.6.25}$$

Now we take the thermal average and find from (2.6.20)

$$\left\langle \sum_i p_i \frac{\partial H}{\partial p_i} \right\rangle = 2 \langle E_{kin} \rangle = 3NkT ; \tag{2.6.26}$$

i.e. the equipartition theorem. The average kinetic energy per degree of freedom is equal to $\frac{1}{2} kT$.

As previously mentioned, in the potential $V$, the interaction $\frac{1}{2} \sum_{m,n} v(|\mathbf{x}_{mn}|)$ (with $\mathbf{x}_{mn} = \mathbf{x}_m - \mathbf{x}_n$) of the particles with each other and in general their interaction with the wall, $V_{wall}$, must be taken into account. Then using (2.6.23) and (2.6.25), we find

$$PV = \frac{2}{3} \langle E_{kin} \rangle - \frac{1}{6} \left\langle \sum_{m,n} \frac{\partial v(|\mathbf{x}_{mn}|)}{\partial \mathbf{x}_{mn}} \cdot \mathbf{x}_{mn} \right\rangle . \tag{2.6.27}$$

The term $PV$ results from the wall potential. The second term on the right-hand side is called the 'virial' and can be expanded in powers of $\frac{N}{V}$ (virial expansion, see Sect. 5.3).

∗

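Proof of (2.6.27): We begin with the Hamiltonian

$$H = \sum_n \frac{\mathbf{p}_n^2}{2m} + \frac{1}{2} \sum_{n,m} v(\mathbf{x}_n - \mathbf{x}_m) + V_{wall} , \tag{2.6.28}$$

Fig. 2.12. Quantities related to the wall potential and the pressure: increasing the volume on displacing a wall by $\delta L_1$

and write for the pressure, using (2.4.22):

$$PV = -\left\langle \frac{\partial H}{\partial V} \right\rangle V = -\frac{V}{L_2 L_3} \left\langle \frac{\partial H}{\partial L_1} \right\rangle = -\frac{1}{3} \left\langle L_1 \frac{\partial H}{\partial L_1} + L_2 \frac{\partial H}{\partial L_2} + L_3 \frac{\partial H}{\partial L_3} \right\rangle . \tag{2.6.29}$$

Now, $V_{wall}$ has the form (cf. Fig. 2.12)

$$V_{wall} = V_\infty \sum_i \big\{ \Theta(x_{i1} - L_1) + \Theta(x_{i2} - L_2) + \Theta(x_{i3} - L_3) \big\} . \tag{2.6.30}$$

Here, $V_\infty$ characterizes the barrier represented by the wall. The kinetic energy of the particles is much smaller than $V_\infty$. Evidently, $\frac{\partial V_{wall}}{\partial L_1} = -V_\infty \sum_n \delta(x_{n1} - L_1)$ and therefore

$$\left\langle \sum_n x_{n1} \frac{\partial V_{wall}}{\partial x_{n1}} \right\rangle = \left\langle \sum_n x_{n1} V_\infty\, \delta(x_{n1} - L_1) \right\rangle = L_1 \left\langle \sum_n V_\infty\, \delta(x_{n1} - L_1) \right\rangle = -L_1 \left\langle \frac{\partial V_{wall}}{\partial L_1} \right\rangle = -L_1 \left\langle \frac{\partial H}{\partial L_1} \right\rangle . \tag{2.6.31}$$

With this, (2.6.29) can be put into the form

$$\begin{aligned}
PV &= \frac{1}{3} \left\langle \sum_{n,\alpha} x_{n\alpha} \frac{\partial}{\partial x_{n\alpha}} V_{wall} \right\rangle = kTN - \frac{1}{3} \left\langle \sum_{n,\alpha} x_{n\alpha} \frac{\partial}{\partial x_{n\alpha}} v \right\rangle \\
&= \frac{2}{3} \langle E_{kin} \rangle - \frac{1}{6} \left\langle \sum_\alpha \sum_{n \ne m} (x_{n\alpha} - x_{m\alpha}) \frac{\partial v}{\partial (x_{n\alpha} - x_{m\alpha})} \right\rangle .
\end{aligned} \tag{2.6.32}$$

In the first line, the virial theorem (2.6.21) was used, and we have abbreviated the sum of the pair potentials as $v$. In the second line, $kT$ was substituted by (2.6.26) and the derivative of the pair potentials was written out explicitly, whereby for example

$$\left( x_1 \frac{\partial}{\partial x_1} + x_2 \frac{\partial}{\partial x_2} \right) v(x_1 - x_2) = (x_1 - x_2) \frac{\partial v(x_1 - x_2)}{\partial (x_1 - x_2)}$$

was used, and $x_1$ ($x_2$) refers to the $x$ component of particle 1 (2). With (2.6.32), we have proven (2.6.27).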

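∗2.6.4.2 The Quantum-Statistical Virial Theorem

Starting from the Hamiltonian

$$H = \sum_n \frac{\mathbf{p}_n^2}{2m} + \sum_n V(\mathbf{x}_n - \mathbf{x}_{wall}) + \frac{1}{2} \sum_{n,m} v(\mathbf{x}_n - \mathbf{x}_m) , \tag{2.6.33}$$

it follows that¹¹

$$[H, \mathbf{x}_n \cdot \mathbf{p}_n] = -i\hbar \left( \frac{\mathbf{p}_n^2}{m} - \mathbf{x}_n \cdot \nabla_n V(\mathbf{x}_n - \mathbf{x}_{wall}) - \sum_{m \ne n} \mathbf{x}_n \cdot \nabla_n v(\mathbf{x}_n - \mathbf{x}_m) \right) . \tag{2.6.34}$$

Now, $\langle \psi | [H, \sum_n \mathbf{x}_n \cdot \mathbf{p}_n] | \psi \rangle = 0$ for energy eigenstates. We assume the density matrix to be diagonal in the basis of the energy eigenstates; from this, it follows that

$$2 \langle E_{kin} \rangle - \left\langle \sum_n \mathbf{x}_n \cdot \nabla_n V(\mathbf{x}_n - \mathbf{x}_{wall}) \right\rangle - \left\langle \sum_n \sum_{m \ne n} \mathbf{x}_n \cdot \nabla_n v(\mathbf{x}_n - \mathbf{x}_m) \right\rangle = 0 . \tag{2.6.35}$$

With (2.6.31), we again obtain the virial theorem immediately

$$2 \langle E_{kin} \rangle - 3PV - \frac{1}{2} \left\langle \sum_n \sum_m (\mathbf{x}_n - \mathbf{x}_m) \cdot \nabla v(\mathbf{x}_n - \mathbf{x}_m) \right\rangle = 0 . \tag{2.6.27}$$

Eq. (2.6.27) is called the virial theorem of quantum statistics. It holds both classically and quantum mechanically, while (2.6.21) and (2.6.26) are valid only classically. From the virial theorem (2.6.27), we find for ideal gases:

$$PV = \frac{2}{3} \langle E_{kin} \rangle = \frac{2}{3} \left\langle \sum_n \frac{m}{2} \mathbf{v}_n^2 \right\rangle = \frac{1}{3} mN \langle \mathbf{v}^2 \rangle . \tag{2.6.36}$$

For non-interacting classical particles, the mean squared velocity per particle, $\langle \mathbf{v}^2 \rangle$, can be computed using the Maxwell velocity distribution; then from (2.6.36), one again obtains the well-known equation of state of the classical ideal gas.

¹¹ See e.g. QM I, p. 218.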

2.6.5 Thermodynamic Quantities in the Canonical Ensemble

2.6.5.1 A Macroscopic System: The Equivalence of the Canonical and the Microcanonical Ensemble

We assume that the smaller subsystem is also a macroscopic system. Then it follows from the preceding considerations on the width of the energy distribution function $\omega(E_1)$ that the average value of the energy $\bar E_1$ is equal to the most probable value $\tilde E_1$, i.e.

$$\bar E_1 = \tilde E_1 . \tag{2.6.37}$$

We now wish to investigate how statements about thermodynamic quantities in the microcanonical and the canonical ensembles are related. To this end, we rewrite the partition function (2.6.4) in the following manner:

$$Z = \frac{\Omega_{1,2}(E)}{\Omega_1(\tilde E_1)\, \Omega_2(E - \tilde E_1)}\, \Omega_1(\tilde E_1)\, e^{-\tilde E_1/kT} = \omega(\tilde E_1)^{-1}\, \Omega_1(\tilde E_1)\, e^{-\tilde E_1/kT} . \tag{2.6.38}$$

According to (2.4.8), the typical $N_1$-dependence of $\omega(E_1)$ is given by

$$\omega(E_1) \sim N_1^{-\frac{1}{2}}\, e^{-\frac{3}{4} (E_1 - \tilde E_1)^2 / N_1 \bar e^2} , \tag{2.6.39}$$

with the normalization factor determined by the condition $\int dE_1\, \omega(E_1) = 1$. From (2.4.14), the $N_1$-dependence takes the form of Eq. (2.6.39) even for interacting systems. We thus find from (2.6.38) that

$$Z = e^{-\tilde E_1/kT}\, \Omega_1(\tilde E_1)\, \sqrt{N_1} . \tag{2.6.40}$$

Inserting this result into Eq. (2.6.18), we obtain the following expression for the canonical entropy [using (2.6.37) and neglecting terms of the order of $\log N_1$]:

$$S_C = \frac{1}{T} \left( \bar E_1 - \tilde E_1 + kT \log \Omega_1(\tilde E_1) \right) = S_{MC}(\tilde E_1) . \tag{2.6.41}$$

From (2.6.41) we can see that the entropy of the canonical ensemble is equal to that of a microcanonical ensemble with the energy $\tilde E_1$ ($= \bar E_1$). In both ensembles, one obtains identical results for the thermodynamic quantities.

2.6.5.2 Thermodynamic Quantities

We summarize here how various thermodynamic quantities can be calculated for the canonical ensemble. Since the heat bath enters only through its temperature $T$, we leave off the index 1 which indicates subsystem 1. Then for the canonical density matrix, we have

$$\rho_C = e^{-\beta H} / Z \tag{2.6.42}$$

with the partition function

$$Z = \mathrm{Tr}\, e^{-\beta H} , \tag{2.6.43}$$

where we have used the definition $\beta = \frac{1}{kT}$. We also define the free energy

$$F = -kT \log Z . \tag{2.6.44}$$

For the entropy, we obtain from (2.6.18)

$$S_C = \frac{1}{T} \left( \bar E + kT \log Z \right) . \tag{2.6.45}$$

The average energy is given by

$$\bar E = \langle H \rangle = -\frac{\partial}{\partial \beta} \log Z = kT^2 \frac{\partial}{\partial T} \log Z . \tag{2.6.46}$$

The pressure takes the form:

$$P = -\left\langle \frac{\partial H}{\partial V} \right\rangle = kT \frac{\partial \log Z}{\partial V} . \tag{2.6.47}$$

The derivation from Sect. 2.4.3, which gave $-\left\langle \frac{\partial H}{\partial V} \right\rangle$ for the pressure, is of course still valid for the canonical ensemble. From Eq. (2.6.45), it follows that

$$F = \bar E - T S_C . \tag{2.6.48}$$

Since the canonical density matrix contains $T$ and $V$ as parameters, $F$ is likewise a function of these quantities. Taking the total differential of (2.6.44) by applying (2.6.43), we obtain

$$dF = -k\, dT \log \mathrm{Tr}\, e^{-\beta H} - kT\, \frac{\mathrm{Tr} \left[ \left( \frac{dT}{kT^2} H - \frac{1}{kT} \frac{\partial H}{\partial V} dV \right) e^{-\beta H} \right]}{\mathrm{Tr}\, e^{-\beta H}} = -\frac{1}{T} \left( \bar E + kT \log Z \right) dT + \left\langle \frac{\partial H}{\partial V} \right\rangle dV$$

and, with (2.6.45)–(2.6.47),

$$dF(T, V) = -S_C\, dT - P\, dV . \tag{2.6.49}$$

From Eqs. (2.6.48) and (2.6.49) we find

$$d\bar E = T\, dS_C - P\, dV . \tag{2.6.50a}$$

This relation corresponds to (2.4.25) in the microcanonical ensemble. In the limiting case of macroscopic systems, $\bar E = \tilde E = E$ and $S_C = S_{MC}$.
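The relations (2.6.44), (2.6.46), (2.6.48) and (2.6.49) can be illustrated with a system whose partition function is known in closed form. The sketch below assumes a single spin with the two energy levels −h and +h (the two-level system of Sect. 2.5), units with k = 1, and central finite differences for the derivatives; the function names and the chosen temperature are hypothetical.

```python
import math

def log_z(T, h=1.0):
    """log Z for one spin with energy levels -h and +h (k = 1): Z = 2 cosh(h/T)."""
    return math.log(2.0 * math.cosh(h / T))

def free_energy(T, h=1.0):
    """F = -kT log Z, Eq. (2.6.44)."""
    return -T * log_z(T, h)

def derivative(f, x, dx=1e-5):
    """Central finite-difference approximation of df/dx."""
    return (f(x + dx) - f(x - dx)) / (2.0 * dx)

T = 1.3                                   # assumed temperature, units with k = 1
E = T * T * derivative(log_z, T)          # mean energy, Eq. (2.6.46)
S = -derivative(free_energy, T)           # entropy from dF = -S dT at fixed V, Eq. (2.6.49)
F = free_energy(T)
```

Within the finite-difference accuracy, `F` agrees with `E - T * S` as required by (2.6.48), and `E` agrees with the closed form −h tanh(h/kT) for this model.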

The First Law of thermodynamics expresses the energy balance. The most general change in the energy of a system with a ﬁxed number of particles is composed of the work δW = −P dV performed on the system together with the quantity of heat δQ transferred to it: dE = δQ + δW .

(2.6.50b)

Comparison with (2.6.50a) shows that the heat transferred is given by δQ = T dS

(2.6.50c)

(this is the Second Law for transitions between equilibrium states).

The temperature and the volume occur in the canonical partition function and in the free energy as natural variables. The partition function is calculated for a Hamiltonian with a fixed number of particles.¹² As in the case of the microcanonical ensemble, however, one can here also treat the partition function or the free energy, in which the particle number is a parameter, as a function of $N$. Then the total change in $F$ is given by

$$dF = -S_C\, dT - P\, dV + \left( \frac{\partial F}{\partial N} \right)_{T,V} dN , \tag{2.6.51}$$

and it follows from (2.6.48) that

$$d\bar E = T\, dS_C - P\, dV + \left( \frac{\partial F}{\partial N} \right)_{T,V} dN . \tag{2.6.52}$$

In the thermodynamic limit, (2.6.52) and (2.4.29) must agree, so that we find

$$\left( \frac{\partial F}{\partial N} \right)_{T,V} = \mu . \tag{2.6.53}$$

2.6.6 Additional Properties of the Entropy

2.6.6.1 Additivity of the Entropy

We now consider two subsystems in a common heat bath (Fig. 2.13). Assuming that each of these systems contains a large number of particles, the energy is additive. That is, the interaction energy, which acts only at the interfaces, is much smaller than the energy of each of the individual systems. We wish to show that the entropy is also additive. We begin this task with the two density matrices of the subsystems:

$$\rho_1 = \frac{e^{-\beta H_1}}{Z_1} , \qquad \rho_2 = \frac{e^{-\beta H_2}}{Z_2} . \tag{2.6.54a,b}$$

¹² Exceptions are photons and bosonic quasiparticles such as phonons and rotons in superfluid helium, for which the particle number is not fixed (Chap. 4).

Fig. 2.13. Two subsystems 1 and 2 in one heat bath

The density matrix of the two subsystems together is

$$\rho = \rho_1 \rho_2 , \tag{2.6.54c}$$

where once again $W \ll H_1, H_2$ was employed. From

$$\log \rho = \log \rho_1 + \log \rho_2 \tag{2.6.55}$$

it follows that the total entropy $S$ is given by

$$S = S_1 + S_2 , \tag{2.6.56}$$

the sum of the entropies of the subsystems. Eq. (2.6.56) expresses the fact that the entropy is additive.

∗2.6.6.2 The Statistical Meaning of Heat

Here, we want to add a few supplementary remarks that concern the statistical and physical meaning of heat transfer to a system. We begin with the average energy

$$\bar E = \langle H \rangle = \mathrm{Tr}\, \rho H \tag{2.6.57a}$$

for an arbitrary density matrix and its total variation with a fixed number of particles

$$d\bar E = \mathrm{Tr}\, (d\rho\, H + \rho\, dH) , \tag{2.6.57b}$$

where $d\rho$ is the variation of the density matrix and $dH$ is the variation of the Hamiltonian (see the end of this section). The variation of the entropy

$$S = -k\, \mathrm{Tr}\, \rho \log \rho \tag{2.6.58}$$

is given by

$$dS = -k\, \mathrm{Tr} \left( d\rho \log \rho + \frac{\rho}{\rho}\, d\rho \right) . \tag{2.6.59}$$

Now we have

$$\mathrm{Tr}\, d\rho = 0 , \tag{2.6.60}$$

since for all density matrices, $\mathrm{Tr}\, \rho = \mathrm{Tr}\, (\rho + d\rho) = 1$, from which it follows that

$$dS = -k\, \mathrm{Tr}\, (\log \rho\, d\rho) . \tag{2.6.61}$$

Let the initial density matrix be the canonical one; then making use of (2.6.60), we have

$$dS = \frac{1}{T}\, \mathrm{Tr}\, (H\, d\rho) . \tag{2.6.62}$$

If we insert this into (2.6.57b) and take the volume as the only parameter in $H$, i.e. $dH = \frac{\partial H}{\partial V} dV$, we again obtain (cf. (2.6.50a))

$$d\bar E = T\, dS + \left\langle \frac{\partial H}{\partial V} \right\rangle dV . \tag{2.6.63}$$

We shall now discuss the physical meaning of the general relation (2.6.57b):

1st term: this represents a change in the density matrix, i.e. a change in the occupation probabilities.

2nd term: the change of the Hamiltonian. This means a change in the energy as a result of influences which change the energy eigenvalues of the system.

Let $\rho$ be diagonal in the energy eigenstates; then

$$\bar E = \sum_i p_i E_i , \tag{2.6.64}$$

and the variation of the average energy has the form

$$d\bar E = \sum_i dp_i\, E_i + \sum_i p_i\, dE_i . \tag{2.6.65}$$

Thus, the quantity of heat transferred is given by

$$\delta Q = \sum_i dp_i\, E_i . \tag{2.6.66}$$

A transfer of heat gives rise to a redistribution of the occupation probabilities of the states $|i\rangle$. Heating (heat input) increases the populations of the states at higher energies. Energy change by an input of work (work performed on the system) produces a change in the energy eigenvalues. In this process, the occupation numbers can change only in such a way as to keep the entropy constant. When only the external parameters are varied, work is performed on the system, but no heat is put into it. In this case, although $d\rho$ may exhibit a change, there is no change in the entropy. This can be shown explicitly as follows: From Eq. (2.6.61), we have $dS = -k\, \mathrm{Tr}\, (\log \rho\, d\rho)$. It then follows from the von Neumann Eq. (1.4.8), $\dot\rho = \frac{i}{\hbar} [\rho, H(V(t))]$, which is valid also for time-dependent Hamiltonians, e.g. one containing the volume $V(t)$:

$$\dot S = -k\, \mathrm{Tr}\, (\log \rho\, \dot\rho) = -\frac{ik}{\hbar}\, \mathrm{Tr}\, (\log \rho\, [\rho, H]) = -\frac{ik}{\hbar}\, \mathrm{Tr}\, (H\, [\log \rho, \rho]) = 0 . \tag{2.6.67}$$

The entropy does not change, and no heat is put into the system. An example which demonstrates this situation is the adiabatic reversible expansion of an ideal gas (Sect. 3.5.4.1). There, as a result of the work performed, the volume of the gas changes and with it the Hamilton function; furthermore, the temperature of the gas changes. Together, these effects lead to a change in the distribution function (density matrix), but not in the entropy.
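The splitting (2.6.65) of an energy change into a heat term and a work term can be made concrete with a small numerical experiment. The sketch below assumes a two-level system with levels 0 and ε in canonical equilibrium (k = 1) and compares two neighbouring equilibrium states that differ by small, arbitrarily chosen changes of ε and T; all names and numbers are illustrative assumptions.

```python
import math

def state(eps, T):
    """Levels, canonical probabilities, mean energy and entropy (k = 1) of a two-level system."""
    levels = [0.0, eps]
    weights = [math.exp(-e / T) for e in levels]
    z = sum(weights)
    probs = [w / z for w in weights]
    energy = sum(p * e for p, e in zip(probs, levels))
    entropy = -sum(p * math.log(p) for p in probs)
    return levels, probs, energy, entropy

eps, T = 1.0, 0.9
d_eps, d_T = 1e-5, 1e-5                  # small changes of level spacing and temperature

lv0, p0, E0, S0 = state(eps, T)
lv1, p1, E1, S1 = state(eps + d_eps, T + d_T)

# Heat: change of the occupation probabilities at fixed levels, Eq. (2.6.66).
heat = sum((pb - pa) * e for pa, pb, e in zip(p0, p1, lv0))
# Work: change of the energy eigenvalues at fixed occupation probabilities.
work = sum(p * (eb - ea) for p, ea, eb in zip(p0, lv0, lv1))
```

To first order in the small changes, `E1 - E0` equals `heat + work`, and `heat` equals `T * (S1 - S0)`, in line with (2.6.62) and (2.6.65); the residual differences are of second order.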

2.7 The Grand Canonical Ensemble

2.7.1 Systems with Particle Exchange

After considering systems in the preceding section which can exchange energy with a heat bath, we now wish to allow in addition the exchange of matter between subsystem 1 on the one hand and the heat bath 2 on the other; this will be a consistent generalization of the canonical ensemble (see Fig. 2.14). The overall system is isolated. The total energy, the total particle number and the overall volume are the sums of these quantities for the subsystems:

$$E = E_1 + E_2 , \qquad N = N_1 + N_2 , \qquad V = V_1 + V_2 . \tag{2.7.1}$$

Fig. 2.14. Regarding the grand canonical ensemble: two subsystems 1 and 2, between which energy and particle exchange is permitted.

The probability distribution of the state variables $E_1$, $N_1$, and $V_1$ of subsystem 1 is found in complete analogy to Sect. 2.4.3,

$$\omega(E_1, N_1, V_1) = \frac{\Omega_1(E_1, N_1, V_1)\, \Omega_2(E - E_1, N - N_1, V - V_1)}{\Omega(E, N, V)} . \tag{2.7.2}$$

The attempt to find the maximum of this distribution leads again to equality of the logarithmic derivatives, in this case with respect to $E$, $V$ and $N$. The first two relations were already seen in Eq. (2.4.28) and imply temperature and pressure equalization between the two systems. The third formula can be expressed in terms of the chemical potential which was defined in (2.4.29):

$$\mu = -kT \frac{\partial}{\partial N} \log \Omega(E, N, V) = -T \left( \frac{\partial S}{\partial N} \right)_{E,V} , \tag{2.7.3}$$

and we obtain finally as a condition for the maximum probability the equalization of temperature, pressure, and chemical potential:

$$T_1 = T_2 , \qquad P_1 = P_2 , \qquad \mu_1 = \mu_2 . \tag{2.7.4}$$

2.7.2 The Grand Canonical Density Matrix

Next, we will derive the density matrix for the subsystem. The probability that in system 1 there are $N_1$ particles which are in the state $|n\rangle$ at the energy $E_{1n}(N_1)$ is given by:

$$p(N_1, E_{1n}(N_1), V_1) = \sum_{E - E_{1n}(N_1) \le E_{2m}(N_2) \le E - E_{1n}(N_1) + \Delta} \frac{1}{\Omega(E, N, V)\, \Delta} = \frac{\Omega_2(E - E_{1n}, N - N_1, V_2)}{\Omega(E, N, V)} . \tag{2.7.5}$$

In order to eliminate system 2, we carry out an expansion in the variables $E_{1n}$ and $N_1$ with the condition that subsystem 1 is much smaller than subsystem 2, analogously to the case of the canonical ensemble:

$$p(N_1, E_{1n}(N_1), V_1) = Z_G^{-1}\, e^{-(E_{1n} - \mu N_1)/kT} . \tag{2.7.6}$$

We thus obtain the following expression for the density matrix of the grand canonical ensemble¹³:

$$\rho_G = Z_G^{-1}\, e^{-(H_1 - \mu N_1)/kT} , \tag{2.7.7}$$

where the grand partition function $Z_G$ (or Gibbs distribution) is found from the normalization of the density matrix to be

$$Z_G = \mathrm{Tr}\, e^{-(H_1 - \mu N_1)/kT} = \sum_{N_1} \mathrm{Tr}\, e^{-H_1/kT + \mu N_1/kT} = \sum_{N_1} Z(N_1)\, e^{\mu N_1/kT} . \tag{2.7.8}$$

¹³ See also the derivation in second quantization, Sect. 2.7.5.

The two trace operations Tr in Eq. (2.7.8) refer to different spaces. The trace after the second equals sign refers to a summation over all the diagonal matrix elements for a fixed particle number $N_1$, while the Tr after the first equals sign implies in addition the summation over all particle numbers $N_1 = 0, 1, 2, \dots$. The average value of an operator $A$ in the grand canonical ensemble is

$$\langle A \rangle = \mathrm{Tr}\, (\rho_G A) ,$$

where the trace is here to be understood in the latter sense. In classical statistics, (2.7.7) remains unchanged for the distribution function, while the trace must be replaced according to $\mathrm{Tr} \to \sum_{N_1} \int d\Gamma_{N_1}$, with the $6N_1$-dimensional volume element $d\Gamma_{N_1} = \frac{dq\, dp}{h^{3N_1} N_1!}$.

From (2.7.5), $Z_G^{-1}$ can also be given in terms of

$$Z_G^{-1} = \frac{\Omega_2(E, N, V - V_1)}{\Omega(E, N, V)} = e^{-P V_1/kT} \tag{2.7.9}$$

for $V_1 \ll V$; recall Eqns. (2.4.24) and (2.4.25).

From the density matrix, we find the entropy of the grand canonical ensemble,

$$S_G = -k \langle \log \rho_G \rangle = \frac{1}{T} (\bar E - \mu \bar N) + k \log Z_G . \tag{2.7.10}$$

Since the energy and particle reservoir, subsystem 2, enters only via its temperature and chemical potential, we dispense with the index 1 here and in the following sections. The distribution function for the energy and the particle number is extremely narrow for macroscopic subsystems. The relative fluctuations are inversely proportional to the square root of the average number of particles. Therefore, we have $\bar E = \tilde E$ and $\bar N = \tilde N$ for macroscopic subsystems. The grand canonical entropy, also, may be shown (cf. Sect. 2.6.5.1) in the limit of macroscopic subsystems to be identical with the microcanonical entropy, taken at the most probable values (with fixed volume $V_1$)

$$\tilde E_1 = \bar E_1 , \qquad \tilde N_1 = \bar N_1 \tag{2.7.11}$$

$$S_G = S_{MC}(\tilde E_1, \tilde N_1) . \tag{2.7.12}$$

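2.7.3 Thermodynamic Quantities

In analogy to the free energy of the canonical ensemble, the grand potential is defined by

$$\Phi = -kT \log Z_G , \tag{2.7.13}$$

from which with (2.7.10) we obtain the expression

$$\Phi(T, \mu, V) = \bar E - T S_G - \mu \bar N . \tag{2.7.14}$$

The total differential of the grand potential is given by

$$d\Phi = \left( \frac{\partial \Phi}{\partial T} \right)_{V,\mu} dT + \left( \frac{\partial \Phi}{\partial V} \right)_{T,\mu} dV + \left( \frac{\partial \Phi}{\partial \mu} \right)_{V,T} d\mu . \tag{2.7.15}$$

The partial derivatives follow from (2.7.13) and (2.7.8):

$$\begin{aligned}
\left( \frac{\partial \Phi}{\partial T} \right)_{V,\mu} &= -k \log Z_G - kT \frac{1}{kT^2} \langle H - \mu N \rangle = \frac{1}{T} (\Phi - \bar E + \mu \bar N) = -S_G \\
\left( \frac{\partial \Phi}{\partial V} \right)_{T,\mu} &= \left\langle \frac{\partial H}{\partial V} \right\rangle = -P , \qquad \left( \frac{\partial \Phi}{\partial \mu} \right)_{T,V} = -kT \frac{1}{kT} \langle N \rangle = -\bar N .
\end{aligned} \tag{2.7.16}$$

If we insert (2.7.16) into (2.7.15), we find

$$d\Phi = -S_G\, dT - P\, dV - \bar N\, d\mu . \tag{2.7.17}$$

From this, together with (2.7.14), it follows that

$$d\bar E = T\, dS_G - P\, dV + \mu\, d\bar N ; \tag{2.7.18}$$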

this is again the First Law. As shown above, for macroscopic systems we can use simply E, N and S in (2.7.17) and (2.7.18) instead of the average values of the energy and the particle number and SG ; we shall do this in later chapters. For a constant particle number, (2.7.18) becomes identical with (2.4.25). The physical meaning of the First Law will be discussed in detail in Sect. 3.1. We have considered the ﬂuctuations of physical quantities thus far only in Sect. 2.4.2. Of course, we could also calculate the autocorrelation function for energy and particle number in the grand canonical ensemble. This shows that these quantities are extensive and their relative ﬂuctuations decrease inversely as the square root of the size of the system. We shall postpone these considerations to the chapter on thermodynamics, since there we can relate the correlations to thermodynamic derivatives. We close this section with a tabular summary of the ensembles treated in this chapter. Remark concerning Table 2.1: The thermodynamic functions which are found from the logarithm of the normalization factors are the entropy and the thermodynamic potentials F and Φ (see Chap. 3). The generalization to several diﬀerent types of particles will be carried out in Chap. 5. To this end, one must merely replace N by {Ni } and µ by {µi }.

Table 2.1. The most important ensembles

Ensemble | microcanonical | canonical | grand canonical
Physical situation | isolated | energy exchange | energy and particle exchange
Density matrix | $\frac{1}{\Omega(E,V,N)}\, \delta(H - E)$ | $\frac{1}{Z(T,V,N)}\, e^{-H/kT}$ | $\frac{1}{Z_G(T,V,\mu)}\, e^{-(H - \mu N)/kT}$
Normalization | $\Omega(E, V, N) = \mathrm{Tr}\, \delta(H - E)$ | $Z(T, V, N) = \mathrm{Tr}\, e^{-H/kT}$ | $Z_G(T, V, \mu) = \mathrm{Tr}\, e^{-(H - \mu N)/kT}$
Independent variables | $E, V, N$ | $T, V, N$ | $T, V, \mu$
Thermodynamic functions | $S$ | $F$ | $\Phi$

2.7.4 The Grand Partition Function for the Classical Ideal Gas

As an example, we consider the special case of the classical ideal gas.

2.7.4.1 Partition Function

For the partition function for $N$ particles, we obtain

$$Z_N = \frac{1}{N!\, h^{3N}} \int_V dq_1 \dots dq_{3N} \int dp_1 \dots dp_{3N}\, e^{-\beta \sum_i p_i^2/2m} = \frac{V^N}{N!} \left( \frac{2m\pi}{\beta h^2} \right)^{\frac{3N}{2}} = \frac{1}{N!} \left( \frac{V}{\lambda^3} \right)^N \tag{2.7.19}$$

with the thermal wavelength

$$\lambda = h / \sqrt{2\pi m kT} . \tag{2.7.20}$$

Its name results from the fact that a particle of mass $m$ and momentum $h/\lambda$ will have a kinetic energy of the order of $kT$.

2.7.4.2 The Grand Partition Function

Inserting (2.7.19) into the grand partition function (2.7.8), we find

$$Z_G = \sum_{N=0}^{\infty} e^{\beta \mu N} Z_N = \sum_{N=0}^{\infty} \frac{1}{N!}\, e^{\beta \mu N} \left( \frac{V}{\lambda^3} \right)^N = e^{zV/\lambda^3} , \tag{2.7.21}$$

where the fugacity

$$z = e^{\beta \mu} \tag{2.7.22}$$

has been defined.

2.7.4.3 Thermodynamic Quantities

From (2.7.13) and (2.7.21), the grand potential takes on the simple form

$$\Phi \equiv -kT \log Z_G = -kT zV/\lambda^3 . \tag{2.7.23}$$

From the partial derivatives, we can compute the thermodynamic relations.¹⁴

Particle number

$$N = -\left( \frac{\partial \Phi}{\partial \mu} \right)_{T,V} = zV/\lambda^3 \tag{2.7.24}$$

Pressure

$$PV = -V \left( \frac{\partial \Phi}{\partial V} \right)_{T,\mu} = -\Phi = NkT \tag{2.7.25}$$

This is again the thermal equation of state of the ideal gas, as found in Sect. 2.5. For the chemical potential, we find from (2.7.22), (2.7.24), and (2.7.23)

$$\mu = -kT \log \frac{V/N}{\lambda^3} = -kT \log \frac{kT}{P \lambda^3} = kT \log P - kT \log \frac{kT}{\lambda^3} . \tag{2.7.26}$$

For the entropy, we find

$$S = -\left( \frac{\partial \Phi}{\partial T} \right)_{V,\mu} = \frac{5}{2}\, kz \frac{V}{\lambda^3} + kT \left( -\frac{\mu}{kT^2} \right) z \frac{V}{\lambda^3} = kN \left( \frac{5}{2} + \log \frac{V/N}{\lambda^3} \right) , \tag{2.7.27}$$

and for the internal energy, from (2.7.14), we obtain

$$E = \Phi + TS + \mu N = NkT \left( -1 + \frac{5}{2} \right) = \frac{3}{2} NkT . \tag{2.7.28}$$

¹⁴ For the reasons mentioned at the end of the preceding section, we replace $\bar E$ and $\bar N$ in (2.7.16) and (2.7.17) by $E$ and $N$.
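The chain from the grand potential (2.7.23) to the relations (2.7.24) through (2.7.28) can be retraced numerically. The following sketch is an illustration under assumptions: reduced units with k = h = m = 1, arbitrarily chosen values of T, μ and V, and central finite differences in place of the analytic partial derivatives; the function names are hypothetical.

```python
import math

def grand_potential(T, mu, V, m=1.0, h=1.0, k=1.0):
    """Phi = -kT z V / lambda^3 for the classical ideal gas, Eq. (2.7.23)."""
    lam = h / math.sqrt(2.0 * math.pi * m * k * T)    # thermal wavelength, Eq. (2.7.20)
    z = math.exp(mu / (k * T))                         # fugacity, Eq. (2.7.22)
    return -k * T * z * V / lam ** 3

def partial(f, args, index, step=1e-6):
    """Central finite-difference partial derivative of f with respect to args[index]."""
    up = list(args)
    dn = list(args)
    up[index] += step
    dn[index] -= step
    return (f(*up) - f(*dn)) / (2.0 * step)

T, mu, V = 1.5, -2.0, 10.0                 # assumed state point in reduced units (k = 1)
phi = grand_potential(T, mu, V)
S = -partial(grand_potential, (T, mu, V), 0)   # entropy,         Eq. (2.7.27)
N = -partial(grand_potential, (T, mu, V), 1)   # particle number, Eq. (2.7.24)
P = -partial(grand_potential, (T, mu, V), 2)   # pressure,        Eq. (2.7.25)
E = phi + T * S + mu * N                        # internal energy, Eq. (2.7.28)
```

Up to finite-difference error, `P * V` reproduces `N * T`, `S` reproduces `N * (5/2 - mu / T)`, and `E` reproduces `1.5 * N * T`, matching the closed-form results above.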

∗2.7.5 The Grand Canonical Density Matrix in Second Quantization

The derivation of $\rho_G$ can be carried out most concisely in the formalism of the second quantization. In addition to the Hamiltonian $H$, expressed in terms of the field operators $\psi(\mathbf{x})$ (see Eq. (1.5.6d) in QM II¹⁵), we require the particle-number operator, Eq. (1.5.10)¹⁵

$$\hat N = \int_V d^3x\, \psi^\dagger(\mathbf{x}) \psi(\mathbf{x}) . \tag{2.7.29}$$

The microcanonical density matrix for fixed volume $V$ is

$$\rho_{MC} = \frac{1}{\Omega(E, N, V)}\, \delta(H - E)\, \delta(\hat N - N) . \tag{2.7.30}$$

Corresponding to the division of the overall volume into two subvolumes, $V = V_1 + V_2$, we have $H = H_1 + H_2$ and $\hat N = \hat N_1 + \hat N_2$ with $\hat N_i = \int_{V_i} d^3x\, \psi^\dagger(\mathbf{x}) \psi(\mathbf{x})$, $i = 1, 2$. We find from (2.7.30) the probability that the energy and the particle number in subvolume 1 assume the values $E_1$ and $N_1$:

$$\begin{aligned}
\omega(E_1, V_1, N_1) &= \mathrm{Tr} \left[ \frac{1}{\Omega(E, N, V)}\, \delta(H - E)\, \delta(\hat N - N)\, \delta(H_1 - E_1)\, \delta(\hat N_1 - N_1) \right] \\
&= \mathrm{Tr} \left[ \frac{1}{\Omega(E, N, V)}\, \delta\big(H_2 - (E - E_1)\big)\, \delta\big(\hat N_2 - (N - N_1)\big)\, \delta(H_1 - E_1)\, \delta(\hat N_1 - N_1) \right] \\
&= \frac{\Omega_1(E_1, N_1, V_1)\, \Omega_2(E - E_1, N - N_1, V - V_1)}{\Omega(E, N, V)} .
\end{aligned} \tag{2.7.31}$$

The (grand canonical) density matrix for subsystem 1 is found by taking the trace of the density matrix of the overall system over subsystem 2, with respect to both the energy and the particle number:

$$\rho_G = \mathrm{Tr}_2\, \frac{1}{\Omega(E, N, V)}\, \delta(H - E)\, \delta(\hat N - N) = \frac{\Omega_2(E - H_1, N - \hat N_1, V - V_1)}{\Omega(E, N, V)} . \tag{2.7.32}$$

Expansion of the logarithm of $\rho_G$ in terms of $H_1$ and $\hat N_1$ leads to

$$\rho_G = Z_G^{-1}\, e^{-(H_1 - \mu \hat N_1)/kT} , \qquad Z_G = \mathrm{Tr}\, e^{-(H_1 - \mu \hat N_1)/kT} , \tag{2.7.33}$$

consistent with Equations (2.7.7) and (2.7.8), which were obtained by considering the probabilities.

¹⁵ F. Schwabl, Advanced Quantum Mechanics (QM II), 3rd ed., Springer Berlin, Heidelberg, New York 2005. This text will be cited in the rest of this book as QM II.

Problems for Chapter 2

2.1 Calculate $\Omega(E)$ for a spin system which is described by the Hamiltonian

$$H = \mu_B H \sum_{i=1}^{N} S_i ,$$

where $S_i$ can take on the values $S_i = \pm 1/2$.

$$\Omega(E)\, \Delta = \sum_{E \le E_n \le E + \Delta} 1 .$$

Use a combinatorial method, rather than 2.2.3.2.

2.2 For a one-dimensional classical ideal gas, calculate $\langle p_1^2 \rangle$ and $\langle p_1^4 \rangle$.

Formula: $$\int_0^\pi \sin^m x\, \cos^n x\, dx = \frac{\Gamma\left( \frac{m+1}{2} \right) \Gamma\left( \frac{n+1}{2} \right)}{\Gamma\left( \frac{n+m+2}{2} \right)} .$$

2.3 A particle is moving in one dimension; the distance between the walls of the container is changed by a piston at $L$. Compute the change in the phase-space volume $\bar\Omega = 2Lp$ ($p$ = momentum).
(a) For a slow, continuous motion of the piston.
(b) For a rapid motion of the piston between two reflections of the particle.

2.4 Assume that the entropy $S$ depends on the volume $\bar\Omega(E)$ inside the energy shell: $S = f(\bar\Omega)$. Show that from the additivity of $S$ and the multiplicative character of $\bar\Omega$, it follows that $S = \text{const} \times \log \bar\Omega$.

2.5 (a) For a classical ideal gas which is enclosed within a volume $V$, calculate the free energy and the entropy, starting with the canonical ensemble.
(b) Compare them with the results of Sect. 2.2.

2.6 Using the assertion that the entropy $S = -k\, \mathrm{Tr}\, (\rho \log \rho)$ is maximal, show that with the conditions $\mathrm{Tr}\, \rho = 1$ and $\mathrm{Tr}\, \rho H = \bar E$, the canonical density matrix results for $\rho$.
Hint: This is a variational problem with constraints, which can be solved using the method of Lagrange multipliers.

2.7 Show that for the energy distribution in the classical canonical ensemble

$$\omega(E_1) = \int d\Gamma_1\, \rho_C\, \delta(H_1 - E_1) = \Omega_1(E_1)\, \frac{\Omega_2(E - E_1)}{\Omega_{1,2}(E)} \approx \Omega_1(E_1)\, \frac{\Omega_2(\tilde E_2)}{\Omega_{1,2}(E)}\, e^{\tilde E_1/kT}\, e^{-E_1/kT} . \tag{2.7.34}$$

2.8 Consider a system of N classical, non-coupled one-dimensional harmonic oscillators and calculate for this system the entropy and the temperature, starting from the microcanonical ensemble.

2.9 Consider again the harmonic oscillators from problem 2.8 and calculate for this system the average value of the energy and the entropy, starting with the canonical ensemble.

2.10 In analogy to the preceding problems, consider $N$ quantum-mechanical non-coupled one-dimensional harmonic oscillators and compute the average value of the energy $\bar E$ and the entropy, beginning with the canonical ensemble. Also investigate $\lim_{\hbar \to 0} \bar E$, $\lim_{\hbar \to 0} S$ and $\lim_{T \to 0} S$, and compare the limiting values you obtain with the results of problem 2.9.

2.11 For the Maxwell distribution, find
(a) the average value of the $n$th power of the velocity $\langle v^n \rangle$,
(b) $\langle v \rangle$,
(c) $\langle (v - \langle v \rangle)^2 \rangle$,
(d) $\left( \frac{m}{2} \right)^2 \langle (v^2 - \langle v^2 \rangle)^2 \rangle$, and
(e) the most probable value of the velocity.

2.12 Determine the number of collisions of a molecule of an ideal gas with the wall of its container per unit area and unit time, when
(a) the angle between the normal to the wall and the direction of the velocity lies between $\Theta$ and $\Theta + d\Theta$;
(b) the magnitude of the velocity lies between $v$ and $v + dv$.

2.13 Calculate the pressure of a Maxwellian gas with the velocity distribution

$$f(\mathbf{v}) = n \left( \frac{m\beta}{2\pi} \right)^{\frac{3}{2}} e^{-\frac{\beta m v^2}{2}} .$$

Suggestions: the pressure is produced by reflections of the particles from the walls of the container; it is therefore the average force on an area $A$ of wall which acts over a time interval $\tau$.

$$P = \frac{1}{\tau A} \int_0^\tau dt\, F_x(t) .$$

If a particle is reflected from the wall with the velocity $\mathbf{v}$, its contribution is given from Newton's 2nd axiom in terms of $\int_0^\tau dt\, F_x(t)$ by the momentum transferred per collision, $2mv_x$. Then $P = \frac{1}{\tau A} \sum 2mv_x$, whereby the sum extends over all particles which reach the area $A$ within the time $\tau$.
Result: $P = nkT$.

2.14 A simple model for thermalization: Calculate the average kinetic energy of a particle of mass $m_1$ with the velocity $v_1$ due to contact with an ideal gas consisting of particles of mass $m_2$. As a simplification, assume that only elastic and linear collisions occur. The effect on the ideal gas can be neglected. It is helpful to use the abbreviations $M = m_1 + m_2$ and $m = m_1 - m_2$. How many collisions are required until, for m1 = m2, a temperature equal to the $(1 - e^{-1})$-fold temperature of the ideal gas is attained?


2. Equilibrium Ensembles

2.15 Using the canonical ensemble, calculate the average value of the particle-number density

\[ n(\mathbf{x}) = \sum_{i=1}^{N} \delta(\mathbf{x} - \mathbf{x}_i) \]

for an ideal gas which is contained in an infinitely high cylinder of cross-sectional area A in the gravitational field of the Earth. The potential energy of a particle in the gravitational field is mgh. Also calculate (a) the internal energy of this system, (b) the pressure at the height (altitude) h, using the definition

\[ P = \int_h^{\infty} \langle n(\mathbf{x}) \rangle\, m g\, dz , \]

(c) the average distance $\langle z \rangle$ of an oxygen molecule and a helium atom from the surface of the Earth at a temperature of 0 °C, and (d) the mean square deviation Δz for the particles in 2.15c. At this point, we mention the three different derivations of the barometric pressure formula, each emphasizing different physical aspects, in R. Becker, Theory of Heat, 2nd ed., Sec. 27, Springer, Berlin 1967.

2.16 The potential energy of N non-interacting localized dipoles depends on their orientations relative to an applied magnetic field H:

\[ H = -\mu H_z \sum_{i=1}^{N} \cos \vartheta_i . \]

Calculate the partition function and show that the magnetization along the z-direction takes the form

\[ M_z = \Big\langle \sum_{i=1}^{N} \mu \cos \vartheta_i \Big\rangle = N \mu\, L(\beta \mu H_z) ; \qquad L(x) = \text{Langevin function} . \]

Plot the Langevin function. How large is the magnetization at high temperatures? Show that at high temperatures, the Curie law for the magnetic susceptibility holds:

\[ \chi = \lim_{H_z \to 0} \left( \frac{\partial M_z}{\partial H_z} \right) \sim \mathrm{const}/T . \]

2.17 Demonstrate the equipartition theorem and the virial theorem making use of the microcanonical distribution.

2.18 In the extreme relativistic case, the Hamilton function for N particles in three-dimensional space is $H = \sum_i |\mathbf{p}_i| c$. Compute the expectation value of H with the aid of the virial theorem.

2.19 Starting with the canonical ensemble of classical statistics, calculate the equation of state and the internal energy of a gas composed of N indistinguishable particles with the kinetic energy $\varepsilon(\mathbf{p}) = |\mathbf{p}| \cdot c$.


2.20 Show that for an ideal gas, the probability of finding a subsystem in the grand canonical ensemble with N particles is given by the Poisson distribution:

\[ p_N = \frac{1}{N!}\, e^{-\bar{N}}\, \bar{N}^{N} , \]

where $\bar{N}$ is the average value of N in the ideal gas. Suggestions: Start from $p_N = e^{\beta(\Phi + N\mu)} Z_N$. Express Φ, µ, and $Z_N$ in terms of $\bar{N}$.

2.21 (a) Calculate the grand partition function for a mixture of two ideal gases (2 chemical potentials!). (b) Show that

\[ P V = (N_1 + N_2)\, kT \qquad \text{and} \qquad E = \frac{3}{2} (N_1 + N_2)\, kT \]

are valid, where $N_1$, $N_2$ and E are the average particle numbers and the average energy.

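2.22 (a) Express $\bar{E}$ by taking an appropriate derivative of the grand partition function. (b) Express $(\Delta E)^2$ in terms of a thermodynamic derivative of $\bar{E}$.

2.23 Calculate the density matrix in the x-representation for a free particle within a three-dimensional cube of edge length L:

\[ \rho(\mathbf{x}, \mathbf{x}') = c \sum_n e^{-\beta E_n} \langle \mathbf{x} | n \rangle \langle n | \mathbf{x}' \rangle , \]

where c is a normalization constant. Assume that L is so large that one can go to the limit of a continuous momentum spectrum:

\[ \sum_n \longrightarrow \frac{L^3}{(2\pi\hbar)^3} \int d^3p \; ; \qquad \langle \mathbf{x} | n \rangle \longrightarrow \langle \mathbf{x} | \mathbf{p} \rangle = \frac{1}{L^{3/2}}\, e^{i \mathbf{p} \cdot \mathbf{x}/\hbar} . \]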

2.24 Calculate the canonical density matrix for a one-dimensional harmonic oscillator, $H = -(\hbar^2/2m)(d^2/dx^2) + \frac{m\omega^2}{2} x^2$, in the x-representation at low temperatures:

\[ \rho(x, x') = c \sum_n e^{-\beta E_n} \langle x | n \rangle \langle n | x' \rangle , \]

where c is the normalization constant and

\[ \langle x | n \rangle = \left( \pi^{1/2}\, 2^n\, n!\, x_0 \right)^{-1/2} e^{-(x/x_0)^2/2}\, H_n\!\left( \frac{x}{x_0} \right) ; \qquad x_0 = \sqrt{\frac{\hbar}{\omega m}} . \]

The Hermite polynomials are defined in problem 2.27. Suggestion: Consider which state makes the largest contribution.

2.25 Calculate the time average of $q^2$ for the example of problem 1.7, as well as its average value in the microcanonical ensemble.


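2.26 Show that:

\[ \int dq_1 \ldots dq_d\, f(q^2, \mathbf{q} \cdot \mathbf{k}) = (2\pi)^{-1} K_{d-1} \int_0^{\infty} dq\, q^{d-1} \int_0^{\pi} d\Theta\, (\sin\Theta)^{d-2} f(q^2, q k \cos\Theta) , \tag{2.7.35} \]

where $\mathbf{k} \in \mathbb{R}^d$ is a fixed vector, $q = |\mathbf{q}|$, $k = |\mathbf{k}|$, and $K_d = 2^{-d+1} \pi^{-d/2} \left( \Gamma\!\left(\tfrac{d}{2}\right) \right)^{-1}$.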

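2.27 Compute the matrix elements of the canonical density matrix for a one-dimensional harmonic oscillator in the coordinate representation,

\[ \rho_{x,x'} = \langle x | \rho | x' \rangle = \frac{1}{Z} \langle x | e^{-\beta H} | x' \rangle . \]

Hint: Use the completeness relation for the eigenfunctions of the harmonic oscillator and use the fact that the Hermite polynomials have the integral representation

\[ H_n(\xi) = (-1)^n e^{\xi^2} \left( \frac{d}{d\xi} \right)^{n} e^{-\xi^2} = \frac{e^{\xi^2}}{\sqrt{\pi}} \int_{-\infty}^{\infty} (-2iu)^n\, e^{-u^2 + 2i\xi u}\, du . \]

Alternatively, the first representation for $H_n(x)$ and the identity from the next problem can be used. Result:

\[ \rho_{x,x'} = \frac{1}{Z} \left[ \frac{m\omega}{2\pi\hbar \sinh \beta\hbar\omega} \right]^{1/2} \exp\left\{ -\frac{m\omega}{4\hbar} \left[ (x + x')^2 \tanh \frac{\beta\hbar\omega}{2} + (x - x')^2 \coth \frac{\beta\hbar\omega}{2} \right] \right\} . \tag{2.7.36} \]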

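2.28 Prove the following identity:

\[ e^{\frac{\partial}{\partial x} \Pi \frac{\partial}{\partial x}}\, e^{-x \Delta x} = \frac{1}{\sqrt{\mathrm{Det}(1 + 4\Delta\Pi)}}\, e^{-x \frac{\Delta}{1 + 4\Delta\Pi} x} . \]

Here, Π and ∆ are two commuting symmetric matrices, e.g. $\frac{\partial}{\partial x} \Pi \frac{\partial}{\partial x} \equiv \frac{\partial}{\partial x_i} \Pi_{ik} \frac{\partial}{\partial x_k}$.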

3. Thermodynamics

3.1 Thermodynamic Potentials and the Laws of Equilibrium Thermodynamics

3.1.1 Definitions

Thermodynamics treats the macroscopic properties of macroscopic systems. The fact that macroscopic systems can be completely characterized by a small number of variables, such as their energy E, volume V, and particle number N, and that all other quantities, e.g. the entropy, are therefore functions of only these variables, has far-reaching consequences. In this section, we consider equilibrium states and transitions from one equilibrium state to another neighboring equilibrium state. In the preceding sections, we have already determined the change in the entropy due to changes in E, V and N, whereby the system goes from one equilibrium state E, V, N into a new equilibrium state E + dE, V + dV, N + dN. Building upon the differential entropy (2.4.29), we will investigate in the following the First Law and the significance of the quantities which occur in it. Beginning with the internal energy, we will then define the most important thermodynamic potentials and discuss their properties. We assume the system we are considering to consist of one single type of particles of particle number N. We start with its entropy, which is a function of E, V, and N.

Entropy: S = S(E, V, N)

In (2.4.29), we found the differential entropy to be

\[ dS = \frac{1}{T}\, dE + \frac{P}{T}\, dV - \frac{\mu}{T}\, dN . \tag{3.1.1} \]

From this, we can read off the partial derivatives:

\[ \left( \frac{\partial S}{\partial E} \right)_{V,N} = \frac{1}{T} , \qquad \left( \frac{\partial S}{\partial V} \right)_{E,N} = \frac{P}{T} , \qquad \left( \frac{\partial S}{\partial N} \right)_{E,V} = -\frac{\mu}{T} , \tag{3.1.2} \]

which naturally agree with the definitions from equilibrium statistics. We can now imagine the equation S = S(E, V, N) to have been solved for E and thereby obtain the energy E, which in thermodynamics is usually termed the internal energy, as a function of S, V, and N.

Internal Energy: E = E(S, V, N)

From (3.1.1), we obtain the differential relation

\[ dE = T\, dS - P\, dV + \mu\, dN . \tag{3.1.3} \]

We are now in a position to interpret the individual terms in (3.1.3), keeping in mind all the various possibilities for putting energy into a system. This can be done by performing work, by adding matter (i.e. by increasing the number of particles), and through contact with other bodies, whereby heat is put into the system. The total change in the energy is thus composed of the following contributions:

\[ dE = \underbrace{\delta Q}_{\text{heat input}} + \underbrace{\delta W}_{\text{mechanical work}} + \underbrace{\delta E_N}_{\text{energy increase through addition of matter}} . \tag{3.1.3$'$} \]

The second term in (3.1.3) is the work performed on the system,

\[ \delta W = -P\, dV , \tag{3.1.4a} \]

while the third term gives the change in the energy on increasing the particle number,

\[ \delta E_N = \mu\, dN . \tag{3.1.4b} \]

The chemical potential µ has the physical meaning of the energy increase on adding one particle to the system (at constant entropy and volume). The first term must therefore be the energy change due to heat input δQ, i.e.

\[ \delta Q = T\, dS . \tag{3.1.5} \]

Relation (3.1.3), the law of conservation of energy in thermodynamics, is called the First Law of Thermodynamics. It expresses the change in energy on going from one equilibrium state to another, nearby state an inﬁnitesimal distance away. Equation (3.1.5) is the Second Law for such transitions. We will formulate the Second Law in a more general way later. In this connection, we will also clarify the question of under what conditions these relations of equilibrium thermodynamics can be applied to real thermodynamic processes which proceed at ﬁnite rates, such as for example the operation of steam engines or of internal combustion engines.


Remark: It is important to keep the following in mind: δW and δQ do not represent changes of state variables. There are no state functions (functions of E, V and N) identifiable with W and Q. An object cannot be characterized by its 'heat or work content', but instead by its internal energy. Heat (∼ energy transfer into an object through contact with other bodies) and work are ways of transferring energy from one body to another.

It is often expedient to consider other quantities, with the dimensions of energy, in addition to the internal energy itself. As the first of these, we define the free energy:

Free Energy (Helmholtz Free Energy): F = F(T, V, N)

The free energy is defined by

\[ F = E - TS \quad \big( = -kT \log Z(T, V, N) \big) ; \tag{3.1.6} \]

in parentheses, we have given its connection with the canonical partition function (Chap. 2). From (3.1.3), the differential free energy is found to be

\[ dF = -S\, dT - P\, dV + \mu\, dN \tag{3.1.7} \]

with the partial derivatives

\[ \left( \frac{\partial F}{\partial T} \right)_{V,N} = -S , \qquad \left( \frac{\partial F}{\partial V} \right)_{T,N} = -P , \qquad \left( \frac{\partial F}{\partial N} \right)_{T,V} = \mu . \tag{3.1.8} \]

We can see from (3.1.8) that the internal energy can be written in terms of F in the form

\[ E = F - T \left( \frac{\partial F}{\partial T} \right)_{V,N} = -T^2 \left( \frac{\partial}{\partial T} \frac{F}{T} \right)_{V,N} . \tag{3.1.9} \]

From (3.1.7), it can be seen that the free energy is that portion of the energy which can be set free as work in an isothermal process; here we assume that the particle number N remains constant. In an isothermal volume change, the change of the free energy is given by $(dF)_{T,N} = -P\, dV = \delta W$, while $(dE)_{T,N} \neq \delta W$, since one would have to transfer heat into or out of the system in order to hold the temperature constant.

Enthalpy: H = H(S, P, N)

The enthalpy is defined as

\[ H = E + PV . \tag{3.1.10} \]

From (3.1.3), it follows that

\[ dH = T\, dS + V\, dP + \mu\, dN , \tag{3.1.11} \]

and from this, its partial derivatives can be obtained:

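\[ \left( \frac{\partial H}{\partial S} \right)_{P,N} = T , \qquad \left( \frac{\partial H}{\partial P} \right)_{S,N} = V , \qquad \left( \frac{\partial H}{\partial N} \right)_{S,P} = \mu . \tag{3.1.12} \]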

For isobaric processes, (dH)P,N = T dS = δQ = dE + P dV , thus the change in the enthalpy is equal to the change in the internal energy plus the energy change in the device supplying constant pressure (see Fig. 3.1). The weight FG including the piston of area A holds the pressure constant at P = FG /A. The change in the enthalpy is the sum of the change in the internal energy and the change in the potential energy of the weight. For a process at constant pressure, the heat δQ supplied to the system equals the increase in the system’s enthalpy.


Fig. 3.1. The change in the enthalpy in isobaric processes; the weight FG produces the constant pressure P = FG /A, where A is the area of the piston.

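Free Enthalpy (Gibbs Free Energy): G = G(T, P, N)

The Gibbs free energy is defined as

\[ G = E - TS + PV . \tag{3.1.13} \]

Its differential follows from (3.1.3):

\[ dG = -S\, dT + V\, dP + \mu\, dN . \tag{3.1.14} \]

From Eq. (3.1.14), we can immediately read off

\[ \left( \frac{\partial G}{\partial T} \right)_{P,N} = -S , \qquad \left( \frac{\partial G}{\partial P} \right)_{T,N} = V , \qquad \left( \frac{\partial G}{\partial N} \right)_{T,P} = \mu . \tag{3.1.15} \]

The Grand Potential: Φ = Φ(T, V, µ)

The grand potential is defined as

\[ \Phi = E - TS - \mu N \quad \big( = -kT \log Z_G(T, V, \mu) \big) ; \tag{3.1.16} \]

in parentheses we give the connection to the grand partition function (Chap. 2). The differential expressions are

\[ d\Phi = -S\, dT - P\, dV - N\, d\mu , \tag{3.1.17} \]

\[ \left( \frac{\partial \Phi}{\partial T} \right)_{V,\mu} = -S , \qquad \left( \frac{\partial \Phi}{\partial V} \right)_{T,\mu} = -P , \qquad \left( \frac{\partial \Phi}{\partial \mu} \right)_{T,V} = -N . \tag{3.1.18} \]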


3.1.2 The Legendre Transformation

The transition from E to the thermodynamic potentials defined in (3.1.6), (3.1.10), (3.1.13), and (3.1.16) was carried out by means of so-called Legendre transformations, whose general structure will now be considered. We begin with a function Y which depends on the variables $x_1, x_2, \ldots$,

\[ Y = Y(x_1, x_2, \ldots) . \tag{3.1.19} \]

The partial derivatives of Y with respect to the $x_i$ are

\[ a_i(x_1, x_2, \ldots) = \left( \frac{\partial Y}{\partial x_i} \right)_{\{x_j,\, j \neq i\}} . \tag{3.1.20a} \]

Our goal is now to replace the independent variable $x_1$ by the partial derivative $a_1 = \partial Y/\partial x_1$ as independent variable, i.e. for example to change from the independent variable S to T. This has a definite practical application, since the temperature is directly and readily measurable, while the entropy is not. The total differential of Y is given by

\[ dY = a_1\, dx_1 + a_2\, dx_2 + \ldots \tag{3.1.20b} \]

From the rearrangement $dY = d(a_1 x_1) - x_1\, da_1 + a_2\, dx_2 + \ldots$, it follows that

\[ d(Y - a_1 x_1) = -x_1\, da_1 + a_2\, dx_2 + \ldots \; . \tag{3.1.21} \]

It is then expedient to introduce the function

\[ Y_1 = Y - a_1 x_1 , \tag{3.1.22} \]

and to treat it as a function of the variables $a_1, x_2, \ldots$ (natural variables).¹ Thus, for example, the natural variables of the (Helmholtz) free energy are T, V, and N. The differential of $Y_1(a_1, x_2, \ldots)$ has the following form in terms of these independent variables:

\[ dY_1 = -x_1\, da_1 + a_2\, dx_2 + \ldots \tag{3.1.21$'$a} \]

and its partial derivatives are

\[ \left( \frac{\partial Y_1}{\partial a_1} \right)_{x_2, \ldots} = -x_1 , \qquad \left( \frac{\partial Y_1}{\partial x_2} \right)_{a_1, \ldots} = a_2 , \quad \ldots \tag{3.1.21$'$b} \]

In this manner, one can obtain 8 thermodynamic potentials corresponding to the three pairs of variables. Table 3.1 collects the most important of these, i.e. the ones already introduced above.

¹ We make an additional remark here about the geometric significance of the Legendre transformation, referring to the case of a single variable: a curve can be represented either as a series of points $Y = Y(x_1)$, or as the envelope of the family of its tangent lines. In the latter representation, the intercepts of the tangent lines on the ordinate as a function of their slopes $a_1$ are required. This geometric meaning of the Legendre transformation is the basis of the construction of G(T, P) from F(T, V) shown in Fig. 3.33. [If one simply eliminated $x_1$ in $Y = Y(x_1)$ in favor of $a_1$, then one would indeed obtain Y as a function of $a_1$, but it would no longer be possible to reconstruct $Y(x_1)$.]
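The steps (3.1.19) to (3.1.22) can be tried out numerically for a single variable. The following is a minimal sketch, not part of the original text: the test function Y(x1) = x1⁴/4 and all function names are hypothetical choices made only for illustration. It checks that the derivative of Y1 with respect to a1 equals −x1, as in (3.1.21′b), and that Y can be recovered from Y1.

```python
def Y(x):
    """Hypothetical convex test function Y(x1) = x1**4 / 4."""
    return 0.25 * x**4


def x_from_slope(a):
    """Invert a1 = dY/dx1 = x1**3 (valid for a1 > 0)."""
    return a ** (1.0 / 3.0)


def Y1(a):
    """Legendre transform (3.1.22): Y1 = Y - a1 * x1, written as a function of a1."""
    x = x_from_slope(a)
    return Y(x) - a * x


def central_difference(f, t, h=1e-6):
    """Numerical derivative df/dt from a symmetric difference quotient."""
    return (f(t + h) - f(t - h)) / (2.0 * h)


a1 = 8.0
x1 = x_from_slope(a1)
dY1_da1 = central_difference(Y1, a1)    # should be close to -x1
Y_rebuilt = Y1(a1) + a1 * (-dY1_da1)    # inverse transformation back to Y(x1)

print(f"x1 = {x1:.6f}, dY1/da1 = {dY1_da1:.6f}")
print(f"Y(x1) = {Y(x1):.6f}, rebuilt from Y1: {Y_rebuilt:.6f}")
```

The second printed line illustrates why Y1 must be treated as a function of its natural variable a1: only then does it retain enough information to reconstruct Y(x1).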

Table 3.1. Energy, entropy, and thermodynamic potentials

State function | Independent variables | Differential
Energy $E$ | $S, V, \{N_j\}$ | $dE = T\,dS - P\,dV + \sum_j \mu_j\, dN_j$
Entropy $S$ | $E, V, \{N_j\}$ | $dS = \frac{1}{T}\,dE + \frac{P}{T}\,dV - \sum_j \frac{\mu_j}{T}\, dN_j$
Free Energy $F = E - TS$ | $T, V, \{N_j\}$ | $dF = -S\,dT - P\,dV + \sum_j \mu_j\, dN_j$
Enthalpy $H = E + PV$ | $S, P, \{N_j\}$ | $dH = T\,dS + V\,dP + \sum_j \mu_j\, dN_j$
Gibbs Free Energy $G = E - TS + PV$ | $T, P, \{N_j\}$ | $dG = -S\,dT + V\,dP + \sum_j \mu_j\, dN_j$
Grand Potential $\Phi = E - TS - \sum_j \mu_j N_j$ | $T, V, \{\mu_j\}$ | $d\Phi = -S\,dT - P\,dV - \sum_j N_j\, d\mu_j$

This table contains the generalization to systems with several components (see Sect. 3.9). $N_j$ and $\mu_j$ are the particle number and the chemical potential of the j-th component. The previous formulas are found as a special case when the index j and the sum $\sum_j$ are omitted.

F, H, G and Φ are called thermodynamic potentials, since taking their derivatives with respect to the natural independent variables leads to the conjugate variables, analogously to the derivation of the components of force from the potential in mechanics. For the entropy, this notation is clearly less useful, since entropy does not have the dimensions of an energy. E, F, H, G and Φ are related to each other through Legendre transformations. The natural variables are also termed canonical variables. In a system consisting of only one chemical substance with a ﬁxed number of particles, the state is completely characterized by specifying two quantities, e.g. T and V or V and P . All the other thermodynamic quantities can be calculated from the thermal and the caloric equations of state. If the state is characterized by T and V , then the pressure is given by the (thermal) equation of state P = P (T, V ) . (The explicit form for a particular substance is found from statistical mechanics.) If we plot P against T and V in a three-dimensional graph, we obtain the surface of the equation of state (or P V T surface); see Fig. 2.7 and below in Sect. 3.8.


3.1.3 The Gibbs–Duhem Relation in Homogeneous Systems

In this section, we will concentrate on the important case of homogeneous thermodynamic systems.² Consider a system of this kind with the energy E, the volume V, and the particle number N. Now we imagine a second system which is completely similar in its properties but is simply larger by a factor α. Its energy, volume, and particle number are then αE, αV, and αN. Owing to the additivity of the entropy, it is given by

\[ S(\alpha E, \alpha V, \alpha N) = \alpha\, S(E, V, N) . \tag{3.1.23} \]

As a result, the entropy S is a homogeneous function of first order in E, V and N. Correspondingly, E is a homogeneous function of first order in S, V and N. There are two types of state variables: E, V, N, S, F, H, G, and Φ are called extensive, since they are proportional to α when the system is enlarged as described above. T, P, and µ are intensive, since they are independent of α; e.g. we find

\[ T^{-1} = \frac{\partial S}{\partial E} = \frac{\partial\, \alpha S}{\partial\, \alpha E} \sim \alpha^0 , \]

and this independence follows in a similar manner from the definitions of the other intensive variables, also. We wish to investigate the consequences of the homogeneity of S [Eq. (3.1.23)]. To this end, we differentiate (3.1.23) with respect to α and then set α = 1:

\[ \left( \frac{\partial S}{\partial\, \alpha E}\, E + \frac{\partial S}{\partial\, \alpha V}\, V + \frac{\partial S}{\partial\, \alpha N}\, N \right)_{\alpha = 1} = S . \]

From this, we find using (3.1.2) that $-S + \frac{1}{T} E + \frac{P}{T} V - \frac{\mu}{T} N = 0$, that is

\[ E = TS - PV + \mu N . \tag{3.1.24} \]

This is the Gibbs–Duhem relation. Together with $dE = T\,dS - P\,dV + \mu\,dN$, we derive from Eq. (3.1.24)

\[ S\, dT - V\, dP + N\, d\mu = 0 , \tag{3.1.24$'$} \]

the differential Gibbs–Duhem relation. It states that in a homogeneous system, T, P and µ cannot be varied independently, and it gives the relationship between the variations of these intensive quantities.³ The following expressions can be derived from the Gibbs–Duhem relation:

\[ G(T, P, N) = \mu(T, P)\, N \tag{3.1.25} \]

and

\[ \Phi(T, V, \mu) = -P(T, \mu)\, V . \tag{3.1.26} \]

² Homogeneous systems have the same specific properties in all spatial regions; they may also consist of several types of particles. Examples of inhomogeneous systems are those in a position-dependent potential and systems consisting of several phases which are in equilibrium, although in this case the individual phases can still be homogeneous.

³ The generalization to systems with several components is given in Sect. 3.9, Eq. (3.9.7).

Justification: from the definition (3.1.13), it follows immediately using (3.1.24) that G = µN, and from (3.1.15) we find $\mu = \left( \frac{\partial G}{\partial N} \right)_{T,P} = \mu + N \left( \frac{\partial \mu}{\partial N} \right)_{T,P}$; it follows that µ must be independent of N. We have thus demonstrated (3.1.25). Similarly, it follows from (3.1.16) that Φ = −PV, and due to $-P = \left( \frac{\partial \Phi}{\partial V} \right)_{T,\mu}$, P must be independent of V.

Further conclusions following from homogeneity (in the canonical ensemble with independent variables T, V, and N) can be obtained starting with

\[ P(T, V, N) = P(T, \alpha V, \alpha N) \qquad \text{and} \qquad \mu(T, V, N) = \mu(T, \alpha V, \alpha N) \tag{3.1.27a,b} \]

again by taking derivatives with respect to α around the point α = 1:

\[ \left( \frac{\partial P}{\partial V} \right)_{T,N} V + \left( \frac{\partial P}{\partial N} \right)_{T,V} N = 0 \qquad \text{and} \qquad \left( \frac{\partial \mu}{\partial V} \right)_{T,N} V + \left( \frac{\partial \mu}{\partial N} \right)_{T,V} N = 0 . \tag{3.1.28a,b} \]

These two relations merely state that for intensive quantities, a volume increase is equivalent to a decrease in the number of particles.
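The homogeneity (3.1.23) and the resulting relation (3.1.24) can be checked numerically. The sketch below is an illustration only: it assumes an ideal-gas entropy of the form S = Nk [log((V/N)(E/N)^{3/2}) + s0], with all physical constants absorbed into the arbitrary number s0 and with k = 1; the function names are hypothetical. T, P, and µ are obtained from the partial derivatives (3.1.2) by finite differences.

```python
import math

K_B = 1.0   # Boltzmann constant in arbitrary units (assumption)
S_0 = 2.5   # additive constant absorbing all physical constants (arbitrary)


def entropy(E, V, N):
    """Assumed ideal-gas form S = N k [log((V/N) (E/N)**1.5) + S_0]."""
    return N * K_B * (math.log((V / N) * (E / N) ** 1.5) + S_0)


def partial(func, args, index, rel_step=1e-6):
    """Central-difference partial derivative of func with respect to args[index]."""
    h = rel_step * args[index]
    upper = list(args)
    lower = list(args)
    upper[index] += h
    lower[index] -= h
    return (func(*upper) - func(*lower)) / (2.0 * h)


E, V, N = 3.0, 2.0, 1.5
alpha = 3.0
state = (E, V, N)

S = entropy(E, V, N)
S_scaled = entropy(alpha * E, alpha * V, alpha * N)   # should equal alpha * S

T = 1.0 / partial(entropy, state, 0)      # 1/T  = (dS/dE)_{V,N}
P = T * partial(entropy, state, 1)        # P/T  = (dS/dV)_{E,N}
mu = -T * partial(entropy, state, 2)      # -mu/T = (dS/dN)_{E,V}

euler_sum = T * S - P * V + mu * N        # should reproduce E, Eq. (3.1.24)

print(f"S(alpha E, alpha V, alpha N) = {S_scaled:.6f}, alpha * S = {alpha * S:.6f}")
print(f"T S - P V + mu N = {euler_sum:.6f}, E = {E:.6f}")
```

The value chosen for S_0 does not affect the check, since a constant multiple of N is itself homogeneous of first order.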

3.2 Derivatives of Thermodynamic Quantities

3.2.1 Definitions

In this section, we will define the most important thermodynamic derivatives. In the following definitions, the particle number is always held constant. The heat capacity is defined as

\[ C = \frac{\delta Q}{dT} = T\, \frac{dS}{dT} . \tag{3.2.1} \]

It gives the quantity of heat which is required to raise the temperature of a body by 1 K. We still have to specify which thermodynamic variables are held constant during this heat transfer. The most important cases are that the volume or the pressure is held constant. If the heat is transferred at constant volume, the heat capacity at constant volume is relevant:

\[ C_V = T \left( \frac{\partial S}{\partial T} \right)_{V,N} = \left( \frac{\partial E}{\partial T} \right)_{V,N} . \tag{3.2.2a} \]

In rearranging $(\partial S/\partial T)_{V,N}$, we have used Eq. (3.1.1). If the heat transfer takes place under constant pressure, then the heat capacity at constant pressure from (3.2.1) must be used:

\[ C_P = T \left( \frac{\partial S}{\partial T} \right)_{P,N} = \left( \frac{\partial H}{\partial T} \right)_{P,N} . \tag{3.2.2b} \]

For the rearrangement of the definition, we employed (3.1.11). If we divide the heat capacity by the mass of the substance or body, we obtain the specific heat, in general denoted as c, or $c_V$ at constant volume or $c_P$ at constant pressure. The specific heat is measured in units of J kg⁻¹ K⁻¹. The specific heat may also be referred to 1 g and quoted in the (non-SI) units cal g⁻¹ K⁻¹. The molar heat capacity (heat capacity per mole) gives the heat capacity of one mole of the substance. It is obtained from the specific heat referred to 1 g, multiplied by the molecular weight of the substance.

Remark: We will later show in general using Eq. (3.2.24) that the specific heat at constant pressure is larger than that at constant volume. The physical origin of this difference can be readily seen by writing the First Law for constant N in the form $\delta Q = dE + P\,dV$ and setting $dE = \left( \frac{\partial E}{\partial T} \right)_V dT + \left( \frac{\partial E}{\partial V} \right)_T dV = C_V\, dT + \left( \frac{\partial E}{\partial V} \right)_T dV$, that is

\[ \delta Q = C_V\, dT + \left[ P + \left( \frac{\partial E}{\partial V} \right)_T \right] dV . \]

In addition to the quantity of heat $C_V\, dT$ necessary for warming at constant volume, when V is increased, more heat is consumed by the work against the pressure, $P\, dV$, and by the change in the internal energy, $(\partial E/\partial V)_T\, dV$. For $C_P = \left( \frac{\delta Q}{dT} \right)_P$, it then follows from the last relation that

\[ C_P = C_V + \left[ P + \left( \frac{\partial E}{\partial V} \right)_T \right] \left( \frac{\partial V}{\partial T} \right)_P . \]

Further important thermodynamic derivatives are the compressibility, the coefficient of thermal expansion, and the thermal pressure coefficient. The compressibility is defined in general by

\[ \kappa = -\frac{1}{V} \frac{dV}{dP} . \]

It is a measure of the relative volume decrease on increasing the pressure. For compression at a constant temperature, the isothermal compressibility, defined by

\[ \kappa_T = -\frac{1}{V} \left( \frac{\partial V}{\partial P} \right)_{T,N} \tag{3.2.3a} \]

is the relevant quantity. For (reversible) processes in which no heat is transferred, i.e. when the entropy remains constant, the adiabatic (isentropic) compressibility

\[ \kappa_S = -\frac{1}{V} \left( \frac{\partial V}{\partial P} \right)_{S,N} \tag{3.2.3b} \]

must be introduced. The coefficient of thermal expansion is defined as

\[ \alpha = \frac{1}{V} \left( \frac{\partial V}{\partial T} \right)_{P,N} . \tag{3.2.4} \]

The definition of the thermal pressure coefficient is given by

\[ \beta = \frac{1}{P} \left( \frac{\partial P}{\partial T} \right)_{V,N} . \tag{3.2.5} \]

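Quantities such as C, κ, and α are examples of so-called susceptibilities. They indicate how strongly an extensive quantity varies on changing (increasing) an intensive quantity.

3.2.2 Integrability and the Maxwell Relations

3.2.2.1 The Maxwell Relations

The Maxwell relations are expressions relating the thermodynamic derivatives; they follow from the integrability conditions. From the total differential of the function $Y = Y(x_1, x_2)$,

\[ dY = a_1\, dx_1 + a_2\, dx_2 , \qquad a_1 = \left( \frac{\partial Y}{\partial x_1} \right)_{x_2} , \quad a_2 = \left( \frac{\partial Y}{\partial x_2} \right)_{x_1} , \tag{3.2.6} \]

we find as a result of the commutativity of the order of the derivatives, $\left( \frac{\partial a_1}{\partial x_2} \right)_{x_1} = \frac{\partial^2 Y}{\partial x_2\, \partial x_1} = \frac{\partial^2 Y}{\partial x_1\, \partial x_2} = \left( \frac{\partial a_2}{\partial x_1} \right)_{x_2}$, the following integrability condition:

\[ \left( \frac{\partial a_1}{\partial x_2} \right)_{x_1} = \left( \frac{\partial a_2}{\partial x_1} \right)_{x_2} . \tag{3.2.7} \]

All together, there are 12 different Maxwell relations. The relations for fixed N are:

\[ E: \; \left( \frac{\partial T}{\partial V} \right)_S = -\left( \frac{\partial P}{\partial S} \right)_V , \qquad F: \; \left( \frac{\partial S}{\partial V} \right)_T = \left( \frac{\partial P}{\partial T} \right)_V \tag{3.2.8a,b} \]

\[ H: \; \left( \frac{\partial T}{\partial P} \right)_S = \left( \frac{\partial V}{\partial S} \right)_P \quad \text{or} \quad \left( \frac{\partial S}{\partial V} \right)_P = \left( \frac{\partial P}{\partial T} \right)_S \tag{3.2.9} \]

\[ G: \; \left( \frac{\partial S}{\partial P} \right)_T = -\left( \frac{\partial V}{\partial T} \right)_P = -V \alpha . \tag{3.2.10} \]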

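Here, we have labeled the Maxwell relations with the quantity from whose differential the relation is derived. There are also relations containing N and µ; of these, we shall require the following in this book:

\[ F: \; \left( \frac{\partial \mu}{\partial V} \right)_{T,N} = -\left( \frac{\partial P}{\partial N} \right)_{T,V} . \tag{3.2.11} \]

Applying this relation to homogeneous systems, we find from (3.1.28a) and (3.1.28b):

\[ \left( \frac{\partial \mu}{\partial N} \right)_{T,V} = -\frac{V}{N} \left( \frac{\partial \mu}{\partial V} \right)_{T,N} = \frac{V}{N} \left( \frac{\partial P}{\partial N} \right)_{T,V} = -\frac{V^2}{N^2} \left( \frac{\partial P}{\partial V} \right)_{T,N} = \frac{V}{N^2} \frac{1}{\kappa_T} . \tag{3.2.12} \]

∗3.2.2.2 Integrability Conditions, Exact and Inexact Differentials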

It may be helpful at this point to show the connection between the integrability conditions and the results of vector analysis as they apply to classical mechanics. We consider a vector field F(x), which is defined within the simply-connected region G (this field could for example be a force field). Then the following statements are equivalent:

(I) $\mathbf{F}(\mathbf{x}) = -\nabla V(\mathbf{x})$ with $V(\mathbf{x}) = -\int_{\mathbf{x}_0}^{\mathbf{x}} d\mathbf{x}' \cdot \mathbf{F}(\mathbf{x}')$, where $\mathbf{x}_0$ is an arbitrary fixed point of origin and the line integral is to be taken along an arbitrary path from $\mathbf{x}_0$ to $\mathbf{x}$. This means that F(x) can be derived from a potential.
(II) $\mathrm{curl}\, \mathbf{F} = 0$ at each point in G.
(III) $\oint d\mathbf{x} \cdot \mathbf{F}(\mathbf{x}) = 0$ along each closed path in G.
(IV) $\int_{\mathbf{x}_1}^{\mathbf{x}_2} d\mathbf{x} \cdot \mathbf{F}(\mathbf{x})$ is independent of the path.

Let us return to thermodynamics. We consider a system characterized by two independent thermodynamic variables x and y and a quantity whose differential variation is given by

\[ dY = A(x, y)\, dx + B(x, y)\, dy . \tag{3.2.13} \]

In the notation of mechanics, F = (A(x, y), B(x, y), 0). The existence of a state variable Y, i.e. a state function Y(x, y) (Statement (I′)), is equivalent to each of the three other statements (II′, III′, and IV′).

(I′) A state function Y(x, y) exists, with $Y(x, y) = Y(x_0, y_0) + \int_{(x_0, y_0)}^{(x, y)} \left( dx'\, A(x', y') + dy'\, B(x', y') \right)$.
(II′) $\left( \frac{\partial B}{\partial x} \right)_y = \left( \frac{\partial A}{\partial y} \right)_x$.
(III′) $\oint \left( dx\, A(x, y) + dy\, B(x, y) \right) = 0$.
(IV′) $\int_{P_0}^{P_1} \left( dx\, A(x, y) + dy\, B(x, y) \right)$ is independent of the path.


Fig. 3.2. Illustrating the path integrals III and IV

The differential (3.2.13) is called an exact differential (or a perfect differential) when the coefficients A and B fulfill the integrability condition (II′).

3.2.2.3 The Non-integrability of δQ and δW

We can now prove that δQ and δW are not integrable. We first consider δW and imagine the independent thermodynamic variables to be V and T. Then the relation (3.1.4a) becomes

\[ \delta W = -P\, dV + 0 \cdot dT . \tag{3.2.14} \]

The derivative of the pressure with respect to the temperature at constant volume is nonzero, $\left( \frac{\partial P}{\partial T} \right)_V \neq 0$, while of course the derivative of zero with respect to V gives zero. That is, the integrability condition is not fulfilled. Analogously, we write (3.1.5) in the form

\[ \delta Q = T\, dS + 0 \cdot dV . \tag{3.2.15} \]

Again, we have

\[ \left( \frac{\partial T}{\partial V} \right)_S = -\frac{\left( \frac{\partial S}{\partial V} \right)_T}{\left( \frac{\partial S}{\partial T} \right)_V} = -\frac{\left( \frac{\partial P}{\partial T} \right)_V}{\left( \frac{\partial S}{\partial T} \right)_V} \neq 0 , \]

i.e. the integrability condition is not fulfilled. Therefore, there are no state functions W(V, T, N) and Q(V, T, N) whose differentials are equal to δW and δQ. This is the reason for the different notation used in the differential signs. The expressions relating the heat transferred to the system and the work performed on it to the state variables exist only in differential form. One can, of course, compute the integral $\int_1 \delta Q = \int_1 T\, dS$ along a given path (e.g. 1 in Fig. 3.2), and similarly for δW, but the values of these integrals depend not only on their starting and end points, but also on the details of the path which connects those points.
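The path dependence of the integrals of δW and δQ can be made concrete with a small numerical experiment. The following sketch is an illustration under stated assumptions, not part of the original text: it uses a monatomic classical ideal gas with P = NkT/V and E = (3/2)NkT, sets Nk = 1, and chooses two hypothetical paths in the (T, V) plane between the same end points. It also accumulates δQ/T to show that this combination does not depend on the path.

```python
import math

NK = 1.0         # N*k in arbitrary units (assumption)
C_V = 1.5 * NK   # heat capacity at constant volume of the monatomic ideal gas


def pressure(T, V):
    """Ideal-gas equation of state P = N k T / V."""
    return NK * T / V


def integrate(path, steps=20000):
    """Sum δW = -P dV, δQ = dE - δW and δQ/T along straight segments in the (T, V) plane."""
    work = heat = entropy_change = 0.0
    for (T_a, V_a), (T_b, V_b) in zip(path, path[1:]):
        dT = (T_b - T_a) / steps
        dV = (V_b - V_a) / steps
        for i in range(steps):
            T = T_a + (i + 0.5) * dT      # midpoint of the small step
            V = V_a + (i + 0.5) * dV
            dW = -pressure(T, V) * dV     # work performed on the system, Eq. (3.1.4a)
            dQ = C_V * dT - dW            # First Law with dE = C_V dT
            work += dW
            heat += dQ
            entropy_change += dQ / T      # δQ/T = dS, Eq. (3.1.5)
    return work, heat, entropy_change


start, end = (1.0, 1.0), (2.0, 2.0)       # (T, V) pairs
path_1 = [start, (1.0, 2.0), end]         # expand at T = 1, then heat at V = 2
path_2 = [start, (2.0, 1.0), end]         # heat at V = 1, then expand at T = 2

W1, Q1, S1 = integrate(path_1)
W2, Q2, S2 = integrate(path_2)

print(f"path 1: W = {W1:.6f}, Q = {Q1:.6f}, integral of dQ/T = {S1:.6f}")
print(f"path 2: W = {W2:.6f}, Q = {Q2:.6f}, integral of dQ/T = {S2:.6f}")
```

In this example W and Q differ between the two paths, while W + Q (the change in E) and the integral of δQ/T agree, in line with the statement that E and S are state functions whereas W and Q are not.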

Remark: In the case that a differential does not fulfill the integrability condition,

\[ \delta Y = A(x, y)\, dx + B(x, y)\, dy , \]

but can be converted into an exact differential through multiplication by a factor g(x, y), then g(x, y) is termed an integrating factor. Thus, $\frac{1}{T}$ is an integrating factor for δQ. In statistical mechanics, it is found quite naturally that the entropy is a state function, i.e. dS is an exact differential. In the historical development of thermodynamics, it was a decisive and nontrivial discovery that multiplication of δQ by $\frac{1}{T}$ yields an exact differential.

3.2.3 Jacobians

It is often necessary to transform from one pair of thermodynamic variables to a different pair. For the necessary recalculation of the thermodynamic derivatives, it is expedient to use Jacobians. In the following, we consider functions of two variables: f(u, v) and g(u, v). We define the Jacobian determinant:

\[ \frac{\partial (f, g)}{\partial (u, v)} = \begin{vmatrix} \left( \frac{\partial f}{\partial u} \right)_v & \left( \frac{\partial f}{\partial v} \right)_u \\ \left( \frac{\partial g}{\partial u} \right)_v & \left( \frac{\partial g}{\partial v} \right)_u \end{vmatrix} = \left( \frac{\partial f}{\partial u} \right)_v \left( \frac{\partial g}{\partial v} \right)_u - \left( \frac{\partial f}{\partial v} \right)_u \left( \frac{\partial g}{\partial u} \right)_v . \tag{3.2.16} \]

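This Jacobian fulfills a series of important relations. Let u = u(x, y) and v = v(x, y) be functions of x and y; then the following chain rule can be proved in an elementary fashion:

\[ \frac{\partial (f, g)}{\partial (x, y)} = \frac{\partial (f, g)}{\partial (u, v)}\, \frac{\partial (u, v)}{\partial (x, y)} . \tag{3.2.17} \]

This relation is important for the changes of variables which are frequently needed in thermodynamics. Setting g = v, the definition (3.2.16) is simplified to

\[ \frac{\partial (f, v)}{\partial (u, v)} = \left( \frac{\partial f}{\partial u} \right)_v . \tag{3.2.18} \]

Since a determinant changes its sign on interchanging two columns, we have

\[ \frac{\partial (f, g)}{\partial (v, u)} = -\frac{\partial (f, g)}{\partial (u, v)} . \tag{3.2.19} \]

If we apply the chain rule (3.2.17) for x = f and y = g, we find:

\[ \frac{\partial (f, g)}{\partial (u, v)}\, \frac{\partial (u, v)}{\partial (f, g)} = 1 . \tag{3.2.20} \]

Setting g = v in (3.2.20), we obtain with (3.2.18)

\[ \left( \frac{\partial f}{\partial u} \right)_v = \frac{1}{\left( \frac{\partial u}{\partial f} \right)_v} . \tag{3.2.20$'$} \]

Finally, from (3.2.18) we have

\[ \left( \frac{\partial f}{\partial u} \right)_v = \frac{\partial (f, v)}{\partial (u, v)} = \frac{\partial (f, v)}{\partial (f, u)}\, \frac{\partial (f, u)}{\partial (u, v)} = -\frac{\left( \frac{\partial f}{\partial v} \right)_u}{\left( \frac{\partial u}{\partial v} \right)_f} . \tag{3.2.21} \]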
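The chain rule (3.2.17) and relation (3.2.21) lend themselves to a quick numerical check with finite differences. The sketch below is only an illustration: the functions f, g, u, v are arbitrary hypothetical test functions, and the second check uses the ideal-gas equation of state with Nk = 1 as an assumed example for (3.2.21) with f = P, u = T, v = V.

```python
def partial(func, a, b, index, h=1e-5):
    """Central-difference partial derivative of func(a, b) with respect to argument `index`."""
    if index == 0:
        return (func(a + h, b) - func(a - h, b)) / (2.0 * h)
    return (func(a, b + h) - func(a, b - h)) / (2.0 * h)


def jacobian(f, g, a, b):
    """Jacobian determinant d(f, g)/d(a, b) as defined in (3.2.16)."""
    return (partial(f, a, b, 0) * partial(g, a, b, 1)
            - partial(f, a, b, 1) * partial(g, a, b, 0))


# Hypothetical test functions, chosen only for illustration.
def f(u, v):
    return u * u * v + v


def g(u, v):
    return u + v**3


def u(x, y):
    return x * y


def v(x, y):
    return x + y * y


x0, y0 = 1.2, 0.7
u0, v0 = u(x0, y0), v(x0, y0)

# Chain rule (3.2.17): d(f,g)/d(x,y) = d(f,g)/d(u,v) * d(u,v)/d(x,y)
lhs = jacobian(lambda x, y: f(u(x, y), v(x, y)),
               lambda x, y: g(u(x, y), v(x, y)), x0, y0)
rhs = jacobian(f, g, u0, v0) * jacobian(u, v, x0, y0)


# Relation (3.2.21) for the ideal gas with N k = 1 (assumption): f = P, u = T, v = V.
def P_of(T, V):
    return T / V


def T_of(V, P):
    return P * V


T0, V0 = 2.0, 4.0
P0 = P_of(T0, V0)
dP_dT_at_V = partial(P_of, T0, V0, 0)
triple = -partial(P_of, T0, V0, 1) / partial(T_of, V0, P0, 0)

print(f"chain rule: lhs = {lhs:.8f}, rhs = {rhs:.8f}")
print(f"(dP/dT)_V = {dP_dT_at_V:.8f}, -(dP/dV)_T / (dT/dV)_P = {triple:.8f}")
```

Both pairs of numbers should agree up to the discretization error of the difference quotients.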

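Using this relation, one can thus transform a derivative at constant v into derivatives at constant u and f. The relations given here can also be applied to functions of more than two variables, provided the additional variables are held constant.

3.2.4 Examples

(i) We first derive some useful relations between the thermodynamic derivatives. Using Eqns. (3.2.21), (3.2.3a), and (3.2.4), we obtain

\[ \left( \frac{\partial P}{\partial T} \right)_V = -\frac{\left( \frac{\partial V}{\partial T} \right)_P}{\left( \frac{\partial V}{\partial P} \right)_T} = \frac{\alpha}{\kappa_T} . \tag{3.2.22} \]

Thus, the thermal pressure coefficient $\beta = \frac{1}{P} \left( \frac{\partial P}{\partial T} \right)_V$ [Eq. (3.2.5)] is related to the coefficient of thermal expansion α and the isothermal compressibility $\kappa_T$. In problem 3.4, it is shown that

\[ \frac{C_P}{C_V} = \frac{\kappa_T}{\kappa_S} \tag{3.2.23} \]

[cf. (3.2.3a,b)]. Furthermore, we see that

\[ C_V = T\, \frac{\partial (S, V)}{\partial (T, V)} = T\, \frac{\partial (S, V)}{\partial (T, P)}\, \frac{\partial (T, P)}{\partial (T, V)} = T \left[ \left( \frac{\partial S}{\partial T} \right)_P \left( \frac{\partial V}{\partial P} \right)_T - \left( \frac{\partial S}{\partial P} \right)_T \left( \frac{\partial V}{\partial T} \right)_P \right] \left( \frac{\partial P}{\partial V} \right)_T \]

\[ \phantom{C_V} = C_P - T\, \frac{\left( \frac{\partial S}{\partial P} \right)_T \left( \frac{\partial V}{\partial T} \right)_P}{\left( \frac{\partial V}{\partial P} \right)_T} = C_P + T\, \frac{\left( \frac{\partial V}{\partial T} \right)_P^2}{\left( \frac{\partial V}{\partial P} \right)_T} . \]

Here, the Maxwell relation (3.2.10) was used. Thus we find for the heat capacities

\[ C_P - C_V = \frac{T V \alpha^2}{\kappa_T} . \tag{3.2.24} \]

With $\kappa_T C_P - \kappa_T C_V = T V \alpha^2$ and $\kappa_T C_V = \kappa_S C_P$, it follows that the compressibilities obey the relation

\[ \kappa_T - \kappa_S = \frac{T V \alpha^2}{C_P} . \tag{3.2.25} \]

It follows from (3.2.24) that the two heat capacities can become equal only when the coefficient of expansion α vanishes or $\kappa_T$ becomes very large. The former occurs in the case of water at 4 °C.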


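(ii) We now evaluate the thermodynamic derivatives for the classical ideal gas, based on Sect. 2.7. For the enthalpy H = E + PV, it follows from Eqns. (2.7.25) and (2.7.28) that

\[ H = \frac{5}{2} N k T . \tag{3.2.26} \]

Then, for the heat capacities, we find

\[ C_V = \left( \frac{\partial E}{\partial T} \right)_V = \frac{3}{2} N k , \qquad C_P = \left( \frac{\partial H}{\partial T} \right)_P = \frac{5}{2} N k ; \tag{3.2.27} \]

and for the compressibilities,

\[ \kappa_T = -\frac{1}{V} \left( \frac{\partial V}{\partial P} \right)_T = \frac{1}{P} , \qquad \kappa_S = \kappa_T\, \frac{C_V}{C_P} = \frac{3}{5P} ; \tag{3.2.28} \]

finally, for the thermal expansion coefficient and the thermal pressure coefficient, we find

\[ \alpha = \frac{1}{V} \left( \frac{\partial V}{\partial T} \right)_P = \frac{1}{T} \qquad \text{and} \qquad \beta = \frac{1}{P} \left( \frac{\partial P}{\partial T} \right)_V = \frac{1}{P} \frac{\alpha}{\kappa_T} = \frac{1}{T} . \tag{3.2.29a,b} \]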
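The ideal-gas results (3.2.27) to (3.2.29) can be used to check the general relations (3.2.23) to (3.2.25) numerically. The sketch below is an illustration under stated assumptions: Nk = 1 in arbitrary units, E = (3/2)NkT and PV = NkT as in the text, and, as an additional standard assumption not derived here, the adiabat of the monatomic ideal gas in the form V ∝ P^(−3/5), which is used to obtain κ_S independently.

```python
NK = 1.0   # N*k in arbitrary units (assumption)


def volume(T, P):
    """Thermal equation of state of the ideal gas, V = N k T / P."""
    return NK * T / P


def energy(T):
    """Caloric equation of state, E = (3/2) N k T."""
    return 1.5 * NK * T


def enthalpy(T, P):
    return energy(T) + P * volume(T, P)


def diff(func, x, h=1e-4):
    """Central-difference derivative of a function of one variable."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


T0, P0 = 300.0, 2.0
V0 = volume(T0, P0)

alpha = diff(lambda T: volume(T, P0), T0) / V0       # Eq. (3.2.4)
kappa_T = -diff(lambda P: volume(T0, P), P0) / V0    # Eq. (3.2.3a)
# Adiabat through (P0, V0), assumed form V = V0 (P0 / P)**(3/5):
kappa_S = -diff(lambda P: V0 * (P0 / P) ** 0.6, P0) / V0
C_V = diff(energy, T0)                               # Eq. (3.2.2a)
C_P = diff(lambda T: enthalpy(T, P0), T0)            # Eq. (3.2.2b)

print(f"C_P - C_V = {C_P - C_V:.6f}, T V alpha^2 / kappa_T = {T0 * V0 * alpha**2 / kappa_T:.6f}")
print(f"kappa_T - kappa_S = {kappa_T - kappa_S:.6f}, T V alpha^2 / C_P = {T0 * V0 * alpha**2 / C_P:.6f}")
print(f"C_P / C_V = {C_P / C_V:.6f}, kappa_T / kappa_S = {kappa_T / kappa_S:.6f}")
```

Each printed pair should agree to within the accuracy of the difference quotients.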

3.3 Fluctuations and Thermodynamic Inequalities

This section is concerned with fluctuations of the energy and the particle number, and belongs contextually to the preceding chapter. We are only now treating these phenomena because the final results are expressed in terms of thermodynamic derivatives, whose definitions and properties are only now at our disposal.

3.3.1 Fluctuations

1. We consider a canonical ensemble, characterized by the temperature $T$, the volume $V$, the fixed particle number $N$, and the density matrix

$$\rho = \frac{e^{-\beta H}}{Z} , \qquad Z = \mathrm{Tr}\, e^{-\beta H} .$$

The average value of the energy [Eq. (2.6.37)] is given by

$$\bar{E} = \frac{1}{Z}\,\mathrm{Tr}\, e^{-\beta H} H = \frac{1}{Z}\,\frac{\partial Z}{\partial(-\beta)} . \tag{3.3.1}$$

Taking the temperature derivative of (3.3.1),

$$\left(\frac{\partial \bar{E}}{\partial T}\right)_V = \frac{1}{kT^2}\,\frac{\partial \bar{E}}{\partial(-\beta)} = \frac{1}{kT^2}\left(\langle H^2\rangle - \langle H\rangle^2\right) = \frac{1}{kT^2}\,(\Delta E)^2 ,$$


we obtain after substitution of (3.2.2a) the following relation between the specific heat at constant volume and the mean square deviation of the internal energy:

$$C_V = \frac{1}{kT^2}\,(\Delta E)^2 . \tag{3.3.2}$$
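Relation (3.3.2) can be illustrated with any explicit spectrum. The sketch below assumes a single quantum harmonic oscillator with levels E_n = n (energy unit ħω, zero-point energy dropped, k = 1) and a truncation at 400 levels; these choices and the function name `canonical_moments` are illustrative assumptions. It compares the heat capacity obtained from the energy fluctuations with the one obtained from a numerical temperature derivative of the mean energy.

```python
import math

def canonical_moments(T, levels):
    # Boltzmann weights for the given energy levels (units with k = 1)
    weights = [math.exp(-E / T) for E in levels]
    Z = sum(weights)
    mean_E = sum(E * w for E, w in zip(levels, weights)) / Z
    mean_E2 = sum(E * E * w for E, w in zip(levels, weights)) / Z
    return mean_E, mean_E2 - mean_E**2       # mean energy and (Delta E)^2

levels = [float(n) for n in range(400)]      # truncated oscillator spectrum E_n = n
T = 1.5

mean_E, var_E = canonical_moments(T, levels)
C_V_fluct = var_E / T**2                     # right-hand side of (3.3.2)

h = 1e-4
C_V_deriv = (canonical_moments(T + h, levels)[0]
             - canonical_moments(T - h, levels)[0]) / (2 * h)   # (dE/dT)_V

print(C_V_fluct, C_V_deriv)
```

The two numbers should coincide to within the accuracy of the central difference.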

2. Next, we start with the grand canonical ensemble, characterized by $T$, $V$, $\mu$, and the density matrix

$$\rho_G = Z_G^{-1}\, e^{-\beta(H-\mu N)} , \qquad Z_G = \mathrm{Tr}\, e^{-\beta(H-\mu N)} .$$

The average particle number is given by

$$\bar{N} = \mathrm{Tr}\,\rho_G N = kT\, Z_G^{-1}\,\frac{\partial Z_G}{\partial \mu} . \tag{3.3.3}$$

Its derivative with respect to the chemical potential is

$$\left(\frac{\partial \bar{N}}{\partial \mu}\right)_{T,V} = \beta\left(\langle N^2\rangle - \bar{N}^2\right) = \beta\,(\Delta N)^2 .$$

If we replace the left side by (3.2.12), we obtain the following relation between the isothermal compressibility and the mean square deviation of the particle number:

$$\kappa_T = -\frac{1}{V}\left(\frac{\partial V}{\partial P}\right)_{T,N} = \frac{V}{N^2}\left(\frac{\partial N}{\partial \mu}\right)_{T,V} = \frac{V}{N^2}\,\beta\,(\Delta N)^2 . \tag{3.3.4}$$

Eqns. (3.3.2) and (3.3.4) are fundamental examples of relations between susceptibilities (on the left-hand sides) and fluctuations, so-called fluctuation-response theorems.

3.3.2 Inequalities

From the relations derived in 3.3.1, we derive (as a result of the positivity of the fluctuations) the following inequalities:

$$\kappa_T \ge 0 , \tag{3.3.5}$$

$$C_P \ge C_V \ge 0 . \tag{3.3.6}$$

In (3.3.6), we have used the fact that according to (3.2.24) and (3.3.5), CP is larger than CV . On decreasing the volume, the pressure increases. On increasing the energy, the temperature increases. The validity of these inequalities is a precondition for the stability of matter. If, for example, (3.3.5) were not valid, compression of the system would decrease its pressure; it would thus be further compressed and would ﬁnally collapse.
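As a consistency check of (3.3.4) and (3.3.5), one can use the classical ideal gas. The sketch below relies on an assumption that is not derived in this section: for a classical ideal gas in the grand canonical ensemble the particle number is Poisson distributed, so that (ΔN)² equals the mean particle number. The numerical values and the function name `poisson_moments` are illustrative only.

```python
import math

def poisson_moments(mean_N, n_max):
    # p(N) = exp(-mean) mean^N / N!, built up recursively to avoid large factorials
    p = math.exp(-mean_N)
    m1 = m2 = 0.0
    for n in range(n_max + 1):
        m1 += n * p
        m2 += n * n * p
        p *= mean_N / (n + 1)
    return m1, m2 - m1**2          # mean and variance of N

kT = 1.0          # units with kT = 1
V = 50.0
mean_N, var_N = poisson_moments(20.0, 400)

kappa_T = V / (kT * mean_N**2) * var_N      # Eq. (3.3.4) with beta = 1/kT
P = mean_N * kT / V                          # ideal-gas pressure
print(kappa_T, 1.0 / P)
```

Under this assumption the compressibility from the fluctuation formula reproduces the ideal-gas value 1/P of (3.2.28), and it is manifestly positive.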


3.4 Absolute Temperature and Empirical Temperatures

The absolute temperature was defined in (2.4.4) as

$$T^{-1} = \left(\frac{\partial S(E,V,N)}{\partial E}\right)_{V,N} .$$

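Experimentally, one uses a temperature $\vartheta$, which is for example given by the length of a rod or a column of mercury, or the volume or the pressure of a gas thermometer. We assume that the empirical temperature $\vartheta$ increases monotonically with $T$, i.e. that $\vartheta$ also increases when we put heat into the system. We now seek a method of determining the absolute temperature from $\vartheta$, that is, we seek the relation $T = T(\vartheta)$. To this end, we start with the thermodynamic difference quotient $\delta Q / dP$:

$$\left(\frac{\delta Q}{dP}\right)_T = T\left(\frac{\partial S}{\partial P}\right)_T = -T\left(\frac{\partial V}{\partial T}\right)_P = -T\left(\frac{\partial V}{\partial \vartheta}\right)_P \frac{d\vartheta}{dT} . \tag{3.4.1}$$

Here, we have substituted in turn $\delta Q = T\,dS$, the Maxwell relation (3.2.10), and $T = T(\vartheta)$. It follows that

$$\frac{1}{T}\,\frac{dT}{d\vartheta} = -\frac{\left(\frac{\partial V}{\partial \vartheta}\right)_P}{\left(\frac{\delta Q}{dP}\right)_T} = -\left(\frac{\partial V}{\partial \vartheta}\right)_P \left(\frac{dP}{\delta Q}\right)_\vartheta . \tag{3.4.2}$$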

This expression is valid for any substance. The right-hand side can be measured experimentally and yields a function of $\vartheta$. Therefore, (3.4.2) represents an ordinary inhomogeneous differential equation for $T(\vartheta)$, whose integration yields

$$T = \mathrm{const} \cdot f(\vartheta) . \tag{3.4.3}$$
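The integration leading from (3.4.2) to (3.4.3) can be sketched numerically. Everything specific in the example below is hypothetical: the function `rhs` stands in for a measured right-hand side of (3.4.2) and is invented to correspond to a thermometer whose reading is ϑ = T², and the calibration point (ϑ = 4 at 273.16 K) is chosen only to fix the free constant.

```python
import math

def rhs(theta):
    # Hypothetical "measured" right-hand side of (3.4.2), i.e. (1/T) dT/dtheta.
    # It corresponds to an invented thermometer whose reading is theta = T**2.
    return 1.0 / (2.0 * theta)

def log_T_ratio(theta_a, theta_b, steps=100000):
    # Trapezoidal integral of rhs from theta_a to theta_b gives log(T_b / T_a).
    h = (theta_b - theta_a) / steps
    total = 0.5 * (rhs(theta_a) + rhs(theta_b))
    for i in range(1, steps):
        total += rhs(theta_a + i * h)
    return total * h

theta_ref, T_ref = 4.0, 273.16      # hypothetical calibration, fixes the constant in (3.4.3)
theta = 16.0
T = T_ref * math.exp(log_T_ratio(theta_ref, theta))
print(T, T_ref * math.sqrt(theta / theta_ref))
```

For this invented thermometer the integration reproduces T proportional to the square root of ϑ, with the proportionality constant set entirely by the calibration point.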

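We thus obtain a unique relation between the empirical temperature $\vartheta$ and the absolute temperature. The constant can be chosen freely due to the arbitrary nature of the empirical temperature scale. The absolute temperature scale is determined by defining the triple point of water to be $T_t = 273.16$ K.

For magnetic thermometers, it follows from

$$\left(\frac{\delta Q}{dB}\right)_T = T\left(\frac{\partial S}{\partial B}\right)_T = T\left(\frac{\partial M}{\partial T}\right)_B$$

(cf. Chap. 6), analogously,

$$\frac{1}{T}\,\frac{dT}{d\vartheta} = \left(\frac{\partial M}{\partial \vartheta}\right)_B \left(\frac{dB}{\delta Q}\right)_\vartheta . \tag{3.4.4}$$

The absolute temperature

$$T = \left(\frac{\partial S}{\partial E}\right)^{-1}_{V,N} \tag{3.4.5}$$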

is positive, since the number of accessible states (∝ Ω(E)) is a rapidly increasing function of the energy. The minimum value of the absolute temperature


is $T = 0$ (except for systems which have energetic upper bounds, such as an assembly of paramagnetic spins). This follows from the distribution of energy levels $E$ in the neighborhood of the ground-state energy $E_0$. We can see from the models which we have already evaluated explicitly (quantum-mechanical harmonic oscillators, paramagnetic moments: Sects. 2.5.2.1 and 2.5.2.2) that $\lim_{E \to E_0} S'(E) = \infty$, and thus for these systems, which are generic with respect to their low-lying energy levels,

$$\lim_{E \to E_0} T = 0 .$$

We return once more to the determination of the temperature scale through Eq. (3.4.3) in terms of $T_t = 273.16$ K. As mentioned in Sect. 2.3, the value of the Boltzmann constant is also fixed by this relation. In order to see this, we consider a system whose equation of state at $T_t$ is known. Molecular hydrogen can be treated as an ideal gas at $T_t$ and $P = 1$ atm. The density of H$_2$ under these conditions is

$$\rho = 8.989 \times 10^{-2}\ \text{g/liter} = 8.989 \times 10^{-5}\ \text{g cm}^{-3} .$$

Its molar volume then has the value

$$V_M = \frac{2.016\ \text{g}}{8.989 \times 10^{-2}\ \text{g liter}^{-1}} = 22.414\ \text{liters} .$$

One mole is defined as: 1 mole corresponds to a mass equal to the atomic weight in g (e.g. a mole of H$_2$ has a mass of 2.016 g). From this fact, we can determine the Boltzmann constant:

$$k = \frac{PV}{NT} = \frac{1\ \text{atm} \times V_M}{N_A \times 273.16\ \text{K}} = 1.38066 \times 10^{-16}\ \text{erg/K} = 1.38066 \times 10^{-23}\ \text{J/K} . \tag{3.4.6}$$

Here, Avogadro's number was used:

$$N_A \equiv \text{number of molecules per mole} = \frac{2.016\ \text{g}}{\text{mass of H}_2} = \frac{2.016\ \text{g}}{2 \times 1.6734 \times 10^{-24}\ \text{g}} = 6.0221 \times 10^{23}\ \text{mol}^{-1} .$$

Further definitions of units and constants, e.g. the gas constant $R$, are given in Appendix I.
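The arithmetic behind (3.4.6) can be retraced in a few lines. This is a rough sketch that uses only the rounded values quoted above together with the standard conversion 1 atm = 101325 Pa; the variable and function names are illustrative. Because the inputs are rounded, the results should be expected to agree with the quoted figures only to about three or four significant digits.

```python
ATM = 101325.0                 # 1 atm in Pa
T_T = 273.16                   # K, value used in the text
N_A = 6.0221e23                # 1/mol, value quoted in the text
MOLAR_MASS_H2 = 2.016          # g/mol
DENSITY_H2 = 8.989e-2          # g/liter

V_M_liters = MOLAR_MASS_H2 / DENSITY_H2      # molar volume from the rounded density
V_M_quoted = 22.414                          # liters, value quoted in the text

def boltzmann(V_M_in_liters):
    # k = P V_M / (N_A T), with the volume converted from liters to m^3
    return ATM * V_M_in_liters * 1e-3 / (N_A * T_T)

print(V_M_liters)
print(boltzmann(V_M_quoted), boltzmann(V_M_liters))   # in J/K
```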

3.5 Thermodynamic Processes

In this section, we want to treat thermodynamic processes, i.e. processes which either during the whole course of their time development or at least in their initial or final stages can be sufficiently well described by thermodynamics.


3.5.1 Thermodynamic Concepts

We begin by introducing several concepts of thermodynamics which we will later use repeatedly (cf. Table 3.2). Processes in which the pressure is held constant, i.e. P = const, are called isobaric; those in which the volume remains constant, V = const, are isochoral; those in which the entropy is constant, S = const, are isentropic; and those in which no heat is transferred, i.e. δQ = 0, are termed adiabatic (thermally isolated).

Table 3.2. Some thermodynamic concepts

Concept       Definition
isobaric      P = const.
isochoral     V = const.
isothermal    T = const.
isentropic    S = const.
adiabatic     δQ = 0
extensive     proportional to the size of the system
intensive     independent of the size of the system

We mention here another deﬁnition of the terms extensive and intensive, which is equivalent to the one given in the section on the Gibbs–Duhem relation. We divide a system that is characterized by the thermodynamic variable Y into two parts, which are themselves characterized by Y1 and Y2 . In the case that Y1 + Y2 = Y , Y is called extensive; when Y1 = Y2 = Y , it is termed intensive (see Fig. 3.3).

Fig. 3.3. The deﬁnition of extensive and intensive thermodynamic variables

Extensive variables include: V, N, E, S, the thermodynamic potentials, the electric polarization P, and the magnetization M. Intensive variables include: P, µ, T, the electric field E, and the magnetic field B.

Quasistatic process: a quasistatic process takes place slowly with respect to the characteristic relaxation time of the system, i.e. the time within which


the system passes from a nonequilibrium state to an equilibrium state, so that the system remains in equilibrium at each moment during such a process. Typical relaxation times are of the order of τ = 10⁻¹⁰ to 10⁻⁹ s.

An irreversible process is one which cannot take place in the reverse direction, e.g. the transition from a nonequilibrium state to an equilibrium state (the initial state could also be derived from an equilibrium state with restrictions by lifting of those restrictions). Experience shows that a system which is not in equilibrium moves towards equilibrium; in this process, its entropy increases. The system then remains in equilibrium and does not return to the nonequilibrium state.

Reversible processes: reversible processes are those which can also occur in the reverse direction. An essential attribute of reversibility is that a process which takes place in a certain direction can be followed by the reverse process in such a manner that no changes in the surroundings remain.

The characterization of a thermodynamic state (with a fixed particle number N) can be accomplished by specifying two quantities, e.g. T and V, or P and V. The remaining quantities can be found from the thermal and the caloric equations of state. A system in which a quasistatic process is occurring, i.e. which is in thermal equilibrium at each moment in time, can be represented by a curve, for example in a P–V diagram (Fig. 2.7b).

A reversible process must in all cases be quasistatic. In non-quasistatic processes, turbulent flows and temperature fluctuations take place, leading to the irreversible production of heat. Furthermore, the intermediate states in a non-quasistatic process cannot be sufficiently characterized by P and V. Their characterization requires more degrees of freedom, or in other words, a space of higher dimensionality.

There are also quasistatic processes which are irreversible (e.g. temperature equalization via a poor heat conductor, 3.6.3.1; or a Gay-Lussac experiment carried out slowly, 3.6.3.6). Even in such processes, equilibrium thermodynamics is valid for the individual components of the system.

Remark: We note that thermodynamics rests on equilibrium statistical mechanics. In reversible processes, the course of events is so slow that the system is in equilibrium at each moment; in irreversible processes, this is true of at least the initial and final states, and thermodynamics can be applied to these states. In the following sections, we will clarify the concepts just introduced on the basis of some typical examples. In particular, we will investigate how the entropy changes during the course of a process.


3.5.2 The Irreversible Expansion of a Gas; the Gay-Lussac Experiment (1807)

The Gay-Lussac experiment⁴ deals with the adiabatic expansion of a gas and is carried out as follows: a container of volume V which is insulated from its surroundings is divided by a partition into two subvolumes, V1 and V2. Initially, the volume V1 contains a gas at a temperature T, while V2 is evacuated. The partition is then removed and the gas flows rapidly into V2 (Fig. 3.4).

Fig. 3.4. The Gay-Lussac experiment

After the gas has reached equilibrium in the whole volume $V = V_1 + V_2$, its thermodynamic quantities are determined. We first assume that this experiment is carried out using an ideal gas. The initial state is completely characterized by its volume $V_1$ and the temperature $T$. The entropy and the pressure before the expansion are, from (2.7.27) and (2.7.25), given by

$$S = Nk\left(\frac{5}{2} + \log\frac{V_1/N}{\lambda^3}\right) \quad\text{and}\quad P = \frac{NkT}{V_1} ,$$

with the thermal wavelength $\lambda$:

$$\lambda = \frac{h}{\sqrt{2\pi m k T}} .$$

In the final state, the volume is now $V = V_1 + V_2$. The temperature is still equal to $T$, since the energy remains constant and the caloric equation of state of ideal gases, $E = \frac{3}{2} kTN$, contains no dependence on the volume. The entropy and the pressure after the expansion are:

$$S' = Nk\left(\frac{5}{2} + \log\frac{V/N}{\lambda^3}\right) , \qquad P' = \frac{NkT}{V} .$$

We can see that in this process, there is an entropy production of

$$\Delta S = S' - S = Nk\log\frac{V}{V_1} > 0 . \tag{3.5.1}$$
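The entropy production (3.5.1) can be reproduced directly from the entropy expression used above. The following sketch assumes illustrative values that are not taken from the text (roughly one mole of helium at 300 K doubling its volume); the function names are likewise hypothetical.

```python
import math

H = 6.62607015e-34      # Planck constant, J s
K = 1.380649e-23        # Boltzmann constant, J/K

def thermal_wavelength(m, T):
    return H / math.sqrt(2.0 * math.pi * m * K * T)

def entropy_over_k(N, V, T, m):
    # S/k = N (5/2 + log((V/N) / lambda^3)), the expression used in the text
    lam = thermal_wavelength(m, T)
    return N * (2.5 + math.log((V / N) / lam**3))

# Assumed illustrative values: about one mole of helium doubling its volume at 300 K
N = 6.022e23
m = 6.646e-27           # kg, approximate mass of a helium atom
T = 300.0
V1, V = 0.0125, 0.025   # m^3

delta_S_over_k = entropy_over_k(N, V, T, m) - entropy_over_k(N, V1, T, m)
print(delta_S_over_k / N, math.log(V / V1))    # entropy production per particle, in units of k
```

The mass and the temperature drop out of the difference, as they must, since only the volume changes.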

⁴ Louis Joseph Gay-Lussac, 1778–1850. The goal of Gay-Lussac's experiments was to determine the volume dependence of the internal energy of gases.


It is intuitively clear that the process is irreversible. Since the entropy increases and no heat is transferred, (δQ = 0), the mathematical criterion for an irreversible process, Eq. (3.6.8) (which remains to be proved), is fulﬁlled. The initial and ﬁnal states in the Gay-Lussac experiment are equilibrium states and can be treated with equilibrium thermodynamics. The intermediate states are in general not equilibrium states, and equilibrium thermodynamics can therefore make no statements about them. Only when the expansion is carried out as a quasistatic process can equilibrium thermodynamics be applied at each moment. This would be the case if the expansion were carried out by allowing a piston to move slowly (either by moving a frictionless piston in a series of small steps without performing work, or by slowing the expansion of the gas by means of the friction of the piston and transferring the resulting frictional heat back into the gas). For an arbitrary isolated gas, the temperature change per unit volume at constant energy is given by

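$$\left(\frac{\partial T}{\partial V}\right)_E = -\frac{\left(\frac{\partial E}{\partial V}\right)_T}{\left(\frac{\partial E}{\partial T}\right)_V} = -\frac{T\left(\frac{\partial S}{\partial V}\right)_T - P}{C_V} = \frac{1}{C_V}\left[P - T\left(\frac{\partial P}{\partial T}\right)_V\right] , \tag{3.5.2a}$$

where the Maxwell relation $\left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial P}{\partial T}\right)_V$ has been employed. This coefficient has the value 0 for an ideal gas, but for real gases it can have either a positive or a negative sign. The entropy production is, owing to $dE = T\,dS - P\,dV = 0$, given by

$$\left(\frac{\partial S}{\partial V}\right)_E = \frac{P}{T} > 0 , \tag{3.5.2b}$$

i.e. $dS > 0$. Furthermore, no heat is exchanged with the surroundings, that is, $\delta Q = 0$. Therefore, it follows that the inequality between the change in the entropy and the quantity of heat transferred

$$T\,dS > \delta Q \tag{3.5.3}$$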

holds here. The coeﬃcients calculated from equilibrium thermodynamics (3.5.2a,b) can be applied to the whole course of the Gay-Lussac experiment if the process is carried out in a quasistatic manner. Yet it remains an irreversible process! By integration of (3.5.2a,b), one obtains the diﬀerences in temperature and entropy between the ﬁnal and initial states. The result can by the way also be applied to the non-quasistatic irreversible process, since the two ﬁnal states are identical. We shall return to the quasistatic, irreversible Gay-Lussac experiment in 3.6.3.6.


3.5.3 The Statistical Foundation of Irreversibility

How irreversible is the Gay-Lussac process? In order to understand why the Gay-Lussac experiment is irreversible, we consider the case that the volume increase $\delta V$ fulfills the inequality $\delta V \ll V$, where $V$ now means the initial volume (see Fig. 3.5).

Fig. 3.5. Illustration of the Gay-Lussac experiment

In the expansion from $V$ to $V + \delta V$, the phase-space surface changes from $\Omega(E, V)$ to $\Omega(E, V + \delta V)$, and therefore the entropy changes from $S(E, V)$ to $S(E, V + \delta V)$. After the gas has carried out this expansion, we ask what the probability would be of finding the system in only the subvolume $V$. Employing (1.3.2), (2.2.4), and (2.3.4), we find this probability to be given by

$$\begin{aligned}
W(E,V) &= \frac{\displaystyle\int_V \frac{dq\,dp}{N!\,h^{3N}}\,\delta(H-E)}{\Omega(E,V+\delta V)} = \frac{\Omega(E,V)}{\Omega(E,V+\delta V)} = e^{-\left(S(E,V+\delta V)-S(E,V)\right)/k} \\
&= e^{-\left(\frac{\partial S}{\partial V}\right)_E \delta V/k} = e^{-\frac{P}{T}\,\delta V/k} = e^{-\frac{\delta V}{V} N} \ll 1 ,
\end{aligned} \tag{3.5.4}$$

where in the last rearrangement, we have assumed an ideal gas. Due to the factor $N \approx 10^{23}$ in the exponent, the probability that the system will return spontaneously to the volume $V$ is vanishingly small. In general, the probability that a constraint (a restriction $C$) occurs spontaneously is found to be

$$W(E, C) = e^{-\left(S(E)-S(E,C)\right)/k} . \tag{3.5.5}$$

We find that $S(E, C) < S(E)$, since under the constraint, fewer states are accessible. The difference $S(E) - S(E, C)$ is macroscopic; in the case of the change in volume, it was proportional to $N\,\delta V/V$, and the probability $W(E, C) \sim e^{-N}$ is thus practically zero. The transition from a state with a constraint $C$ to one without this restriction is irreversible, since the probability that the system will spontaneously search out a state with this constraint is vanishingly small.
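To get a feeling for the orders of magnitude in (3.5.4), the sketch below evaluates the ideal-gas result W = exp(−N δV/V) for a few particle numbers. The chosen value δV/V = 0.01 and the function name are illustrative assumptions; the logarithm is returned because W itself underflows for macroscopic N.

```python
import math

def log10_return_probability(N, dV_over_V):
    # W = exp(-N * dV/V) from (3.5.4); return log10(W) because W itself underflows
    return -N * dV_over_V / math.log(10.0)

for N in (10, 1000, 1e23):
    print(N, log10_return_probability(N, 0.01))
```

For ten particles a spontaneous return is quite likely, while for a macroscopic particle number the exponent is of the order of minus 10²⁰, which is the quantitative content of the statement that the process is irreversible.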


3.5.4 Reversible Processes

In the first subsection, we consider the reversible isothermal and adiabatic expansion of ideal gases, which illustrate the concept of reversibility and are important in their own right as elements of thermodynamic processes.

3.5.4.1 Typical Examples: the Reversible Expansion of a Gas

In the reversible expansion of an ideal gas, work is performed on a spring by the expanding gas and energy is stored in the spring (Fig. 3.6). This energy can later be used to compress the gas again; the process is thus reversible. It can be seen as a reversible variation of the Gay-Lussac experiment. Such a process can be carried out isothermally or adiabatically.

Fig. 3.6. The reversible isothermal expansion of a gas, where the work performed is stored by a spring. The work performed by the gas is equal to the area below the isotherm in the P − V diagram.

a) Isothermal Expansion of a Gas, T = const.

We first consider the isothermal expansion. Here, the gas container is in a heat bath at a temperature $T$. On expansion from the initial volume $V_1$ to the final volume $V$, the gas performs the work:⁵

$$\mathcal{W} = \int_{V_1}^{V} P\,dV = \int_{V_1}^{V} dV\,\frac{NkT}{V} = NkT\log\frac{V}{V_1} . \tag{3.5.6}$$

⁵ We distinguish the work performed by the system ($\mathcal{W}$) from the work performed on the system ($W$) by using different symbols, implying opposite signs: $\mathcal{W} = -W$.

This work can be visualized as the area below the isotherm in the P–V diagram (Fig. 3.6). Since the temperature remains constant, the energy of the ideal gas is also unchanged. Therefore, the heat bath must transfer a quantity of heat

$$Q = \mathcal{W} \tag{3.5.7}$$

to the system. The change in the entropy during this isothermal expansion is given according to (2.7.27) by:

$$\Delta S = Nk\log\frac{V}{V_1} . \tag{3.5.8}$$

Comparison of (3.5.6) with (3.5.8) shows us that the entropy increase and the quantity of heat taken up by the system here obey the following relation:

$$\Delta S = \frac{Q}{T} . \tag{3.5.9}$$
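Relations (3.5.6) to (3.5.9) can be checked by integrating the isotherm numerically. This is a minimal sketch in units where N k T = 1; the volumes and the function name `work_isothermal` are assumptions made for illustration.

```python
import math

NKT = 1.0          # N k T in arbitrary energy units (assumption of the sketch)

def work_isothermal(V1, V, steps=100000):
    # midpoint-rule integral of P dV with P = N k T / V
    h = (V - V1) / steps
    return sum(NKT / (V1 + (i + 0.5) * h) for i in range(steps)) * h

V1, V = 1.0, 3.0
W_num = work_isothermal(V1, V)
W_exact = NKT * math.log(V / V1)          # Eq. (3.5.6)
Q = W_num                                 # Eq. (3.5.7): heat supplied by the bath
delta_S_times_T = NKT * math.log(V / V1)  # T * Delta S from Eq. (3.5.8)
print(W_num, W_exact, Q - delta_S_times_T)
```

The last printed number is the difference between Q and T ΔS, which should vanish up to the integration error, in accordance with (3.5.9).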

This process is reversible, since using the energy stored in the spring, one could compress the gas back to its original volume. In this compression, the gas would release the quantity of heat $Q$ to the heat bath. The final state of the system and its surroundings would then again be identical to their original state. In order for the process to occur in a quasistatic way, the strength of the spring must be varied during the expansion or compression in such a way that it exactly compensates the gas pressure $P$ (see the discussion in Sect. 3.5.4.2). One could imagine the storage and release of the energy from the work of compression or expansion in an idealized thought experiment to be carried out by the horizontal displacement of small weights, which would cost no energy.

We return again to the example of the irreversible expansion (Sect. 3.5.2). Clearly, by performing work in this case we could also compress the gas after its expansion back to its original volume, but then we would increase its energy in the process. The work required for this compression is finite and its magnitude is proportional to the change in volume; it cannot, in contrast to the case of reversible processes, in principle be made equal to zero.

b) Adiabatic Expansion of a Gas, ∆Q = 0

We now turn to the adiabatic reversible expansion. In contrast to Fig. 3.6, the gas container is now insulated from its surroundings, and the curves in the P–V diagram are steeper. In every step of the process, $\delta Q = 0$, and since work is here also performed by the gas on its surroundings, it cools on expansion. It then follows from the First Law that $dE = -P\,dV$. If we insert the caloric and the thermal equations of state into this equation, we find:

$$\frac{dT}{T} = -\frac{2}{3}\,\frac{dV}{V} . \tag{3.5.10}$$

Integration of the last equation leads to the two forms of the adiabatic equation for an ideal gas:

$$T = T_1\left(V_1/V\right)^{2/3} \quad\text{and}\quad P = NkT_1\,V_1^{2/3}\,V^{-5/3} , \tag{3.5.11a,b}$$

where the equation of state was again used to obtain b. We now once more determine the work $\mathcal{W}(V)$ performed on expansion from $V_1$ to $V$. It is clearly less than in the case of the isothermal expansion, since no heat is transferred from the surroundings. Correspondingly, the area beneath the adiabats is smaller than that beneath the isotherms (cf. Fig. 3.7). Inserting Eq. (3.5.11b) yields for the work:

Fig. 3.7. An isotherm and an adiabat passing through the initial point $(P_1, V_1)$, with $P_1 = NkT_1/V_1$

$$\mathcal{W}(V) = \int_{V_1}^{V} dV\,P = \frac{3}{2}\,NkT_1\left[1 - \left(\frac{V}{V_1}\right)^{-2/3}\right] ; \tag{3.5.12}$$

geometrically, this is the area beneath the adiabats, Fig. 3.7. The change in the entropy is given by

$$\Delta S = Nk\log\left(\frac{V\lambda_1^3}{\lambda^3 V_1}\right) = 0 , \tag{3.5.13}$$

and it is equal to zero. We are dealing here with a reversible process in an isolated system ($\Delta Q = 0$), and find $\Delta S = 0$, i.e. the entropy remains unchanged. This is not surprising, since for each infinitesimal step in the process,

$$T\,dS = \delta Q = 0 \tag{3.5.14}$$

holds.
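The work along the adiabat, Eq. (3.5.12), and its comparison with the isothermal case can be sketched numerically. The units with N k T1 = 1 and V1 = 1, the final volume, and the helper names are assumptions made only for this illustration.

```python
import math

NKT1 = 1.0        # N k T1 in arbitrary energy units (assumption of the sketch)
V1 = 1.0

def pressure_adiabat(V):
    return NKT1 * V1**(2.0 / 3.0) * V**(-5.0 / 3.0)     # Eq. (3.5.11b)

def pressure_isotherm(V):
    return NKT1 / V

def work(pressure, V_end, steps=100000):
    # midpoint-rule integral of P dV from V1 to V_end
    h = (V_end - V1) / steps
    return sum(pressure(V1 + (i + 0.5) * h) for i in range(steps)) * h

V = 2.0
W_adiabatic = work(pressure_adiabat, V)
W_formula = 1.5 * NKT1 * (1.0 - (V / V1)**(-2.0 / 3.0))  # Eq. (3.5.12)
W_isothermal = work(pressure_isotherm, V)
print(W_adiabatic, W_formula, W_isothermal)
```

The numerically integrated adiabatic work should match (3.5.12) and lie below the isothermal work, reflecting the smaller area beneath the adiabat.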

∗3.5.4.2 General Considerations of Real, Reversible Processes

We wish to consider to what extent the situation of a reversible process can indeed be realized in practice. If the process can occur in both directions, what decides in which direction it in fact proceeds? To answer this question, in Fig. 3.8 we consider a process which takes place between the points 1 and 2.


Fig. 3.8. A reversible process. P is the internal pressure of the system (solid line). Pa is the external pressure produced by the spring (dashed line).

The solid curve can be an isotherm or a polytrope (i.e. an equilibrium curve which lies between isotherms and adiabats). Along the path from 1 to 2, the working substance expands, and from 2 to 1, it is compressed again, back to its initial state 1 without leaving any change in the surroundings. At each moment, the pressure within the working substance is precisely compensated by the external pressure (produced here by a spring). This quasistatic reversible process is, of course, an idealization. In order for the expansion to occur at all, the external pressure $P_a^{Ex}$ must be somewhat lower than $P$ during the expansion phase of the process. The external pressure is indicated in Fig. 3.8 by the dashed curve. This curve, which is supposed to characterize the real course of the process, is drawn in Fig. 3.8 as a dashed line, to indicate that a curve in the P–V diagram cannot fully characterize the system. In the expansion phase with $P_a < P$, the gas near the piston is somewhat rarefied. This effectively reduces its pressure and the work performed by the gas is slightly less than would correspond to its actual pressure. Density gradients occur, i.e. there is a non-equilibrium state. The work obtained (which is stored as potential energy in the spring), $\int_1^2 dV\,P_a^{Ex}$, then obeys the inequality

$$\int_1^2 dV\,P_a^{Ex} < \int_1^2 dV\,P .$$

[...]

$\kappa > 1$, and therefore for every substance, the slope of the adiabats, $P = P(V, S = \text{const.})$, is steeper than that of the isotherms, $P = P(V, T = \text{const.})$.

For a classical ideal gas, we find $\kappa = \text{const.}$⁶ and $\left(\frac{\partial P}{\partial V}\right)_T = -\frac{NkT}{V^2} = -\frac{P}{V}$. It thus follows from (3.5.18) that

$$\left(\frac{\partial P}{\partial V}\right)_S = -\kappa\,\frac{P}{V} . \tag{3.5.20}$$

The solution of this differential equation is

$$P V^{\kappa} = \text{const} ,$$

and with the aid of the equation of state, we then find

$$T V^{\kappa - 1} = \text{const} . \tag{3.5.21}$$

For a monatomic ideal gas, we have $\kappa = \frac{\frac{3}{2}+1}{\frac{3}{2}} = \frac{5}{3}$, where we have made use of (3.2.27).
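That the adiabatic equation solves the differential equation (3.5.20) can be confirmed by integrating it numerically. The sketch below uses the monatomic value κ = 5/3 stated above and a standard fourth-order Runge-Kutta step; the initial values, step count and function names are illustrative assumptions.

```python
KAPPA = 5.0 / 3.0        # monatomic ideal gas value quoted in the text

def slope(V, P):
    return -KAPPA * P / V          # Eq. (3.5.20)

def integrate_adiabat(V0, P0, V_end, steps=1000):
    # classical fourth-order Runge-Kutta integration of dP/dV
    h = (V_end - V0) / steps
    V, P = V0, P0
    for _ in range(steps):
        k1 = slope(V, P)
        k2 = slope(V + 0.5 * h, P + 0.5 * h * k1)
        k3 = slope(V + 0.5 * h, P + 0.5 * h * k2)
        k4 = slope(V + h, P + h * k3)
        P += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        V += h
    return P

V0, P0, V_end = 1.0, 1.0, 2.0
P_end = integrate_adiabat(V0, P0, V_end)
print(P0 * V0**KAPPA, P_end * V_end**KAPPA)
```

The product P V^κ evaluated at the two ends of the integration should be the same to within the integration error.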

3.6 The First and Second Laws of Thermodynamics

3.6.1 The First and the Second Law for Reversible and Irreversible Processes

3.6.1.1 Quasistatic and in Particular Reversible Processes

We recall the formulation of the First and Second Laws of Thermodynamics in Eqns. (3.1.3) and (3.1.5). In the case of reversible transitions between an equilibrium state and a neighboring, infinitesimally close equilibrium state, we have

$$dE = \delta Q - P\,dV + \mu\,dN \tag{3.6.1}$$

with

$$\delta Q = T\,dS . \tag{3.6.2}$$

⁶ This is evident for a monatomic classical ideal gas from (3.2.27). For a molecular ideal gas as treated in Chap. 5, the specific heats are temperature independent only in those temperature regions where particular internal degrees of freedom are completely excited or not excited at all.


Equations (3.6.1) and (3.6.2) are the mathematical formulations of the First and Second Laws. The Second Law in the form of Eq. (3.6.2) holds for reversible (and thus necessarily quasistatic) processes. It is also valid for quasistatic irreversible processes within those subsystems which are in equilibrium at every instant in time and in which only quasistatic transitions from an equilibrium state to a neighboring equilibrium state take place. (An example of this is the thermal equilibration of two bodies via a poor heat conductor; see Sect. 3.6.3.1. The overall system is not in equilibrium, and the process is irreversible. However, the equilibration takes place so slowly that the two bodies within themselves are in equilibrium states at every moment in time.)

3.6.1.2 Irreversible Processes

For arbitrary processes, the First Law holds in the form given in Eq. (3.1.3′):

$$dE = \delta Q + \delta W + \delta E_N , \tag{3.6.1'}$$

where $\delta Q$, $\delta W$, and $\delta E_N$ are the quantity of heat transferred, the work performed on the system, and the increase in energy through addition of matter. In order to formulate the Second Law with complete generality, we recall the relation (2.3.4) for the entropy of the microcanonical ensemble and consider the following situation: we start with two systems 1 and 2 which are initially separated and are thus not in equilibrium with each other; their entropies are $S_1$ and $S_2$. We now bring these two systems into contact. The entropy of this nonequilibrium state is

$$S_{\text{initial}} = S_1 + S_2 . \tag{3.6.3}$$

Suppose the two systems to be insulated from their environment and their total energy, volume, and particle number to be given by $E$, $V$ and $N$. Now the overall system passes into the microcanonical equilibrium state corresponding to these macroscopic values. Owing to the additivity of entropy, the total entropy after equilibrium has been reached is given by

$$S_{1+2}(E, V, N) = S_1(\tilde{E}_1, \tilde{V}_1, \tilde{N}_1) + S_2(\tilde{E}_2, \tilde{V}_2, \tilde{N}_2) , \tag{3.6.4}$$

where $\tilde{E}_1, \tilde{V}_1, \tilde{N}_1$ ($\tilde{E}_2, \tilde{V}_2, \tilde{N}_2$) are the most probable values of these quantities in the subsystem 1 (2). Since the equilibrium entropy is a maximum (Eq. 2.3.5), the following inequality holds:

$$S_1 + S_2 = S_{\text{initial}} \le S_{1+2}(E, V, N) = S_1(\tilde{E}_1, \tilde{V}_1, \tilde{N}_1) + S_2(\tilde{E}_2, \tilde{V}_2, \tilde{N}_2) . \tag{3.6.5}$$

Whenever the initial density matrix of the combined systems 1+2 is not already equal to the microcanonical density matrix, the inequality sign holds.


We now apply the inequality (3.6.5) to various physical situations.

(A) Let an isolated system be in a non-equilibrium state. We can decompose it into subsystems which are in equilibrium within themselves and apply the inequality (3.6.5). Then we find for the change $\Delta S$ in the total entropy

$$\Delta S > 0 . \tag{3.6.6}$$

This inequality expresses the fact that the entropy of an isolated system can only increase and is also termed the Clausius principle.

(B) We consider two systems 1 and 2 which are in equilibrium within themselves but are not in equilibrium with each other. Let their entropy changes be denoted by $\Delta S_1$ and $\Delta S_2$. From the inequality (3.6.5), it follows that

$$\Delta S_1 + \Delta S_2 > 0 . \tag{3.6.7}$$

We now assume that system 2 is a heat bath, which is large compared to system 1 and which remains at the temperature $T$ throughout the process. The quantity of heat transferred to system 1 is denoted by $\Delta Q_1$. For system 2, the process occurs quasistatically, so that its entropy change $\Delta S_2$ is related to the heat transferred, $-\Delta Q_1$, by

$$\Delta S_2 = -\frac{1}{T}\,\Delta Q_1 .$$

Inserting this into Eq. (3.6.7), we find

$$\Delta S_1 > \frac{1}{T}\,\Delta Q_1 . \tag{3.6.8}$$

In all the preceding relations, the quantities $\Delta S$ and $\Delta Q$ are by no means required to be small, but instead represent simply the change in the entropy and the quantity of heat transferred. In the preceding discussion, we have considered an initial state and, as the final state, a state of overall equilibrium. In fact, these inequalities hold also for portions of the relaxation process. Each intermediate step can be represented in terms of equilibrium states with constraints, whereby the limitations imposed by the constraints decrease in the course of time. At the same time, the entropy increases. Thus, for each infinitesimal step in time, the change in entropy of the isolated overall system is given by

$$dS \ge 0 . \tag{3.6.6'}$$

For the physical situation described under B, we have

$$dS_1 \ge \frac{1}{T}\,\delta Q_1 . \tag{3.6.8'}$$
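Situation (B) can be made concrete with a simple model. The sketch below assumes, purely for illustration, that system 1 is a body with a temperature-independent heat capacity C at fixed volume, so that its entropy change between temperatures T1 and T is C log(T/T1); this model and all numerical values are assumptions and are not taken from the text.

```python
import math

def entropy_and_heat(C, T1, T_bath):
    # Body with constant heat capacity C, initially at T1, equilibrating with a bath at T_bath.
    delta_Q1 = C * (T_bath - T1)             # heat taken up by the body
    delta_S1 = C * math.log(T_bath / T1)     # entropy change of the body (state function)
    delta_S2 = -delta_Q1 / T_bath            # entropy change of the bath
    return delta_S1, delta_S2, delta_Q1

for T1 in (200.0, 300.0, 400.0):
    dS1, dS2, dQ1 = entropy_and_heat(C=1.0, T1=T1, T_bath=300.0)
    print(T1, dS1 + dS2, dS1 - dQ1 / 300.0)
```

Whether the body is initially colder or hotter than the bath, the total entropy change and the difference between ΔS1 and ΔQ1/T come out positive, and both vanish only when the temperatures are equal from the start.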


We now summarize the content of the First and Second Laws.

The First Law:

$$dE = \delta Q + \delta W + \delta E_N \tag{3.6.9}$$

Change of energy = heat transferred + work performed + energy change due to transfer of matter; $E$ is a state function.

The Second Law:

$$\delta Q \le T\,dS \tag{3.6.10}$$

and $S$ is a state function.
a) For reversible changes: $\delta Q = T\,dS$.
b) For irreversible changes: $\delta Q < T\,dS$.

Notes:
(i) The equals sign in Eq. (3.6.10) holds also for irreversible quasistatic processes in those subregions which are in equilibrium in each step of the process (see Sect. 3.6.3.1).
(ii) In (3.6.10), we have combined (3.6.6′) and (3.6.8′). The situation of the isolated system (3.6.6) is included in (3.6.10), since in this case $\delta Q = 0$ (see the example 3.6.3.1).
(iii) In many processes, the particle number remains constant ($dN = 0$). Therefore, we often employ (3.6.9) considering only $\delta Q$ and $\delta W$, without mentioning this expressly each time.

We now wish to apply the Second Law to a process which leads from a state A to a state B as indicated in Fig. 3.10. If we integrate (3.6.10), we obtain

$$\int_A^B dS \ge \int_A^B \frac{\delta Q}{T}$$

and from this,

$$S_B - S_A \ge \int_A^B \frac{\delta Q}{T} . \tag{3.6.11}$$

For reversible processes, the equals sign holds; for irreversible ones, the inequality. In a reversible process, the state of the system can be completely characterized at each moment in time by a point in the P–V diagram. In an irreversible process leading from one equilibrium state (possibly with constraints) A to another equilibrium state B, this is not in general the case. This is indicated by the dashed line in Fig. 3.10.


Fig. 3.10. The path of a process connecting two thermodynamic states A and B


Fig. 3.11. A cyclic process, represented by a closed curve in the P − V diagram, which leads back to the starting point (B = A), whereby at least to some extent irreversible changes of state occur.

We consider the following special cases:

(i) An adiabatic process: For an adiabatic process ($\delta Q = 0$), it follows from (3.6.11) that

$$S_B \ge S_A \quad\text{or}\quad \Delta S \ge 0 . \tag{3.6.11'}$$

The entropy of a thermally isolated system cannot decrease. This statement is more general than Eq. (3.6.6), where completely isolated systems were assumed.

(ii) Cyclic processes: For a cyclic process, the final state is identical with the initial state, B = A (Fig. 3.11). Then we have $S_B = S_A$, and it follows from Eq. (3.6.11) for a cyclic process that the inequality

$$0 \ge \oint \frac{\delta Q}{T} \tag{3.6.12}$$

holds, where the line integral is calculated along the closed curve of Fig. 3.11, corresponding to the actual direction of the process.

∗3.6.2 Historical Formulations of the Laws of Thermodynamics and other Remarks

The First Law

There exists no perpetual motion machine of the first kind. (A perpetual motion machine of the first kind refers to a machine which operates periodically and functions only as a source of energy.) Energy is conserved and heat is only a particular form of energy, or more precisely, energy transfer. The recognition of the fact that heat is only a form of energy and not a unique material which can penetrate all material bodies was the accomplishment of Julius Robert Mayer (a physician, 1814–1878) in 1842.


James Prescott Joule (a brewer of beer) carried out experiments in the years 1843-1849 which demonstrated the equivalence of heat energy and the energy of work 1 cal = 4.1840 × 107 erg = 4.1840 Joule . The First Law was mathematically formulated by Clausius: δQ = dE + P dV . The historical formulation quoted above follows from the First Law, which contains the conservation of energy and the statement that E is a state variable. Thus, if a machine has returned to its initial state, its energy must be the same as before and it can therefore not have given up any energy to its environment. Second Law Rudolf Clausius (1822–1888) in 1850 : Heat can never pass on its own from a cooler reservoir to a warmer one. William Thomson (Lord Kelvin, 1824–1907) in 1851: The impossibility of a perpetual motion machine of the second kind. (A perpetual motion machine of the second kind refers to a periodically operating machine, which only extracts heat from a single reservoir and performs work.) These formulations are equivalent to one another and to the mathematical formulation. Equivalent formulations of the Second Law. The existence of a perpetual motion machine of the second kind could be used to remove heat from a reservoir at the temperature T1 . The resulting work could then be used to heat a second reservoir at the higher temperature T2 . The correctness of Clausius’ statement thus implies the correctness of Kelvin’s statement. If heat could ﬂow from a colder bath to a warmer one, then one could use this heat in a Carnot cycle (see Sect. 3.7.2) to perform work, whereby part of the heat would once again be taken up by the cooler bath. In this overall process, only heat would be extracted from the cooler bath and work would be performed. One would thus have a perpetual motion machine of the second kind. The correctness of Kelvin’s statement thus implies the correctness of Clausius’ statement. 
The two verbal formulations of the Second Law, that of Clausius and that of Kelvin, are thus equivalent. It remains to be demonstrated that Clausius' statement is equivalent to the differential form of the Second Law (Eq. 3.6.10). To this end, we note that it will be shown in Sect. 3.6.3.1 from (3.6.10) that heat passes from a warmer reservoir to a cooler one; Clausius' statement thus follows from (3.6.10). It remains only to demonstrate that the relation (3.6.10) follows from Clausius' statement. This can be seen as follows: if, instead of (3.6.10), the converse relation $T\,dS < \delta Q$ were to hold, then it would follow from the consideration of the quasistatic temperature equilibration that heat would be transported from a cooler to a warmer bath, i.e. that Clausius' statement is false. The correctness of Clausius' statement thus implies the correctness of the mathematical formulation of the Second Law (3.6.10).

3.6 The First and Second Laws of Thermodynamics


All the formulations of the Second Law are equivalent. We have included these historical considerations here because precisely their verbal formulations show the connection to everyday consequences of the Second Law and because this type of reasoning is typical of thermodynamics.

The Zeroth Law When two systems are in thermal equilibrium with a third system, then they are in equilibrium with one another. Proof within statistical mechanics: Systems 1, 2, and 3. Equilibrium of 1 with 3 implies that T1 = T3 and that of 2 with 3 that T2 = T3 ; it follows from this that T1 = T2 , i.e. 1 and 2 are also in equilibrium with one another. The considerations for the pressure and the chemical potential are exactly analogous. This fact is of course very important in practice, since it makes it possible to determine with the aid of thermometers and manometers whether two bodies are at the same temperature and pressure and will remain in equilibrium or not if they are brought into contact. The Third Law The Third Law (also called Nernst’s theorem) makes statements about the temperature dependence of thermodynamic quantities in the limit T → 0; it is discussed in the Appendix A.1. Its consequences are not as far-reaching as those of the First and Second Laws. The vanishing of speciﬁc heats as T → 0 is a direct result of quantum mechanics. In this sense, its postulation in the era of classical physics can be regarded as visionary.

3.6.3 Examples and Supplements to the Second Law

We now give a series of examples which clarify the preceding concepts and general results, and which also have practical significance.

3.6.3.1 Quasistatic Temperature Equilibration

We consider two bodies at the temperatures $T_1$ and $T_2$ and with entropies $S_1$ and $S_2$. These two bodies are connected by a poor thermal conductor and are insulated from their environment (Fig. 3.12). The two temperatures are different, $T_1 \neq T_2$; thus, the two bodies are not in equilibrium with each other. Since the thermal conductor has a poor conductivity, all energy transfers occur slowly and each subsystem is in thermal equilibrium at each moment in time. Therefore, for a heat input $\delta Q$ to body 1 and thus the equal but opposite heat transfer $-\delta Q$ from body 2, the Second Law applies to both


subsystems in the form

$$ dS_1 = \frac{\delta Q}{T_1} , \qquad dS_2 = -\frac{\delta Q}{T_2} . \tag{3.6.13} $$

Fig. 3.12. Quasistatic temperature equilibration of two bodies connected by a poor conductor of heat

For the overall system, we have

$$ dS_1 + dS_2 > 0 , \tag{3.6.14} $$

since the total entropy increases during the transition to the equilibrium state. If we insert (3.6.13) into (3.6.14), we obtain

$$ \delta Q \left( \frac{1}{T_1} - \frac{1}{T_2} \right) > 0 . \tag{3.6.15} $$

We take $T_2 > T_1$; then it follows from (3.6.15) that $\delta Q > 0$, i.e. heat is transferred from the warmer to the cooler container. We consider here the differential substeps, since the temperatures change in the course of the process. The transfer of heat continues until the two temperatures have equalized; the total amount of heat transferred from 2 to 1, $\int \delta Q$, is positive. Also in the case of a non-quasistatic temperature equilibration, heat is transferred from the warmer to the cooler body: if the two bodies mentioned above are brought into contact (again, of course, isolated from their environment, but without the barrier of a poor heat conductor), the final state is the same as in the case of the quasistatic process. Thus also in the non-quasistatic temperature equilibration, heat has passed from the warmer to the cooler body.

3.6.3.2 The Joule–Thomson Process

The Joule–Thomson process consists of the controlled expansion of a gas (cf. Fig. 3.13). Here, the stream of expanding gas is limited by a throttle valve. The gas volume is bounded to the left and the right of the throttle by the two sliding pistons $S_1$ and $S_2$, which produce the pressures $P_1$ and $P_2$ in the left and right chambers, with $P_1 > P_2$. The process is assumed to occur adiabatically, i.e. $\delta Q = 0$ during the entire process. In the initial state (1), the gas in the left-hand chamber has the volume $V_1$ and the energy $E_1$. In the final state, the gas is entirely in the right-hand


Fig. 3.13. A Joule–Thomson process, showing the sliding pistons S1 and S2 and the throttle valve T

chamber and has a volume $V_2$ and energy $E_2$. The left piston performs work on the gas, while the gas performs work on the right piston and thus on the environment. The difference of the internal energies is equal to the total work performed on the system:

$$ E_2 - E_1 = \int_1^2 dE = \int_1^2 \delta W = \int_{V_1}^{0} dV_1\,(-P_1) + \int_{0}^{V_2} dV_2\,(-P_2) = P_1 V_1 - P_2 V_2 . $$

From this it follows that the enthalpy remains constant in the course of this process:

$$ H_2 = H_1 , \tag{3.6.16} $$

where the definition $H_i = E_i + P_i V_i$ was used. For cryogenic engineering it is important to know whether the gas is cooled by the controlled expansion. This is determined by the Joule–Thomson coefficient:

$$ \left(\frac{\partial T}{\partial P}\right)_H = -\frac{\left(\frac{\partial H}{\partial P}\right)_T}{\left(\frac{\partial H}{\partial T}\right)_P} = -\frac{T\left(\frac{\partial S}{\partial P}\right)_T + V}{T\left(\frac{\partial S}{\partial T}\right)_P} = \frac{T\left(\frac{\partial V}{\partial T}\right)_P - V}{C_P} . $$

In the rearrangement, we have used (3.2.21), $dH = T\,dS + V\,dP$, and the Maxwell relation (3.2.10). Inserting the thermal expansion coefficient $\alpha$, we find the following expression for the Joule–Thomson coefficient:

$$ \left(\frac{\partial T}{\partial P}\right)_H = \frac{V}{C_P}\,(T\alpha - 1) . \tag{3.6.17} $$

For an ideal gas, $\alpha = 1/T$; in this case, there is no change in the temperature on expansion. For a real gas, either cooling or warming can occur. When $\alpha > 1/T$, the expansion leads to a cooling of the gas (positive Joule–Thomson effect). When $\alpha < 1/T$, the expansion gives rise to a warming (negative Joule–Thomson effect). The limit between these two effects is defined by the inversion curve, which is given by

$$ \alpha = \frac{1}{T} . \tag{3.6.18} $$


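We shall now calculate the inversion curve for a van der Waals gas, beginning with the van der Waals equation of state (Chap. 5)

$$ P = \frac{kT}{v-b} - \frac{a}{v^2} , \qquad v = \frac{V}{N} . \tag{3.6.19} $$

We differentiate the equation of state with respect to temperature at constant pressure:

$$ 0 = \frac{k}{v-b} - \frac{kT}{(v-b)^2}\left(\frac{\partial v}{\partial T}\right)_P + \frac{2a}{v^3}\left(\frac{\partial v}{\partial T}\right)_P . $$

In this expression, we insert the condition (3.6.18),

$$ \alpha \equiv \frac{1}{v}\left(\frac{\partial v}{\partial T}\right)_P = \frac{1}{T} , $$

for $\left(\frac{\partial v}{\partial T}\right)_P$ and thereby obtain $0 = \frac{k}{v} - \frac{k}{v-b} + \frac{2a}{v^3 T}(v-b)$. Using the van der Waals equation again, we finally find for the inversion curve

$$ 0 = -\frac{b}{v}\left(P + \frac{a}{v^2}\right) + \frac{2a}{v^3}\,(v-b) , $$

that is

$$ P = \frac{2a}{bv} - \frac{3a}{v^2} . \tag{3.6.20} $$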

In the limit of low density, we can neglect the second term in (3.6.20), and the inversion curve is then given by

$$ P = \frac{kT_{\rm inv}}{v} = \frac{2a}{bv} , \qquad T_{\rm inv} = \frac{2a}{bk} = 6.75\,T_c . \tag{3.6.21} $$

Here, Tc is the critical temperature which follows from the van der Waals equation (5.4.13). For temperatures which are higher than the inversion temperature Tinv , the Joule–Thomson eﬀect is always negative. The inversion temperature and other data for some gases are listed in Table I.4 in the Appendix. The change in entropy in the Joule–Thomson process is determined by

$$ \left(\frac{\partial S}{\partial P}\right)_H = -\frac{V}{T} , \tag{3.6.22} $$

as can be seen using $dH = T\,dS + V\,dP = 0$. Since the pressure decreases, we obtain for the entropy change $dS > 0$, although $\delta Q = 0$. The Joule–Thomson process is irreversible, since its initial state with differing pressures in the two chambers is clearly not an equilibrium state. The complete inversion curve from the van der Waals theory is shown in Fig. 3.14a,b. Within the inversion curve, the expansion leads to cooling of the gas.
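The inversion curve (3.6.20) and its low-density limit (3.6.21) can be checked numerically. The following is only a minimal sketch: the function names and the parameter values $a = b = k = 1$ are illustrative assumptions, and the critical temperature $T_c = 8a/(27bk)$ is the standard van der Waals value, which is consistent with the factor 6.75 in (3.6.21).

```python
def vdw_pressure(T, v, a, b, k=1.0):
    """van der Waals equation of state (3.6.19); v is the volume per particle."""
    return k * T / (v - b) - a / v**2


def expansion_coefficient(T, v, a, b, k=1.0):
    """alpha = (1/v)(dv/dT)_P, from implicit differentiation of (3.6.19)."""
    dv_dT = (k / (v - b)) / (k * T / (v - b) ** 2 - 2.0 * a / v**3)
    return dv_dT / v


def inversion_point(v, a, b, k=1.0):
    """Pressure on the inversion curve (3.6.20) and the temperature that
    belongs to it according to the equation of state (3.6.19)."""
    P = 2.0 * a / (b * v) - 3.0 * a / v**2
    T = (P + a / v**2) * (v - b) / k
    return P, T


# Hypothetical parameters in arbitrary units (assumption, not from the text).
a, b, k = 1.0, 1.0, 1.0
T_c = 8.0 * a / (27.0 * b * k)  # van der Waals critical temperature

for v in (2.0, 5.0, 50.0, 1.0e6):
    P, T = inversion_point(v, a, b, k)
    alpha = expansion_coefficient(T, v, a, b, k)
    # On the inversion curve T*alpha should equal 1, cf. (3.6.18).
    print(f"v = {v:9.1f}  P = {P:.6f}  T/T_c = {T / T_c:.4f}  T*alpha = {T * alpha:.6f}")
```

On every point of the curve the product $T\alpha$ should come out as 1, so that the Joule–Thomson coefficient (3.6.17) vanishes there, and for large $v$ the ratio $T/T_c$ should approach 6.75.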

Fig. 3.14. The inversion curve for the Joule–Thomson effect. (a) The inversion curve for the Joule–Thomson effect (upper solid curve). The isotherm is for $T = 6.75\,T_c$ (dot-dashed curve). The shaded region is excluded, since in this region, the vapor and liquid phases are always both present. (b) The inversion curve in the P-T diagram.

3.6.3.3 Temperature Equilibration of Ideal Gases We will now investigate the thermal equilibration of two monatomic ideal gases (a and b). Suppose the two gases to be separated by a sliding piston and insulated from their environment (Fig. 3.15).

Fig. 3.15. The thermal equilibration of two ideal gases

The pressure of the two gases is taken to be equal, $P_a = P_b = P$, while their temperatures are different in the initial state, $T_a \neq T_b$. Their volumes and particle numbers are given by $V_a$, $V_b$ and $N_a$, $N_b$, so that the total volume and total particle number are $V = V_a + V_b$ and $N = N_a + N_b$. The entropy of the initial state is given by

$$ S = S_a + S_b = k \left\{ N_a \left[ \frac{5}{2} + \log \frac{V_a}{N_a \lambda_a^3} \right] + N_b \left[ \frac{5}{2} + \log \frac{V_b}{N_b \lambda_b^3} \right] \right\} . \tag{3.6.23} $$


The temperature after the establishment of equilibrium, when the temperatures of the two systems must approach the same value according to Chap. 2, will be denoted by $T$. Owing to the conservation of energy, we have $\frac{3}{2} N k T = \frac{3}{2} N_a k T_a + \frac{3}{2} N_b k T_b$, from which it follows that

$$ T = \frac{N_a T_a + N_b T_b}{N_a + N_b} = c_a T_a + c_b T_b , \tag{3.6.24} $$

where we have introduced the ratio of the particle numbers, $c_{a,b} = N_{a,b}/N$. We recall the definition of the thermal wavelengths

$$ \lambda_{a,b} = \frac{h}{\sqrt{2\pi m_{a,b} k T_{a,b}}} , \qquad \lambda'_{a,b} = \frac{h}{\sqrt{2\pi m_{a,b} k T}} . $$

The entropy after the establishment of equilibrium is

$$ S' = k N_a \left[ \frac{5}{2} + \log \frac{V'_a}{N_a \lambda_a'^{\,3}} \right] + k N_b \left[ \frac{5}{2} + \log \frac{V'_b}{N_b \lambda_b'^{\,3}} \right] , $$

so that for the entropy increase, we find

$$ S' - S = k N_a \log \frac{V'_a \lambda_a^3}{V_a \lambda_a'^{\,3}} + k N_b \log \frac{V'_b \lambda_b^3}{V_b \lambda_b'^{\,3}} . \tag{3.6.25} $$

We shall also show that the pressure remains unchanged. To this end, we add the two equations of state of the subsystems before the establishment of thermal equilibrium,

$$ V_a P = N_a k T_a , \qquad V_b P = N_b k T_b , \tag{3.6.26a} $$

and obtain using (3.6.24) the expression

$$ (V_a + V_b) P = (N_a + N_b) k T . \tag{3.6.26b} $$

From the equations of state of the two subsystems after the establishment of equilibrium,

$$ V'_{a,b} P' = N_{a,b} k T , \tag{3.6.26a$'$} $$

with $V'_a + V'_b = V$, it follows that

$$ V P' = (N_a + N_b) k T , \tag{3.6.26b$'$} $$

i.e. $P' = P$. Incidentally, in (3.6.24) and (3.6.26b$'$), the fact is used that the two monatomic gases have the same specific heat. Comparing (3.6.26a) and (3.6.26a$'$) and using $P' = P$, we find the volume ratios

$$ \frac{V'_{a,b}}{V_{a,b}} = \frac{T}{T_{a,b}} . $$


From this we obtain

$$ S' - S = \frac{5}{2}\, k \log \frac{T^{N_a + N_b}}{T_a^{N_a} T_b^{N_b}} , $$

which finally yields

$$ S' - S = \frac{5}{2}\, k N \log \frac{T}{T_a^{c_a} T_b^{c_b}} = \frac{5}{2}\, k N \log \frac{c_a T_a + c_b T_b}{T_a^{c_a} T_b^{c_b}} . \tag{3.6.27} $$

Due to the convexity of the exponential function, we have

$$ T_a^{c_a} T_b^{c_b} = \exp(c_a \log T_a + c_b \log T_b) \le c_a \exp(\log T_a) + c_b \exp(\log T_b) = c_a T_a + c_b T_b = T , $$

and thus it follows from (3.6.27) that $S' - S \ge 0$, i.e. the entropy increases on thermal equilibration.

Note: Following the equalization of temperatures, in which heat flows from the warmer to the cooler parts of the system, the volumes are given by

$$ V'_a = \frac{N_a}{N_a + N_b}\, V , \qquad V'_b = \frac{N_b}{N_a + N_b}\, V . $$

Together with Eq. (3.6.26b), this gives $V'_a/V_a = T/T_a$ and $V'_b/V_b = T/T_b$. The energy which is put into subsystem a is $\Delta E_a = \frac{3}{2} N_a k (T - T_a)$. The enthalpy increase in subsystem a is given by $\Delta H_a = \frac{5}{2} N_a k (T - T_a)$. Since the process is isobaric, we have $\Delta Q_a = \Delta H_a$. The work performed on subsystem a is therefore equal to

$$ \Delta W_a = \Delta E_a - \Delta Q_a = -N_a k (T - T_a) . $$

The warmer subsystem gives up heat. Since it would then be too rarefied for the pressure $P$, it will be compressed, i.e. it takes on energy through the work performed in this compression.
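As a numerical illustration of (3.6.24) and (3.6.27), the final temperature and the entropy increase can be evaluated for given particle numbers and initial temperatures. This is a sketch under stated assumptions: the function name `equilibration` and the choice $k = 1$ are hypothetical, and both gases are taken to be monatomic and ideal as in the text.

```python
import math


def equilibration(N_a, T_a, N_b, T_b, k=1.0):
    """Final temperature (3.6.24) and entropy increase (3.6.27) for two
    monatomic ideal gases equilibrating at constant pressure."""
    N = N_a + N_b
    c_a, c_b = N_a / N, N_b / N
    T = c_a * T_a + c_b * T_b  # weighted arithmetic mean
    # T_a**c_a * T_b**c_b is the weighted geometric mean, which never exceeds T.
    dS = 2.5 * k * N * math.log(T / (T_a**c_a * T_b**c_b))
    return T, dS


def work_on_subsystem(N_i, T_i, T, k=1.0):
    """Work performed on a subsystem, Delta W = -N k (T - T_i)."""
    return -N_i * k * (T - T_i)


T, dS = equilibration(1.0, 200.0, 3.0, 400.0)
print("final temperature:", T)
print("entropy increase :", dS)
print("work on cooler gas:", work_on_subsystem(1.0, 200.0, T))
print("work on warmer gas:", work_on_subsystem(3.0, 400.0, T))
```

The two work terms should cancel, since the total volume is fixed, and the entropy increase should vanish only when the initial temperatures are equal.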

3.6.3.4 Entropy of Mixing

We now consider the process of mixing of two different ideal gases with the masses $m_a$ and $m_b$. The temperatures and pressures of the gases are taken to be the same,

$$ T_a = T_b = T , \qquad P_a = P_b = P . $$

From the equations of state,

$$ V_a P = N_a k T , \qquad V_b P = N_b k T , $$


Fig. 3.16. The mixing of two gases

it follows that

$$ \frac{N_a}{V_a} = \frac{N_b}{V_b} = \frac{N_a + N_b}{V_a + V_b} . $$

Using the thermal wavelength $\lambda_{a,b} = h/\sqrt{2\pi m_{a,b} k T}$, the entropy when the gases are separated by a partition is given by

$$ S = S_a + S_b = k \left\{ N_a \left[ \frac{5}{2} + \log \frac{V_a}{N_a \lambda_a^3} \right] + N_b \left[ \frac{5}{2} + \log \frac{V_b}{N_b \lambda_b^3} \right] \right\} . \tag{3.6.28} $$

After removal of the partition and mixing of the gases, the value of the entropy is

$$ S' = k \left\{ N_a \left[ \frac{5}{2} + \log \frac{V_a + V_b}{N_a \lambda_a^3} \right] + N_b \left[ \frac{5}{2} + \log \frac{V_a + V_b}{N_b \lambda_b^3} \right] \right\} . \tag{3.6.29} $$

From Eqns. (3.6.28) and (3.6.29), we obtain the difference in the entropies:

$$ S' - S = k \log \frac{(N_a + N_b)^{N_a + N_b}}{N_a^{N_a} N_b^{N_b}} = k (N_a + N_b) \log \frac{1}{c_a^{c_a} c_b^{c_b}} > 0 , $$

where we have used the relative particle numbers

$$ c_{a,b} = \frac{N_{a,b}}{N_a + N_b} . $$

Since the argument of the logarithm is greater than 1, we find that the entropy of mixing is positive; e.g. for $N_a = N_b$,

$$ S' - S = 2 k N_a \log 2 . $$
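The entropy of mixing obtained from (3.6.28) and (3.6.29) depends only on the particle numbers. A short sketch (the function name `mixing_entropy` and the unit choice $k = 1$ are assumptions made for illustration) evaluates it and reproduces the special case $N_a = N_b$:

```python
import math


def mixing_entropy(N_a, N_b, k=1.0):
    """S' - S = k (N_a + N_b) log(1 / (c_a**c_a * c_b**c_b)) for two
    different ideal gases at equal temperature and pressure."""
    N = N_a + N_b
    c_a, c_b = N_a / N, N_b / N
    # Written as a sum of logarithms to avoid forming large powers.
    return -k * N * (c_a * math.log(c_a) + c_b * math.log(c_b))


for N_a, N_b in ((1.0, 1.0), (1.0, 3.0), (1.0, 99.0)):
    print(N_a, N_b, mixing_entropy(N_a, N_b))

# Special case N_a = N_b: the result should equal 2 k N_a log 2.
print(mixing_entropy(5.0, 5.0), 2.0 * 5.0 * math.log(2.0))
```

For a fixed total particle number, the value is largest for equal amounts of the two gases and tends to zero when one component becomes very dilute.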

The entropy of mixing always occurs when diﬀerent gases interdiﬀuse, even when they consist of diﬀerent isotopes of the same element. When, in contrast, the gases a and b are identical, the value of the entropy on removing


the partition is

$$ S'_{\rm id} = k (N_a + N_b) \left[ \frac{5}{2} + \log \frac{V_a + V_b}{(N_a + N_b) \lambda^3} \right] \tag{3.6.29$'$} $$

with $\lambda = \lambda_a = \lambda_b$. We then have

$$ S'_{\rm id} - S = k \log \frac{(V_a + V_b)^{N_a + N_b}\, N_a^{N_a} N_b^{N_b}}{(N_a + N_b)^{N_a + N_b}\, V_a^{N_a} V_b^{N_b}} = 0 , $$

making use of the equation of state; therefore, no entropy of mixing occurs. This is due to the factor $1/N!$ in the basic phase-space volume element in Eqns. (2.2.2) and (2.2.3), which results from the indistinguishability of the particles. Without this factor, Gibbs' paradox would occur, i.e. we would find a positive entropy of mixing for identical gases, as mentioned following Eq. (2.2.3).

∗3.6.3.5 Heating a Room

Finally, we consider an example, based on one given by Sommerfeld.$^{7}$ A room is to be heated from 0 °C to 20 °C. What quantity of heat is required? How does the energy content of the room change in the process? If air can leave the room through leaks around the windows, for example, then the process is isobaric, but the number of air molecules in the room will decrease in the course of the heating process. The quantity of heat required depends on the increase in temperature through the relation

$$ \delta Q = C_P\, dT , \tag{3.6.30} $$

where $C_P$ is the heat capacity at constant pressure. In the temperature range that we are considering, the rotational degrees of freedom of oxygen, O$_2$, and nitrogen, N$_2$, are excited (see Chap. 5), so that under the assumption that air is an ideal gas, we have

$$ C_P = \frac{7}{2}\, N k , \tag{3.6.31} $$

where $N$ is the overall number of particles. The total amount of heat required is found by integrating (3.6.30) between the initial and final temperatures, $T_1$ and $T_2$:

$$ Q = \int_{T_1}^{T_2} dT\, C_P . \tag{3.6.32} $$

If we initially neglect the temperature dependence of the particle number, and thus of the heat capacity (3.6.31), we find

$$ Q = C_P (T_2 - T_1) = \frac{7}{2}\, N_1 k (T_2 - T_1) . \tag{3.6.32$'$} $$

$^{7}$ A. Sommerfeld, Thermodynamics and Statistical Mechanics: Lectures on Theoretical Physics, Vol. V (Academic Press, New York, 1956)


Here, we have denoted the particle number at $T_1$ as $N_1$ and taken it to be constant. Equation (3.6.32$'$) will be a good approximation as long as $T_2 \approx T_1$. If we wish to take into account the variation of the particle number within the room (volume $V$), we have to replace $N$ in Eq. (3.6.31) by $N$ from the equation of state, $N = PV/kT$, and it follows that

$$ Q = \int_{T_1}^{T_2} dT\, \frac{7}{2} \frac{PV}{T} = \frac{7}{2}\, PV \log \frac{T_2}{T_1} = \frac{7}{2}\, N_1 k T_1 \log \frac{T_2}{T_1} . \tag{3.6.33} $$

With $\log \frac{T_2}{T_1} = \frac{T_2}{T_1} - 1 + O\!\left( \left( \frac{T_2}{T_1} - 1 \right)^2 \right)$, we obtain from (3.6.33) for small temperature differences the approximate formula (3.6.32$'$):

$$ Q = \frac{7}{2}\, PV\, \frac{T_2 - T_1}{T_1} = 3.5 \times 10^{6}\, \frac{\mathrm{dyn}}{\mathrm{cm}^2} \times \frac{20}{273} \times 10^{6}\ (V\ \text{in}\ \mathrm{m}^3) = \frac{3.5 \times 2}{2.73} \times 10^{11}\ \mathrm{erg}\ (V\ \text{in}\ \mathrm{m}^3) = 6\ \mathrm{kcal}\ (V\ \text{in}\ \mathrm{m}^3) . $$

It is instructive to compute the change in the energy content of the room on heating, taking into account the fact that the rotational degrees of freedom are fully excited, $T \gg \Theta_r$ (see Chap. 5). Then the internal energy before and after the heating procedure is

$$ E_i = \frac{5}{2}\, N_i k T_i - \frac{1}{6}\, N_i k \Theta_r + N_i \varepsilon_{\rm el} , $$

$$ E_2 - E_1 = \frac{5}{2}\, k (N_2 T_2 - N_1 T_1) - \frac{1}{6}\, PV \Theta_r \left( \frac{1}{T_2} - \frac{1}{T_1} \right) + PV\, \frac{\varepsilon_{\rm el}}{k} \left( \frac{1}{T_2} - \frac{1}{T_1} \right) . \tag{3.6.34} $$

The first term is exactly zero, since $N_i T_i = PV/k$ is the same before and after heating, and the second one is positive; the third, dominant term is negative. The internal energy of the room actually decreases upon heating. The heat input is given up to the outside world, in order to increase the temperature in the room and thus the average kinetic energy of the remaining gas molecules. Heating with a fixed particle number (a hermetically sealed room) requires a quantity of heat $Q = C_V (T_2 - T_1) \equiv \frac{5}{2} N_1 k (T_2 - T_1)$. For small temperature differences $T_2 - T_1$, it is then more favorable first to heat the room to the final temperature $T_2$ and then to allow the pressure to decrease. The point of intersection of the two curves ($V$, $N$) constant and $P$ constant, with $N$ variable (Fig. 3.17) at $T_2^0$ is determined by

$$ \frac{T_2^0 - T_1}{T_1 \log \frac{T_2^0}{T_1}} = \frac{C_P}{C_V} . $$

A numerical estimate yields $T_2^0 = 1.9\,T_1$ for the point of intersection in Fig. 3.17, i.e. at $T_1 = 273$ K, $T_2^0 = 519$ K. For any process of space heating, isolated heating is more favorable. The difference in the quantities of heat required is

$$ \Delta Q \approx (C_P - C_V)(T_2 - T_1) = \frac{1}{3.5}\, 6\ \mathrm{kcal}\ (V\ \text{in}\ \mathrm{m}^3) = 1.7\ \mathrm{kcal}\ (V\ \text{in}\ \mathrm{m}^3) . $$
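The estimates for heating a room can be retraced with a few lines of code. The sketch below assumes SI units, a pressure of $10^5$ Pa (the value $10^6$ dyn/cm$^2$ used in the text), a volume of 1 m$^3$ and the conversion 1 kcal = 4184 J; the function names are hypothetical. The point of intersection is found by simple bisection.

```python
import math

KCAL = 4184.0  # joule per kcal, from 1 cal = 4.1840 J


def heat_isobaric_leaky(T1, T2, PV):
    """Isobaric heating with escaping air, Eq. (3.6.33)."""
    return 3.5 * PV * math.log(T2 / T1)


def heat_isobaric_fixed_n(T1, T2, PV):
    """Isobaric heating neglecting the particle loss, Eq. (3.6.32')."""
    return 3.5 * PV * (T2 - T1) / T1


def heat_sealed(T1, T2, PV):
    """Heating a sealed room, Q = C_V (T2 - T1) with C_V = (5/2) N_1 k."""
    return 2.5 * PV * (T2 - T1) / T1


def intersection_ratio(gamma=1.4, lo=1.01, hi=5.0, tol=1e-12):
    """Solve (x - 1) / log(x) = C_P / C_V for x = T_2^0 / T_1 by bisection."""
    def f(x):
        return (x - 1.0) / math.log(x) - gamma
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


PV = 1.0e5 * 1.0  # P = 1e5 Pa, V = 1 m^3 (assumed values), in joule
T1, T2 = 273.0, 293.0
print("leaky isobaric [kcal]:", heat_isobaric_leaky(T1, T2, PV) / KCAL)
print("fixed-N isobaric [kcal]:", heat_isobaric_fixed_n(T1, T2, PV) / KCAL)
print("sealed room [kcal]:", heat_sealed(T1, T2, PV) / KCAL)
print("T_2^0 / T_1:", intersection_ratio())
```

With these assumed inputs, the fixed-$N$ isobaric value should come out near 6 kcal per cubic meter and the intersection ratio near 1.9, in line with the estimates quoted above.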


Fig. 3.17. The quantity of heat required for space heating: as an isobaric process (solid curve), isochore (dashed curve), or isobaric neglecting the decrease in particle number (dot-dashed curve).

All of the above considerations have neglected the heat capacity of the walls. They are applicable to a rapid heating of the air. The change in pressure on heating a fixed amount of air by 20 °C is, however,

$$ \frac{\delta P}{P} = \frac{\delta T}{T} \sim \frac{20}{273} \sim 0.07 , \quad \text{i.e.} \quad \delta P \sim 0.07\ \mathrm{bar} \sim 0.07\ \mathrm{kg/cm^2} \sim 700\ \mathrm{kg/m^2}\,! $$

∗3.6.3.6 The Irreversible, Quasistatic Gay-Lussac Experiment

We recall the diﬀerent versions of the Gay-Lussac experiment. In the irreversible form, we have ∆Q = 0 and ∆S > 0 (3.5.1). In the reversible case (isothermal or adiabatic), using (3.5.9) and (3.5.14), the corresponding relation for reversible processes is fulﬁlled. It is instructive to carry out the Gay-Lussac experiment in a quasistatic, irreversible fashion. One can imagine that the expansion does not take place suddenly, but instead is slowed by friction of the piston to the point that the gas always remains in equilibrium. The frictional heat can then either be returned to the gas or given up to the environment. We begin by treating the ﬁrst possibility. Since the frictional heat from the piston is returned to the gas, there is no change in the environment after each step in the process. The ﬁnal result corresponds to the situation of the usual Gay-Lussac experiment. For the moment, we denote the gas by an index 1 and the piston, which initially takes up the frictional heat, by 2. Then the work which the gas performs on expansion by the volume change dV is given by δW1→2 = P dV . This quantity of energy is passed by the piston to 1: δQ2→1 = δW1→2 . The energy change of the gas is dE = δQ2→1 − δW1→2 = 0. Since the gas is always in equilibrium at each instant, the relation dE = T dS − P dV also holds and thus we have for the entropy increase of the gas: T dS = δQ2→1 > 0 . The overall system of gas + piston transfers no heat to the environment and also performs no work on the environment, i.e. δQ = 0 and δW = 0. Since the entropy


of the piston remains the same (for simplicity, we consider an ideal gas, whose temperature does not change), it follows that T dS > δQ. Now we consider the situation that the frictional heat is passed to the outside world. This means that δQ2→1 = 0 and thus T dS = 0, also dS = 0. The total amount of heat given oﬀ to the environment (heat loss δQL ) is δQL = δW1→2 > 0 . Here, again, the inequality −δQL < T dS is fulﬁlled, characteristic of the irreversible process. The ﬁnal state of the gas corresponds to that found for the reversible adiabatic process. There, we found ∆S = 0, Q = 0, and W > 0. Now, ∆S = 0, while QL > 0 and is equal to the W of the adiabatic, reversible process, from Eq. (3.5.12).

3.6.4 Extremal Properties In this section, we derive the extremal properties of the thermodynamic potentials. From these, we shall obtain the equilibrium conditions for multicomponent systems in various phases and then again the inequalities (3.3.5) and (3.3.6). We assume in this section that no particle exchange with the environment occurs, i.e. dNi = 0, apart from chemical reactions within the system. Consider the system in general not yet to be in equilibrium; then for example in an isolated system, the state is not characterized solely by E, V, and Ni , but instead we need additional quantities xα , which give e.g. the concentrations of the independent components in the diﬀerent phases or the concentrations of the components between which chemical reactions occur. Another situation not in equilibrium is that of spatial inhomogeneities.8 We now however assume that equilibrium with respect to the temperature and pressure is present, i.e. that the system is characterized by uniform (but variable) T and P values. This assumption may be relaxed somewhat. For the following derivation, it suﬃces that the system likewise be at the pressure P at the stage when work is being performed by the pressure P , and when it is exchanging heat with a reservoir at the temperature T , that it be at the temperature T . (This permits e.g. inhomogeneous temperature distributions during a chemical reaction in a subsystem.) Under these conditions, the First Law, Eq (3.6.9), is given by dE = δQ − P dV .

$^{8}$ As an example, one could imagine a piece of ice and a solution of salt in water at $P = 1$ atm and −5 °C. Each component of this system is in equilibrium within itself. If one brings them into contact, then a certain amount of the ice will melt and some of the NaCl will diffuse into the ice until the concentrations are such that the ice and the solution are in equilibrium (see the section on eutectics). The initial state described here – a non-equilibrium state – is a typical example of an inhibited equilibrium. As long as barriers impede (inhibit) particle exchange, i.e. so long as only energy and volume changes are possible, this inhomogeneous state can be described in terms of equilibrium thermodynamics.


Our starting point is the Second Law, (3.6.10):

$$ dS \ge \frac{\delta Q}{T} . \tag{3.6.35} $$

We insert the First Law into this equation and obtain

$$ dS \ge \frac{1}{T}\,(dE + P\,dV) . \tag{3.6.36a} $$

We have used the principle of energy conservation from equilibrium thermodynamics here, which however also holds in non-equilibrium states. The change in the energy is equal to the heat transferred plus the work performed. The precondition is that during the process a particular, well-defined pressure is present. If $E$, $V$ are held constant, then according to Eq. (3.6.36a), we have

$$ dS \ge 0 \quad \text{for } E, V \text{ fixed} ; \tag{3.6.36b} $$

that is, an isolated system tends towards a maximum of the entropy. When a non-equilibrium state is characterized by a parameter $x$, its entropy has the form indicated in Fig. 3.18. It is maximal for the equilibrium value $x_0$. The parameter $x$ could be e.g. the volume or the energy of a subsystem of the isolated system considered. One refers to a process or variation as virtual – that is, possible in principle – if it is permitted by the conditions of a system. An inhomogeneous distribution of the energies of the subsystems with constant total energy would, to be sure, not occur spontaneously, but it is possible. In equilibrium, the entropy is maximal with respect to all virtual processes. We now consider the free enthalpy or Gibbs' free energy,

$$ G = E - TS + PV , \tag{3.6.37} $$

which we define with Eq. (3.6.37) for non-equilibrium states just as for equilibrium states. For the changes in such states, we find from (3.6.36a) that the inequality

$$ dG \le -S\,dT + V\,dP \tag{3.6.38a} $$

holds. For the case that $T$ and $P$ are held constant, it follows from (3.6.38a) that

$$ dG \le 0 \quad \text{for } T \text{ and } P \text{ fixed} , \tag{3.6.38b} $$

i.e. the Gibbs' free energy $G$ tends towards a minimum. In the neighborhood of the minimum (Fig. 3.19), we have for a virtual (in thought only) variation

$$ \delta G = G(x_0 + \delta x) - G(x_0) = \frac{1}{2}\, G''(x_0)\,(\delta x)^2 . \tag{3.6.39} $$


Fig. 3.18. The entropy as a function of a parameter x, with the equilibrium value x0 .

Fig. 3.19. The free enthalpy as a function of a parameter.

The first-order terms vanish, therefore in first order we find for $\delta x$:

$$ \delta G = 0 \quad \text{for } T \text{ and } P \text{ fixed.}^{9} \tag{3.6.38c} $$

One terms this condition stationarity. Since $G$ is minimal at $x_0$, we find

$$ G''(x_0) > 0 . \tag{3.6.40} $$

Analogously, one can show for the free energy (Helmholtz free energy) $F = E - TS$ and for the enthalpy $H = E + PV$ that

$$ dF \le -S\,dT - P\,dV \tag{3.6.41a} $$

and

$$ dH \le T\,dS + V\,dP . \tag{3.6.42a} $$

These potentials also tend towards minimum values at equilibrium under the condition that their natural variables are held constant:

$$ dF \le 0 \quad \text{for } T \text{ and } V \text{ fixed} \tag{3.6.41b} $$

and

$$ dH \le 0 \quad \text{for } S \text{ and } P \text{ fixed} . \tag{3.6.42b} $$

As conditions for equilibrium, it then follows that

$$ \delta F = 0 \quad \text{for } T \text{ and } V \text{ fixed} \tag{3.6.41c} $$

and

$$ \delta H = 0 \quad \text{for } S \text{ and } P \text{ fixed} . \tag{3.6.42c} $$
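The statement that the entropy of an isolated system is maximal at the equilibrium value of a parameter $x$ can be made concrete with a simple model. The sketch below is an illustration under explicit assumptions, not an example from the text: two monatomic ideal gases with fixed volumes and particle numbers share a total energy $E$, the parameter $x$ is the energy of subsystem 1, and only the $x$-dependent part of the entropy is kept ($k = 1$). The helper `maximize` is a plain golden-section search.

```python
import math


def entropy(x, E, N1, N2, k=1.0):
    """x-dependent part of the total entropy of two monatomic ideal gases,
    where x is the energy of subsystem 1 and E - x that of subsystem 2."""
    return 1.5 * k * (N1 * math.log(x) + N2 * math.log(E - x))


def maximize(f, lo, hi, tol=1e-10):
    """Golden-section search for the maximum of a unimodal function."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        c = hi - g * (hi - lo)
        d = lo + g * (hi - lo)
        if f(c) > f(d):
            hi = d  # the maximum lies in [lo, d]
        else:
            lo = c  # the maximum lies in [c, hi]
    return 0.5 * (lo + hi)


E, N1, N2 = 10.0, 1.0, 3.0  # hypothetical values in arbitrary units
x0 = maximize(lambda x: entropy(x, E, N1, N2), 1e-6 * E, (1.0 - 1e-6) * E)

# Temperatures of the subsystems from E_i = (3/2) N_i k T_i with k = 1.
T1 = 2.0 * x0 / (3.0 * N1)
T2 = 2.0 * (E - x0) / (3.0 * N2)
print("equilibrium energy of subsystem 1:", x0)
print("temperatures:", T1, T2)
```

In this model the maximum should lie at $x_0 = E N_1/(N_1 + N_2)$, where the two subsystem temperatures coincide.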

$^{9}$ This condition plays an important role in physical chemistry, since in chemical processes, the pressure and the temperature are usually fixed.


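∗3.6.5 Thermodynamic Inequalities Derived from Maximization of the Entropy

We consider a system whose energy is $E$ and whose volume is $V$. We decompose this system into two equal parts and investigate a virtual change of the energy and the volume of subsystem 1 by $\delta E_1$ and $\delta V_1$. Correspondingly, the values for subsystem 2 change by $-\delta E_1$ and $-\delta V_1$. The overall entropy before the change is

$$ S(E, V) = S_1\!\left( \frac{E}{2}, \frac{V}{2} \right) + S_2\!\left( \frac{E}{2}, \frac{V}{2} \right) . \tag{3.6.43} $$

Therefore, the change of the entropy is given by

$$ \begin{aligned} \delta S &= S_1\!\left( \frac{E}{2} + \delta E_1, \frac{V}{2} + \delta V_1 \right) + S_2\!\left( \frac{E}{2} - \delta E_1, \frac{V}{2} - \delta V_1 \right) - S(E, V) \\ &= \left( \frac{\partial S_1}{\partial E_1} - \frac{\partial S_2}{\partial E_2} \right) \delta E_1 + \left( \frac{\partial S_1}{\partial V_1} - \frac{\partial S_2}{\partial V_2} \right) \delta V_1 \\ &\quad + \frac{1}{2} \left( \frac{\partial^2 S_1}{\partial E_1^2} + \frac{\partial^2 S_2}{\partial E_2^2} \right) (\delta E_1)^2 + \frac{1}{2} \left( \frac{\partial^2 S_1}{\partial V_1^2} + \frac{\partial^2 S_2}{\partial V_2^2} \right) (\delta V_1)^2 \\ &\quad + \left( \frac{\partial^2 S_1}{\partial E_1 \partial V_1} + \frac{\partial^2 S_2}{\partial E_2 \partial V_2} \right) \delta E_1\, \delta V_1 + \ldots \end{aligned} \tag{3.6.44} $$

From the stationarity of the entropy, $\delta S = 0$, it follows that the terms which are linear in $\delta E_1$ and $\delta V_1$ must vanish. This means that in equilibrium the temperature $T$ and the pressure $P$ of the subsystems must be equal,

$$ T_1 = T_2 , \qquad P_1 = P_2 ; \tag{3.6.45a} $$

this is a result that is already familiar to us from equilibrium statistics. If we permit also virtual variations of the particle numbers, $\delta N_1$ and $-\delta N_1$, in the subsystems 1 and 2, then an additional term enters the second line of (3.6.44): $\left( \frac{\partial S_1}{\partial N_1} - \frac{\partial S_2}{\partial N_2} \right) \delta N_1$; and one obtains as an additional condition for equilibrium the equality of the chemical potentials:

$$ \mu_1 = \mu_2 . \tag{3.6.45b} $$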

Here, the two subsystems could also consist of diﬀerent phases (e.g. solid and liquid). We note that the second derivatives of S1 and S2 in (3.6.44) are both to be taken at the values E/2, V /2 and they are therefore equal. In the equilibrium state, the entropy is maximal, according to (3.6.36b). From this it follows that the coeﬃcients of the quadratic form (3.6.44) obey the two conditions


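$$ \frac{\partial^2 S_1}{\partial E_1^2} = \frac{\partial^2 S_2}{\partial E_2^2} \le 0 \tag{3.6.46a} $$

and

$$ \frac{\partial^2 S_1}{\partial E_1^2} \frac{\partial^2 S_1}{\partial V_1^2} - \left( \frac{\partial^2 S_1}{\partial E_1 \partial V_1} \right)^2 \ge 0 . \tag{3.6.46b} $$

We now leave off the index 1 and rearrange the left side of the first condition:

$$ \frac{\partial^2 S}{\partial E^2} = \left( \frac{\partial \frac{1}{T}}{\partial E} \right)_V = -\frac{1}{T^2 C_V} . \tag{3.6.47a} $$

The left side of the second condition, Eq. (3.6.46b), can be represented by a Jacobian, and after rearrangement,

$$ \frac{\partial \left( \frac{\partial S}{\partial E}, \frac{\partial S}{\partial V} \right)}{\partial (E, V)} = \frac{\partial \left( \frac{1}{T}, \frac{P}{T} \right)}{\partial (E, V)} = \frac{\partial \left( \frac{1}{T}, \frac{P}{T} \right)}{\partial (T, V)}\, \frac{\partial (T, V)}{\partial (E, V)} = -\frac{1}{T^3 C_V} \left( \frac{\partial P}{\partial V} \right)_T = \frac{1}{T^3 V \kappa_T C_V} . \tag{3.6.47b} $$

If we insert the expressions (3.6.47a,b) into the inequalities (3.6.46a) and (3.6.46b), we obtain

$$ C_V \ge 0 , \qquad \kappa_T \ge 0 , \tag{3.6.48a,b} $$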

which expresses the stability of the system. When heat is given up, the system becomes cooler. On compression, the pressure increases. Stability conditions of the type of (3.6.48a,b) are expressions of Le Chatelier’s principle: When a system is in a stable equilibrium state, every spontaneous change in its parameter leads to reactions which drive the system back towards equilibrium. The inequalities (3.6.48a,b) were already derived in Sect. 3.3 on the basis of the positivity of the mean square deviations of the particle number and the energy. The preceding derivation relates them within thermodynamics to the stationarity of the entropy. The inequality CV ≥ 0 guarantees thermal stability. If heat is transferred to part of a system, then its temperature increases and it releases heat to its surroundings, thus again decreasing its temperature. If its speciﬁc heat were negative, then the temperature of the subsystem would decrease on input of heat, and more heat would ﬂow in from its surroundings, leading to a further temperature decrease. The least input of heat would set oﬀ an instability. The inequality κT ≥ 0 guarantees mechanical stability. A small expansion of the volume of a region results in a decrease in its pressure, so that the surroundings, at higher pressure, compress the region again. If however κT < 0, then the pressure would increase in the region and the volume element would continue to expand.
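The relations (3.6.47a,b) can be verified numerically for a simple case. The sketch below is an illustration, not part of the original derivation: it assumes a monatomic ideal gas, for which the $(E, V)$-dependent part of the entropy is $S = Nk\,[\frac{3}{2}\log E + \log V]$, sets $k = 1$, and approximates the second derivatives by central finite differences. All function names and parameter values are hypothetical.

```python
import math

N, K = 2.0, 1.0  # particle number and Boltzmann constant (arbitrary units)


def entropy(E, V):
    """(E, V)-dependent part of the entropy of a monatomic ideal gas."""
    return N * K * (1.5 * math.log(E) + math.log(V))


def second_derivatives(f, E, V, h=1e-3):
    """Central-difference estimates of S_EE, S_VV and S_EV."""
    s_ee = (f(E + h, V) - 2.0 * f(E, V) + f(E - h, V)) / h**2
    s_vv = (f(E, V + h) - 2.0 * f(E, V) + f(E, V - h)) / h**2
    s_ev = (f(E + h, V + h) - f(E + h, V - h)
            - f(E - h, V + h) + f(E - h, V - h)) / (4.0 * h**2)
    return s_ee, s_vv, s_ev


E, V = 3.0, 4.0
T = 2.0 * E / (3.0 * N * K)   # from E = (3/2) N k T
C_V = 1.5 * N * K             # heat capacity at constant volume
kappa_T = V / (N * K * T)     # isothermal compressibility, 1/P for an ideal gas

s_ee, s_vv, s_ev = second_derivatives(entropy, E, V)
print("S_EE:", s_ee, " expected:", -1.0 / (T**2 * C_V))
print("det :", s_ee * s_vv - s_ev**2, " expected:", 1.0 / (T**3 * V * kappa_T * C_V))
```

Both conditions (3.6.46a) and (3.6.46b) are satisfied here, as they must be, since $C_V$ and $\kappa_T$ of the ideal gas are positive.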


3.7 Cyclic Processes The analysis of cyclic processes played an important role in the historical development of thermodynamics and in the discovery of the Second Law of thermodynamics. Even today, their understanding is interesting in principle and in addition, it has eminent practical signiﬁcance. Thermodynamics makes statements concerning the eﬃciency of cyclic processes (periodically repeating processes) of the most general kind, which are of importance both for heat engines and thus for the energy economy, as well as for the energy balance of biological systems. 3.7.1 General Considerations In cyclic processes, the working substance, i.e. the system, returns at intervals to its initial state (after each cycle). For practical reasons, in the steam engine, and in the internal combustion engine, the working substance is replenished after each cycle. We assume that the process takes place quasistatically; thus, we can characterize the state of the system by two thermodynamic variables, e.g. P and V or T and S. The process can be represented as a closed curve in the P -V or the T -S plane (Fig. 3.20).

Fig. 3.20. A cyclic process: (a) in the P -V diagram; (b) in the T -S diagram

The work which is performed on the surroundings during one cycle, i.e. the negative of the work performed on the system, is given by the line integral along the closed curve

$$ W = \oint P\,dV = A , \tag{3.7.1} $$

which is equal to the enclosed area $A$ within the curve representing the cyclic process in the P-V diagram. The heat taken up during one cycle is given by

$$ Q = \oint T\,dS = A . \tag{3.7.2} $$

Since the system returns to its initial state after a cycle, and thus in particular the internal energy of the working substance is unchanged, it follows from the principle of conservation of energy that

$$ Q = W . \tag{3.7.3} $$


The heat taken up is equal to the work performed on the surroundings. The direction of the cyclic path and the area in the P-V and T-S diagrams are thus the same. When the cyclic process runs in a clockwise direction (right-handed process), then

$$ Q = W > 0 \tag{3.7.4a} $$

and one refers to a work engine. In the case that the process runs counterclockwise (left-handed process), we have $Q = W < 0$ […] $T_2 > T_1$, and in between it is insulated. The motion of the piston is shown in Fig. 3.22.

Fig. 3.21. A Carnot cycle in (a) the P -V diagram and (b) the T -S diagram

Fig. 3.22. The sequence of the Carnot cycle


1. Isothermal expansion: the system is brought into contact with the warmer heat bath at the temperature $T_2$. The quantity of heat

$$Q_2 = T_2 (S_2 - S_1) \tag{3.7.5a}$$

is taken up from the bath, while at the same time, work is performed on the surroundings.
2. Adiabatic expansion: the system is thermally insulated. Through an adiabatic expansion, work is performed on the outer world and the working substance cools from $T_2$ to the temperature $T_1$.
3. Isothermal compression: the working substance is brought into thermal contact with the heat bath at temperature $T_1$ and, through work performed on it by the surroundings, it is compressed. The quantity of heat “taken up” by the working substance

$$Q_1 = T_1 (S_1 - S_2) < 0 \tag{3.7.5b}$$

is negative. That is, the quantity $|Q_1|$ of heat is given up to the heat bath.
4. Adiabatic compression: employing work performed by the outside world, the now once again thermally insulated working substance is compressed and its temperature is thereby increased to $T_2$.

After each cycle, the internal energy remains the same; therefore, the total work performed on the surroundings is equal to the quantity of heat taken up by the system, $Q = Q_1 + Q_2$; thus

$$\mathcal{W} = Q = (T_2 - T_1)(S_2 - S_1) . \tag{3.7.5c}$$

The thermal efficiency (= work performed/heat taken up from the warmer heat bath) is defined as

$$\eta = \frac{\mathcal{W}}{Q_2} . \tag{3.7.6a}$$

For the Carnot machine, we obtain

$$\eta_C = 1 - \frac{T_1}{T_2} , \tag{3.7.6b}$$

where the index C stands for Carnot. We see that $\eta_C < 1$. The general validity of (3.7.6a) cannot be too strongly emphasized; it holds for any kind of working substance. Later, we shall show that there is no cyclic process whose efficiency is greater than that of the Carnot cycle.

The Inverse Carnot Cycle

Now, we consider the inverse Carnot cycle, in which the direction of the operations is counter-clockwise (Fig. 3.23). In this case, for the quantities of heat taken up from baths 2 and 1, we find


Fig. 3.23. The inverse Carnot cycle

$$Q_2 = T_2 (S_1 - S_2) < 0 , \qquad Q_1 = T_1 (S_2 - S_1) > 0 . \tag{3.7.7a,b}$$

The overall quantity of heat taken up by the system, Q, and the work performed on the system, W, are then given by

$$Q = (T_1 - T_2)(S_2 - S_1) = -W < 0 . \tag{3.7.8}$$

Work is performed by the outside world on the system. The warmer reservoir is heated further, and the cooler one is cooled. Depending on whether the purpose of the machine is to heat the warmer reservoir or to cool the colder one, one defines the heating efficiency or the cooling efficiency. The heating efficiency (= the heat transferred to bath 2/work performed) is

$$\eta_C^H = \frac{-Q_2}{W} = \frac{T_2}{T_2 - T_1} > 1 . \tag{3.7.9}$$

Since $\eta_C^H > 1$, this represents a more efficient method of heating than the direct conversion of electrical energy or another source of work into heat (this type of machine is called a heat pump). The formula, however, also shows that the use of heat pumps is reasonable only as long as $T_2 \approx T_1$; when the temperature of the heat bath (e.g. the Arctic Ocean) $T_1 \ll T_2$, it follows that $|Q_2| \approx |W|$, i.e. it would be just as effective to convert the work directly into heat. The cooling efficiency (= the quantity of heat removed from the cooler reservoir/work performed) is

$$\eta_C^K = \frac{Q_1}{W} = \frac{T_1}{T_2 - T_1} . \tag{3.7.10}$$
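The three Carnot figures of merit (3.7.6b), (3.7.9) and (3.7.10) are simple functions of the two bath temperatures. The sketch below evaluates them for an assumed pair of temperatures (0 °C and 20 °C, chosen only for illustration); the function name is hypothetical.

```python
def carnot_efficiencies(t_cold, t_hot):
    """Return (eta_C, eta_H, eta_K) for bath temperatures t_cold < t_hot in kelvin."""
    if not 0.0 < t_cold < t_hot:
        raise ValueError("require 0 < t_cold < t_hot")
    eta_c = 1.0 - t_cold / t_hot          # thermal efficiency, (3.7.6b)
    eta_h = t_hot / (t_hot - t_cold)      # heating efficiency, (3.7.9)
    eta_k = t_cold / (t_hot - t_cold)     # cooling efficiency, (3.7.10)
    return eta_c, eta_h, eta_k


# Assumed example: a heat pump working between 0 degrees C and 20 degrees C.
eta_c, eta_h, eta_k = carnot_efficiencies(273.15, 293.15)
print(f"eta_C = {eta_c:.4f}, eta_H = {eta_h:.2f}, eta_K = {eta_k:.2f}")
```

Two identities follow directly from the formulas and serve as a consistency check: $\eta_C^H = 1/\eta_C$ and $\eta_C^H - \eta_C^K = 1$. For nearly equal temperatures the heating efficiency is large, in line with the remark that heat pumps are reasonable only for $T_2 \approx T_1$.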

For large-scale technical cooling applications, it is expedient to carry out the cooling process in several steps, i.e. as a cascade.

3.7.3 General Cyclic Processes

We now take up a general cyclic process (Fig. 3.24), in which heat exchange with the surroundings can take place at different temperatures, not necessarily only at the maximum and minimum temperature. We shall show that the efficiency η obeys the inequality

$$\eta \le \eta_C , \tag{3.7.11}$$

where $\eta_C$ is the efficiency of a Carnot cycle operating between the two extreme temperatures. We decompose the process into sections with heat uptake (δQ > 0) and heat output (δQ < 0), and also allow irreversible processes to take place:

$$\mathcal{W} = Q = \oint \delta Q = \int_{\delta Q > 0} \delta Q + \int_{\delta Q < 0} \delta Q = Q_2 + Q_1 . \tag{3.7.12}$$

Fig. 3.24. The general cyclic process

Fig. 3.25. The idealized (full curve) and real (dashed curve) sequence of the Carnot cycle

[...] In this connection we recall the process of isobaric heating discussed in Sect. 3.8.1. In the coexistence region, the temperature T remains constant, since the heat put into the system is consumed by the phase transition. From (3.8.10), $Q_L = T \Delta S > 0$, it follows that $\Delta S > 0$. This can also be read off Fig. 3.34b, whose general form results from the concavity of G and $(\partial G/\partial T)_P = -S < 0$.

3.8.2.2 Example Applications of the Clausius–Clapeyron Equation

We now wish to give some interesting examples of the application of the Clausius–Clapeyron equation.

(i) Liquid → gaseous: since, according to the previous considerations, $\Delta S > 0$ and the specific volume of the gas is larger than that of the liquid, $\Delta V > 0$, it follows that $dP/dT > 0$, i.e. the boiling temperature increases with increasing pressure (Table I.5 and Figs. 3.28(b) and 3.29(b)). Table I.6 contains the heats of vaporization of some substances at their boiling points under standard pressure, i.e. 760 Torr. Note the high value for water.

(ii) Solid → liquid: in the transition to the high-temperature phase, we always have $\Delta S > 0$. Usually, $\Delta V > 0$; then it follows that $dP/dT > 0$. In the case of water, $\Delta V < 0$ and thus $dP/dT < 0$. The fact that ice floats on water implies via the Clausius–Clapeyron equation that its melting point decreases on increasing the pressure (Fig. 3.29).

Note: There are a few other substances which, like water, contract on melting, e.g. bismuth. The large volume increase of water on freezing (9.1%) is related to the open structure of ice, containing voids (the bonding is due to the formation of hydrogen bonds between the oxygen atoms, cf. Fig. 3.30). Therefore, the liquid phase is more dense. Below 4 °C, i.e. still above the melting point $T_m$, the density of water begins to decrease on cooling (water anomaly), since local ordering occurs already at temperatures above $T_m$.

While as a rule a solid material sinks within its own liquid phase (melt), ice floats on water, in such a way that about 9/10 of the ice is under the surface of the water. This fact, together with the density anomaly of water, plays a very important role in Nature and is fundamental for the existence of life on the Earth. The volume change upon melting of ice is $V_L - V_S = (1.00 - 1.091)\ \mathrm{cm^3/g} = -0.091\ \mathrm{cm^3/g}$. The latent heat of melting per g is $Q = 80\ \mathrm{cal/g} = 80 \times 42.7\ \mathrm{atm\,cm^3/g}$. From this, it follows that the slope of the melting curve of ice near 0 °C is

$$\frac{dP}{dT} = -\frac{80 \times 42.7}{273 \times 0.091}\,\frac{\mathrm{atm}}{\mathrm{K}} = -138\ \mathrm{atm/K} . \tag{3.8.12}$$
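The arithmetic behind (3.8.12) can be reproduced in a few lines. This is a minimal sketch that uses the Clausius–Clapeyron equation in the form $dP/dT = Q_L/(T\,\Delta V)$ and only the numbers quoted above; the conversion factor 42.7 atm cm³ per cal is taken from the text as given, and the function and variable names are illustrative.

```python
def clausius_clapeyron_slope(latent_heat, temperature, delta_v):
    """Slope dP/dT = Q_L / (T * Delta V) of a coexistence curve."""
    return latent_heat / (temperature * delta_v)


CAL_TO_ATM_CM3 = 42.7              # conversion factor as quoted in the text
q_melt = 80.0 * CAL_TO_ATM_CM3     # latent heat of melting, atm cm^3 per gram
delta_v = 1.00 - 1.091             # V_L - V_S in cm^3 per gram (negative for water)
t_melt = 273.0                     # melting temperature in kelvin

slope = clausius_clapeyron_slope(q_melt, t_melt, delta_v)
print(f"dP/dT = {slope:.1f} atm/K")
```

The result is negative because $\Delta V < 0$ for melting ice, and its magnitude agrees with the rounded value of 138 atm/K.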

3.8 Phases of Single-Component Systems


Fig. 3.30. The hexagonal structure of ice. The oxygen atoms are shown; they are connected to four neighbors via hydrogen bonds

The melting curve as a function of the temperature is very steep. It requires a pressure increase of 138 atm to lower the melting temperature by 1 K. This “freezing-point depression”, small as it is, enters into a number of phenomena in daily life. If a piece of ice at somewhat below 0 °C is placed under increased pressure, it at first begins to melt. The necessary heat of melting is taken from the ice itself, and it therefore cools to a somewhat lower temperature, so that the melting process is interrupted as long as no more heat enters the ice from its surroundings. This is the so-called regelation of ice (= the alternating melting and freezing of ice caused by changes in its temperature and pressure). Pressing together snow, which consists of ice crystals, to make a snowball causes the snow to melt to a small extent due to the increased pressure. When the pressure is released, it freezes again, and the snow crystals are glued together. The slickness of ice is essentially due to the fact that it melts at places where it is under pressure, so that between a sliding object and the surface of the ice there is a thin layer of liquid water, which acts like a lubricant, explaining e.g. the gliding motion of an ice skater. Part of the plasticity of glacial ice and its slow motion, like that of a viscous liquid, are also due to regelation of the ice. The lower portions of the glacier become movable as a result of the pressure from the weight of the ice above, but they freeze again when the pressure is released.

(iii) ³He, liquid → solid: the phase diagram of ³He is shown schematically in Fig. 3.31. At low temperatures, there is an interval where the melting curve falls. In this region, in the transition from liquid to solid (see the arrow in Fig. 3.31a), $dP/dT < 0$; furthermore, it is found experimentally that the volume of the solid phase is smaller than that of the liquid (as is the usual case), $\Delta V < 0$. We thus find from the Clausius–Clapeyron equation (3.8.8) $\Delta S > 0$, as expected from the general considerations in Remark (ii).

The Pomeranchuk effect: The fact that within the temperature interval mentioned above, the entropy increases on solidification is called the Pomeranchuk effect. It is employed for the purpose of reaching low temperatures (see Fig. 3.31b). Compression (dashed line) of liquid ³He leads to its solidification and, because of $\Delta S > 0$, to the uptake of heat. This causes a decrease in the temperature of the substance. Compression therefore causes the phase transition to proceed along the melting curve (see arrow in Fig. 3.31b).


Fig. 3.31. The phase diagram of ³He. (a) Isobaric solidification in the range where $dP/dT < 0$. (b) Pomeranchuk effect

This effect can be used to cool ³He; with it, temperatures down to $2 \times 10^{-3}$ K can be attained. The Pomeranchuk effect, however, has nearly no practical significance in low-temperature physics today. The currently most important methods for obtaining low temperatures are ³He-⁴He dilution ($2 \times 10^{-3}$ to $5 \times 10^{-3}$ K) and adiabatic demagnetization of copper nuclei ($1.5 \times 10^{-6}$ to $12 \times 10^{-6}$ K), where the temperatures obtained are shown in parentheses.

(iv) The sublimation curve: We consider a solid (1), which is in equilibrium with a classical, ideal gas (2). For the volumes of the two phases, we have $V_1 \ll V_2$; then it follows from the Clausius–Clapeyron equation (3.8.11) that

$$\frac{dP}{dT} = \frac{Q_L}{T V_2} ,$$

where $Q_L$ represents the latent heat of sublimation. For $V_2$, we insert the ideal gas equation,

$$\frac{dP}{dT} = \frac{Q_L P}{k N T^2} . \tag{3.8.13}$$

This differential equation can be immediately integrated under the assumption that $Q_L$ is independent of temperature:

$$P = P_0\, e^{-q/kT} , \tag{3.8.14}$$

where $q = Q_L/N$ is the heat of sublimation per particle. Equation (3.8.14) yields the shape of the sublimation curve under the assumptions used. The vapor pressure of most solid materials is rather small, and in fact in most cases, no observable decrease with time in the amount of these substances due to evaporation is detected. Only a very few solid materials exhibit a readily observable sublimation and have as a result a noticeable vapor pressure, which increases with increasing temperature; among them are some solid perfume substances. Numerical values for the vapor pressure over ice and iodine are given in Tables I.8 and I.9.
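That (3.8.14) solves the differential equation (3.8.13) can be verified numerically. The sketch below integrates $dP/dT = (q/k)\,P/T^2$ with a classical fourth-order Runge–Kutta scheme and compares the result with the closed form. The value of $q/k$ and the temperature interval are assumptions chosen purely for illustration, and the pressure is measured in units of its value at the starting temperature.

```python
import math


def sublimation_pressure(t, p_ref, t_ref, q_over_k):
    """Closed form (3.8.14), normalized so that P(t_ref) = p_ref."""
    return p_ref * math.exp(-q_over_k * (1.0 / t - 1.0 / t_ref))


def integrate_rk4(t_start, t_end, p_start, q_over_k, steps=1000):
    """Integrate dP/dT = (q/k) P / T^2, i.e. (3.8.13) per particle, with RK4."""
    def rhs(t, p):
        return q_over_k * p / (t * t)

    h = (t_end - t_start) / steps
    t, p = t_start, p_start
    for _ in range(steps):
        k1 = rhs(t, p)
        k2 = rhs(t + 0.5 * h, p + 0.5 * h * k1)
        k3 = rhs(t + 0.5 * h, p + 0.5 * h * k2)
        k4 = rhs(t + h, p + h * k3)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return p


Q_OVER_K = 6000.0   # assumed heat of sublimation per particle divided by k, in kelvin
p_numeric = integrate_rk4(250.0, 270.0, 1.0, Q_OVER_K)
p_exact = sublimation_pressure(270.0, 1.0, 250.0, Q_OVER_K)
print(p_numeric, p_exact)
```

The normalization $P(T_{\rm ref}) = P_{\rm ref}$ corresponds to the choice $P_0 = P_{\rm ref}\, e^{q/kT_{\rm ref}}$ in (3.8.14). The strong increase of the vapor pressure over a modest temperature interval reflects the exponential dependence on $1/T$.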


At temperatures well below 0 °C and in dry air, one can observe a gradual disappearance of snow, which is converted directly into water vapor by sublimation. The reverse phenomenon is the direct formation of frost from water vapor in the air, or the condensation of snow crystals in the cool upper layers of the atmosphere. If iodine crystals are introduced into an evacuated glass vessel and a spot on the glass wall is cooled, then solid iodine condenses from the iodine vapor which forms in the vessel. Iodine crystals which are left standing in the open air, naphthalene crystals (“moth balls”), and certain mercury salts, including “sublimate” (HgCl₂), among others, gradually vanish due to sublimation.

3.8.3 The Convexity of the Free Energy and the Concavity of the Free Enthalpy (Gibbs’ Free Energy)

We now return again to the gas-liquid transition, in order to discuss some additional aspects of evaporation and the curvature of the thermodynamic potentials. The coexistence region and the coexistence curve are clearly visible in the T-V diagram. Instead, one often uses a P-V diagram. From the projection of the three-dimensional P-V-T diagram, we can see the shape drawn in Fig. 3.32. From the shape of the isotherms in the P-V diagram, the free energy can be determined analytically and graphically. Owing to $(\partial F/\partial V)_T = -P$, it follows for the free energy that

$$F(T, V) - F(T, V_0) = -\int_{V_0}^{V} dV'\, P_T(V') . \tag{3.8.15}$$

Fig. 3.32. The isotherms $P_T(V)$ and the free energy as a function of the volume during evaporation; the thin line is the coexistence curve

Fig. 3.33. The determination of the free enthalpy from the free energy by construction

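One immediately sees that the isotherms in Fig. 3.32 lead qualitatively to the volume dependence of the free energy which is drawn below. The free energy is convex (curved upwards). The fundamental cause of this is the fact that the compressibility is positive:

$$\left(\frac{\partial^2 F}{\partial V^2}\right)_T = -\left(\frac{\partial P}{\partial V}\right)_T \propto \frac{1}{\kappa_T} > 0 , \quad \text{while} \quad \left(\frac{\partial^2 F}{\partial T^2}\right)_V = -\left(\frac{\partial S}{\partial T}\right)_V \propto -C_V < 0 . \tag{3.8.16}$$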

These inequalities are based upon the stability relations proved previously, (3.3.5, 3.3.6), and (3.6.48a,b). The free enthalpy or Gibbs’ free energy $G(T, P) = F + PV$ can be constructed from $F(T, V)$. Due to $P = -(\partial F/\partial V)_T$, $G(T, P)$ is obtained from $F(T, V)$ by constructing a tangent to $F(T, V)$ with the slope $-P$ (see Fig. 3.33). The intersection of this tangent with the ordinate has the coordinates

$$F(T, V) - V \left(\frac{\partial F}{\partial V}\right)_T = F + VP = G(T, P) . \tag{3.8.17}$$

The result of this construction is drawn in Fig. 3.34. The derivatives of the free enthalpy

$$\left(\frac{\partial G}{\partial P}\right)_T = V \quad \text{and} \quad \left(\frac{\partial G}{\partial T}\right)_P = -S$$

yield the volume and the entropy. They are discontinuous at a phase transition, which results in a kink in the curves. Here, $P_0(T)$ is the evaporation


Fig. 3.34. The free enthalpy (Gibbs’ free energy) as a function of (a) the pressure and (b) the temperature.

pressure at the temperature T , and T0 (P ) is the evaporation temperature at the pressure P . From this construction, one can also see that the free enthalpy is concave (Fig. 3.34). The curvatures are negative because κT > 0 and CP > 0. The signs of the slopes result from V > 0 and S > 0. It is also readily seen from the ﬁgures that the entropy increases as a result of a transition to a higher-temperature phase, and the volume decreases as a result of a transition to a higher-pressure phase. These consequences of the stability conditions hold quite generally. In the diagrams (3.34a,b), the terms gas and liquid phases could be replaced by low-pressure and high-pressure or high-temperature and low-temperature phases. On melting, the latent heat must be added to the system, on freezing (solidifying), it must be removed. When heat is put into or taken out of a system at constant pressure, it is employed to convert the solid phase to the liquid or vice versa. In the coexistence region, the temperature remains constant during these processes. This is the reason why in late Autumn and early Spring the temperature near the Earth remains close to zero degrees Celsius, the freezing point of water.
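The construction of F from the isotherms, (3.8.15), and of G from F, (3.8.17), can be illustrated for the simplest case of a single homogeneous phase. The following sketch assumes an ideal-gas isotherm $P_T(V) = NkT/V$ in units with $NkT = 1$ (an assumption made only for this example, without a coexistence region); all function names are hypothetical.

```python
import math


def pressure(volume, nkt=1.0):
    """Ideal-gas isotherm P_T(V) = N k T / V, in units where N k T = nkt."""
    return nkt / volume


def free_energy_difference(volume, v0, steps=20000):
    """F(T, V) - F(T, V0) from (3.8.15), using the trapezoidal rule."""
    h = (volume - v0) / steps
    total = 0.5 * (pressure(v0) + pressure(volume))
    for i in range(1, steps):
        total += pressure(v0 + i * h)
    return -total * h


def free_enthalpy(volume, v0):
    """Tangent construction (3.8.17): G = F - V (dF/dV)_T = F + P V."""
    return free_energy_difference(volume, v0) + pressure(volume) * volume


V0 = 1.0
f1, f2, f3 = (free_energy_difference(v, V0) for v in (1.0, 2.0, 3.0))
curvature = f1 - 2.0 * f2 + f3   # discrete second derivative; positive means convex
g2 = free_enthalpy(2.0, V0)      # measured relative to F(T, V0)
print(f2, -math.log(2.0), curvature, g2)
```

For this isotherm the integral is known in closed form, $F(T,V) - F(T,V_0) = -NkT \ln(V/V_0)$, so the numerical value at $V = 2V_0$ can be compared with $-\ln 2$. The positive discrete curvature is the convexity expressed by the first inequality in (3.8.16).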

3.8.4 The Triple Point

At the triple point (Figs. 3.26 and 3.35), the solid, liquid and gas phases coexist in equilibrium. The condition for equilibrium of the gaseous, liquid and solid phases, or more generally for three phases 1, 2 and 3, is:

$$\mu_1(T, P) = \mu_2(T, P) = \mu_3(T, P) , \tag{3.8.18}$$

and it determines the triple point pressure and the triple point temperature Pt , Tt . In the P -T diagram, the triple point is in fact a single point. In the T -V diagram it is represented by the horizontal line drawn in Fig. 3.35b. Along this line, the three phases are in equilibrium. If the phase diagram is


Fig. 3.35. The triple point (a) in a P -T diagram (the phases are denoted by 1, 2, 3. The coexistence regions are marked as 3-2 etc., i.e. denoting the coexistence of phase 3 and phase 2 on the two branches of the coexistence curve.); (b) in a T -v diagram; and (c) in a v-s diagram

represented in terms of two extensive variables, such as e.g. by V and S as in Fig. 3.35c, then the triple point becomes a triangular area as is visible in the figure. At each point on this triangle, the states of the three phases 1, 2, and 3 corresponding to the vertices of the triangle coexist with one another. We now want to describe this more precisely. Let $s_1$, $s_2$ and $s_3$ be the entropies per particle in the phases 1, 2 and 3 just at the triple point, $s_i = -\left(\frac{\partial \mu_i}{\partial T}\right)_P \Big|_{T_t, P_t}$, and correspondingly, $v_1$, $v_2$, $v_3$ are the specific volumes $v_i = \left(\frac{\partial \mu_i}{\partial P}\right)_T \Big|_{T_t, P_t}$. The points $(s_i, v_i)$ are shown in the s-v diagram as points 1, 2, 3. Clearly, every pair of phases can coexist with each other; the lines connecting the points 1 and 2 etc. yield the triangle with vertices 1, 2, and 3. The coexistence curves of two phases, e.g. 1 and 2, are found in the s-v diagram from $s_i(T) = -\left(\frac{\partial \mu_i}{\partial T}\right)_P \Big|_{P_0(T)}$ and $v_i(T) = \left(\frac{\partial \mu_i}{\partial P}\right)_T \Big|_{P_0(T)}$ with i = 1 and 2 along with the associated phase-boundary curve $P = P_0(T)$. Here, the temperature is a parameter; points on the two branches of the coexistence curves with the same value of T can coexist with each other. The diagram in Fig. 3.35c is only schematic. The (by no means parallel) lines within the two-phase coexistence areas show which of the pairs of single-component states can coexist with each other on the two branches of the coexistence line.

Now we turn to the interior of the triangular area in Fig. 3.35c. It is immediately clear that the three triple-point phases 1, 2, 3 can coexist with each other at the temperature $T_t$ and pressure $P_t$ in arbitrary quantities. This also means that a given amount of the substance can be distributed among these three phases in arbitrary fractions $c_1$, $c_2$, $c_3$ ($0 \le c_i \le 1$)

$$c_1 + c_2 + c_3 = 1 , \tag{3.8.19a}$$

and then will have the total specific entropy

$$c_1 s_1 + c_2 s_2 + c_3 s_3 = s \tag{3.8.19b}$$

and the total specific volume

$$c_1 v_1 + c_2 v_2 + c_3 v_3 = v . \tag{3.8.19c}$$

From (3.8.19a,b,c), it follows that s and v lie within the triangle in Fig. 3.35c. Conversely, every (heterogeneous) equilibrium state with the total specific entropy s and specific volume v can exist within the triangle, where $c_1$, $c_2$, $c_3$ follow from (3.8.19a–c). Eqns. (3.8.19a–c) can be interpreted by the following center-of-gravity rule: let a point (s, v) within the triangle in the v-s diagram (see Fig. 3.35c) be given. The fractions $c_1$, $c_2$, $c_3$ must be chosen in such a way that attributing masses $c_1$, $c_2$, $c_3$ to the vertices 1, 2, 3 of the triangle leads to a center of gravity at the position (s, v). This can be immediately understood if one writes (3.8.19b,c) in the two-component form:

$$c_1 \binom{v_1}{s_1} + c_2 \binom{v_2}{s_2} + c_3 \binom{v_3}{s_3} = \binom{v}{s} . \tag{3.8.20}$$

Remarks:

(i) Apart from the center-of-gravity rule, the linear equations can be solved algebraically:

$$c_1 = \frac{\begin{vmatrix} 1 & 1 & 1 \\ s & s_2 & s_3 \\ v & v_2 & v_3 \end{vmatrix}}{\begin{vmatrix} 1 & 1 & 1 \\ s_1 & s_2 & s_3 \\ v_1 & v_2 & v_3 \end{vmatrix}} , \quad c_2 = \frac{\begin{vmatrix} 1 & 1 & 1 \\ s_1 & s & s_3 \\ v_1 & v & v_3 \end{vmatrix}}{\begin{vmatrix} 1 & 1 & 1 \\ s_1 & s_2 & s_3 \\ v_1 & v_2 & v_3 \end{vmatrix}} , \quad c_3 = \frac{\begin{vmatrix} 1 & 1 & 1 \\ s_1 & s_2 & s \\ v_1 & v_2 & v \end{vmatrix}}{\begin{vmatrix} 1 & 1 & 1 \\ s_1 & s_2 & s_3 \\ v_1 & v_2 & v_3 \end{vmatrix}} .$$

(ii) Making use of the triple point gives a precise standard for a temperature and a pressure, since the coexistence of the three phases can be verified without a doubt. From Fig. 3.35c, it can also be seen that the triple point is not a point as a function of the experimentally controllable parameters, but rather the whole area of the triangle. The parameters which can be directly varied from outside the system are not P and T, but rather the volume V and the entropy S, which can be varied by performing work on the system or by transferring heat to it. If heat is put into the system at the point marked by a cross (Fig. 3.35c), then in the example of water, some ice would melt, but the state would still remain within the triangle. This explains why the triple point is insensitive to changes within wide limits and is therefore very suitable as a temperature fixed point.

(iii) For water, $T_t = 273.16$ K and $P_t = 4.58$ Torr. As explained in Sect. 3.4, the absolute temperature scale is determined by the triple point of water. In order to reach the triple point, one simply needs to distill highly pure water


Fig. 3.36. A triple-point cell: ice, water, and water vapor are in equilibrium with each other. A freezing mixture in contact with the inner walls causes some water to freeze there. It is then replaced by the thermometer bulb, and a ﬁlm of liquid water forms on the inner wall

into a container and to seal it oﬀ after removing all the air. One then has water and water vapor in coexistence (coexistence region 1-2 in Fig. 3.35c). Removing heat by means of a freezing mixture brings the system into the triple-point range. As long as all three phases are present, the temperature equals Tt (see Fig. 3.36).
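The center-of-gravity rule (3.8.19a–c) amounts to solving three linear equations for the phase fractions, which Remark (i) does by determinants. The sketch below implements that solution in plain Python. The triangle vertices used in the example are hypothetical numbers in arbitrary units, not data for any real substance, and the function names are illustrative.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def phase_fractions(s, v, vertices):
    """Solve (3.8.19a-c) for (c1, c2, c3) by Cramer's rule.

    `vertices` holds the triple-point values [(s1, v1), (s2, v2), (s3, v3)].
    """
    (s1, v1), (s2, v2), (s3, v3) = vertices
    denom = det3([[1, 1, 1], [s1, s2, s3], [v1, v2, v3]])
    if denom == 0:
        raise ValueError("the three vertices are collinear")
    c1 = det3([[1, 1, 1], [s, s2, s3], [v, v2, v3]]) / denom
    c2 = det3([[1, 1, 1], [s1, s, s3], [v1, v, v3]]) / denom
    c3 = det3([[1, 1, 1], [s1, s2, s], [v1, v2, v]]) / denom
    return c1, c2, c3


def inside_triangle(fractions):
    """A state lies in the three-phase triangle if all fractions are in [0, 1]."""
    return all(0.0 <= c <= 1.0 for c in fractions)


# Hypothetical triangle in the s-v plane (arbitrary units).
triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
fractions = phase_fractions(0.25, 0.25, triangle)
print(fractions, inside_triangle(fractions))
```

A point outside the triangle yields at least one fraction outside the interval [0, 1], which signals that the state is not a three-phase equilibrium state.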

3.9 Equilibrium in Multicomponent Systems

3.9.1 Generalization of the Thermodynamic Potentials

We consider a homogeneous mixture of n materials, or as one says in this connection, components, whose particle numbers are $N_1, N_2, \ldots, N_n$. We first need to generalize the thermodynamic relations to this situation. To this end, we refer to Chap. 2. Now, the phase-space volume and similarly the entropy are functions of the energy, the volume, and all of the particle numbers:

$$S = S(E, V, N_1, \ldots, N_n) . \tag{3.9.1}$$

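All the thermodynamic relations can be generalized to this case by replacing N and µ by $N_i$ and $\mu_i$ and summing over i. We define the chemical potential of the ith material by

$$\mu_i = -T \left(\frac{\partial S}{\partial N_i}\right)_{E,V,\{N_{k \neq i}\}} \tag{3.9.2a}$$

and, as before,

$$\frac{1}{T} = \left(\frac{\partial S}{\partial E}\right)_{V,\{N_k\}} \quad \text{and} \quad \frac{P}{T} = \left(\frac{\partial S}{\partial V}\right)_{E,\{N_k\}} . \tag{3.9.2b,c}$$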


Then for the differential of the entropy, we find

$$dS = \frac{1}{T}\,dE + \frac{P}{T}\,dV - \sum_{i=1}^{n} \frac{\mu_i}{T}\,dN_i , \tag{3.9.3}$$

and from it the First Law

$$dE = T\,dS - P\,dV + \sum_{i=1}^{n} \mu_i\,dN_i \tag{3.9.4}$$

for this mixture. The Gibbs–Duhem relation for homogeneous mixtures reads

$$E = TS - PV + \sum_{i=1}^{n} \mu_i N_i . \tag{3.9.5}$$

It is obtained analogously to Sect. 3.1.3, by differentiating

$$\alpha E = E(\alpha S, \alpha V, \alpha N_1, \ldots, \alpha N_n) \tag{3.9.6}$$

with respect to α. From (3.9.4) and (3.9.5), we find the differential form of the Gibbs–Duhem relation for mixtures

$$-S\,dT + V\,dP - \sum_{i=1}^{n} N_i\,d\mu_i = 0 . \tag{3.9.7}$$
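The step from (3.9.4) and (3.9.5) to the differential form (3.9.7) can be spelled out as follows. This is a brief sketch that uses only the relations quoted above and assumes that all quantities are differentiable.

```latex
\begin{align*}
% Total differential of the Gibbs-Duhem relation (3.9.5):
dE &= T\,dS + S\,dT - P\,dV - V\,dP
      + \sum_{i=1}^{n} \left( \mu_i\,dN_i + N_i\,d\mu_i \right) , \\
% First Law (3.9.4):
dE &= T\,dS - P\,dV + \sum_{i=1}^{n} \mu_i\,dN_i . \\
% Subtracting the second line from the first:
0  &= S\,dT - V\,dP + \sum_{i=1}^{n} N_i\,d\mu_i .
\end{align*}
```

Multiplying the last line by −1 gives (3.9.7).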

It can be seen from this relation that of the n + 2 variables $(T, P, \mu_1, \ldots, \mu_n)$, only n + 1 are independent. The free enthalpy (Gibbs’ free energy) is defined by

$$G = E - TS + PV . \tag{3.9.8}$$

From the First Law, (3.9.4), we obtain its differential form:

$$dG = -S\,dT + V\,dP + \sum_{i=1}^{n} \mu_i\,dN_i . \tag{3.9.9}$$

From (3.9.9), we can read off

$$S = -\left(\frac{\partial G}{\partial T}\right)_{P,\{N_k\}} , \quad V = \left(\frac{\partial G}{\partial P}\right)_{T,\{N_k\}} , \quad \mu_i = \left(\frac{\partial G}{\partial N_i}\right)_{T,P,\{N_{k \neq i}\}} . \tag{3.9.10}$$

For homogeneous mixtures, using the Gibbs–Duhem relation (3.9.5) we find for the free enthalpy (3.9.8)

$$G = \sum_{i=1}^{n} \mu_i N_i . \tag{3.9.11}$$


Then we have

$$S = -\sum_{i=1}^{n} \left(\frac{\partial \mu_i}{\partial T}\right)_P N_i , \qquad V = \sum_{i=1}^{n} \left(\frac{\partial \mu_i}{\partial P}\right)_T N_i . \tag{3.9.12}$$

The chemical potentials are intensive quantities and therefore depend only on T, P and the n − 1 concentrations $c_1 = N_1/N, \ldots, c_{n-1} = N_{n-1}/N$ ($N = \sum_{i=1}^{n} N_i$, $c_n = 1 - c_1 - \ldots - c_{n-1}$). The grand canonical potential is defined by

$$\Phi = E - TS - \sum_{i=1}^{n} \mu_i N_i . \tag{3.9.13}$$

For its differential, we find using the First Law (3.9.4)

$$d\Phi = -S\,dT - P\,dV - \sum_{i=1}^{n} N_i\,d\mu_i . \tag{3.9.14}$$

For homogeneous mixtures, we obtain using the Gibbs–Duhem relation (3.9.5)

$$\Phi = -PV . \tag{3.9.15}$$

The density matrix for mixtures depends on the total Hamiltonian and will be introduced in Chap. 5.

3.9.2 Gibbs’ Phase Rule and Phase Equilibrium

We consider n chemically different materials (components), which can be in r phases (Fig. 3.37) and between which no chemical reactions are assumed to take place. The following equilibrium conditions hold: Temperature T and pressure P must have uniform values in the whole system. Furthermore, for each component i, the chemical potential must be the same in each of the phases. These equilibrium conditions can be derived directly by considering the microcanonical ensemble, or also from the stationarity of the entropy.

Fig. 3.37. Equilibrium between 3 phases


(i) As a first possibility, let us consider a microcanonical ensemble consisting of n chemical substances, and decompose it into r parts. Calculating the probability of a particular distribution of the energy, the volume and the particle numbers over these parts, one obtains for the most probable distribution the equality of the temperature, pressure and the chemical potentials of each component.

(ii) As a second possibility for deriving the equilibrium conditions, one can start from the maximization of the entropy in equilibrium, (3.6.36b)

$$dS \ge \frac{1}{T} \left( dE + P\,dV - \sum_{i=1}^{n} \mu_i\,dN_i \right) , \tag{3.9.16}$$

and can then employ the resulting stationarity of the equilibrium state for fixed E, V, and $\{N_i\}$,

$$\delta S = 0 \tag{3.9.17}$$

with respect to virtual variations. One can then proceed as in Sect. 3.6.5, decomposing a system into two parts 1 and 2, and varying not only the energy and the volume, but also the particle numbers [see (3.6.44)]:

$$\delta S = \left(\frac{\partial S_1}{\partial E_1} - \frac{\partial S_2}{\partial E_2}\right) \delta E_1 + \left(\frac{\partial S_1}{\partial V_1} - \frac{\partial S_2}{\partial V_2}\right) \delta V_1 + \sum_i \left(\frac{\partial S_1}{\partial N_{i,1}} - \frac{\partial S_2}{\partial N_{i,2}}\right) \delta N_{i,1} + \ldots . \tag{3.9.18}$$

Here, $N_{i,1}$ ($N_{i,2}$) is the particle number of component i in the subsystem 1 (2). From the condition of vanishing variation, the equality of the temperatures and pressures follows:

$$T_1 = T_2 , \qquad P_1 = P_2 \tag{3.9.19}$$

and furthermore $\partial S_1/\partial N_{i,1} = \partial S_2/\partial N_{i,2}$, i.e. the equality of the chemical potentials

$$\mu_{i,1} = \mu_{i,2} \quad \text{for } i = 1, \ldots, n . \tag{3.9.20}$$

We have thus now derived the equilibrium conditions formulated at the beginning of this section, and we wish to apply them to n chemical substances in r phases (Fig. 3.37). In particular, we want to find out how many phases can coexist in equilibrium. Along with the equality of temperature and pressure in the whole system, from (3.9.20) the chemical potentials must also be equal,

$$\mu_1^{(1)} = \ldots = \mu_1^{(r)} , \quad \ldots , \quad \mu_n^{(1)} = \ldots = \mu_n^{(r)} . \tag{3.9.21}$$


The upper index refers to the phases, and the lower one to the components. Equations (3.9.21) represent all together n(r − 1) conditions on the 2 + (n − 1)r variables $(T, P, c_1^{(1)}, \ldots, c_{n-1}^{(1)}, \ldots, c_1^{(r)}, \ldots, c_{n-1}^{(r)})$. The number of quantities which can be varied (i.e. the number of degrees of freedom) is therefore equal to f = 2 + (n − 1)r − n(r − 1):

$$f = 2 + n - r . \tag{3.9.22}$$
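The counting argument that leads to (3.9.22) can be written out explicitly. The sketch below counts variables and conditions separately, as in the derivation, and returns their difference; the function name is hypothetical.

```python
def degrees_of_freedom(n_components, n_phases):
    """Gibbs' phase rule: variables minus conditions, f = 2 + n - r."""
    if n_components < 1 or n_phases < 1:
        raise ValueError("need at least one component and one phase")
    variables = 2 + (n_components - 1) * n_phases   # T, P and the concentrations
    conditions = n_components * (n_phases - 1)      # equalities in (3.9.21)
    f = variables - conditions
    if f < 0:
        raise ValueError("more phases than can coexist in equilibrium")
    return f


# Single-component system: free T and P, coexistence curve, triple point.
print([degrees_of_freedom(1, r) for r in (1, 2, 3)])
# Two-component system with one to four coexisting phases.
print([degrees_of_freedom(2, r) for r in (1, 2, 3, 4)])
```

The requirement f ≥ 0 bounds the number of coexisting phases by r ≤ n + 2, so a single-component system can have at most three phases in equilibrium and a two-component system at most four.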

This relation (3.9.22) is called Gibbs’ phase rule. In this derivation we have assumed that each substance is present in all r phases. We can easily relax this assumption. If for example substance 1 is not present in phase 1, then the condition on $\mu_1^{(1)}$ does not apply. The particle number of component 1 then also no longer occurs as a variable in phase 1. One thus has one condition and one variable less than before, and Gibbs’ phase rule (3.9.22) still applies.¹²

Examples of Applications of Gibbs’ Phase Rule:

(i) For a single-component system, n = 1:

r = 1, f = 2: T, P free
r = 2, f = 1: $P = P_0(T)$, phase-boundary curve
r = 3, f = 0: fixed point: triple point.

(ii) An example for a two-component system, n = 2, is a mixture of sal ammoniac and water, NH₄Cl + H₂O. The possible phases are: water vapor (it contains practically no NH₄Cl), the liquid mixture (solution), ice (containing some of the salt), the salt (containing some H₂O). Possible coexisting phases are:

• liquid phase: r = 1, f = 3 (variables P, T, c)
• liquid phase + water vapor: r = 2, f = 2, variables P, T; the concentration is a function of P and T: c = c(P, T).
• liquid phase + water vapor + one solid phase: r = 3, f = 1. Only one variable, e.g. the temperature, is freely variable.
• liquid phase + vapor + ice + salt: r = 4, f = 0. This is the eutectic point.

The phase diagram of the liquid and the solid phases is shown in Fig. 3.38. At the concentration 0, the melting point of pure ice can be seen, and at c = 1, that of the pure salt. Since the freezing point of a solution is lowered (see Chap. 5), we can understand the shape of the two branches of the freezing-point curve as a function of the concentration. The two branches meet at the eutectic point. In the regions ice-liq., ice and liquid, and in liq.-salt, liquid and salt coexist along the horizontal lines. The concentration of NH₄Cl in the ice is considerably lower than in the liquid mixture which is in equilibrium with it. The solid phases often contain only the pure components; then the left-hand and the right-hand limiting lines are identical with the two vertical lines at c = 0 and c = 1. At the eutectic point, the liquid mixture is in equilibrium with the ice and with the salt. If the concentration of a liquid is less than that corresponding to the eutectic point, then ice forms on cooling the system. In this process, the concentration in the liquid increases until finally the eutectic concentration is reached, at which the liquid is converted to ice and salt. The resulting mixture of salt and ice crystals is called the eutectic. At the eutectic concentration, the liquid has its lowest freezing point.

¹² The number of degrees of freedom is a statement about the intensive variables; there are however also variations of the extensive variables. For example, at a triple point, f = 0, the entropy and the volume can vary within a triangle (Sect. 3.8.4).

Fig. 3.38. The phase diagram of a mixture of sal ammoniac (ammonium chloride) and water. In the horizontally shaded regions, ice and liquid, liquid and solid salt, and ﬁnally ice and solid salt coexist with each other.

The phase diagram in Fig. 3.38 for the liquid and solid phases and the corresponding interpretation using Gibbs’ phase rule can be applied to the following physical situations: (i) when the pressure is so low that also a gaseous phase (not shown) is present; (ii) without the gas phase at constant pressure P, in which case a degree of freedom is unavailable; or (iii) in the presence of air at the pressure P and vapor dissolved in it with the partial pressure cP.¹³ The concentration of the vapor c in the air enters the chemical potential as log cP (see Chap. 5). It adjusts itself in such a way that the chemical potential of the water vapor is equal to the chemical potential in the liquid mixture. It should be pointed out that owing to the term log c, the chemical potential of the vapor dissolved in the air is lower than that of the pure vapor. While at atmospheric pressure, boiling begins only at 100 °C, and then the whole liquid phase is converted to vapor, here, even at very low temperatures a sufficient amount enters the vapor phase to permit the log c term to bring about the equalization of the chemical potentials.

The action of freezing mixtures becomes clear from the phase diagram in Fig. 3.38. For example, if NaCl and ice at a temperature of 0 °C are brought together, then they are not in equilibrium. Some of the ice will melt, and the salt will dissolve in the resulting liquid water. Its concentration is, to be sure, much too high to be in equilibrium with the ice, so that more ice melts. In the melting process, heat is taken up, the entropy increases, and thus the temperature is lowered. This process continues until the temperature of the eutectic point has been reached. Then the ice, hydrated salt, NaCl·2H₂O, and liquid with the eutectic concentration are in equilibrium with each other. For NaCl and H₂O, the eutectic temperature is −21 °C. The resulting mixture is termed a freezing mixture. It can be used to hold the temperature constant at −21 °C. Uptake of heat does not lead to an increase of the temperature of the freezing mixture, but rather to continued melting of the ice and dissolution of NaCl at a constant temperature.

¹³ Gibbs’ phase rule is clearly still obeyed: compared to (ii), there is one component (air) more and also one more phase (air-vapor mixture) present.

Eutectic mixtures always occur when there is a miscibility gap between the two solid phases and the free energy of the liquid mixture is lower than that of the two solid phases (see problem 3.28). The melting point of the eutectic mixture is then considerably lower than the melting points of the two solid phases (see Table I.10).

3.9.3 Chemical Reactions, Thermodynamic Equilibrium and the Law of Mass Action

In this section we consider systems with several components, in which the particle numbers can change as a result of chemical reactions. We first determine the general condition for chemical equilibrium and then investigate mixtures of ideal gases.

3.9.3.1 The Condition for Chemical Equilibrium

Reaction equations, such as for example

$$ 2\mathrm{H}_2 + \mathrm{O}_2 \rightleftharpoons 2\mathrm{H}_2\mathrm{O} , \tag{3.9.23} $$

can in general be written in the form

$$ \sum_{j=1}^{n} \nu_j A_j = 0 , \tag{3.9.24} $$

where the $A_j$ are the chemical symbols and the stoichiometric coefficients $\nu_j$ are (small) integers, which indicate the participation of the components in the reaction. We adopt the convention that the coefficients of the substances on the left-hand side of the reaction equation are positive and those on the right-hand side negative. The reaction equation (3.9.24) contains neither any information about the concentrations at which the $A_j$ are present in thermodynamic and chemical equilibrium at a given temperature and pressure, nor about the direction in which the reaction will proceed. The change in the Gibbs free energy (≡ free enthalpy) with particle number at fixed temperature $T$ and fixed pressure $P$ for single-phase systems is[14]

$$ dG = \sum_{j=1}^{n} \mu_j \, dN_j . \tag{3.9.25} $$

[14] Chemical reactions in systems consisting of several phases are treated in M.W. Zemansky and R.H. Dittman, Heat and Thermodynamics, McGraw-Hill, Auckland, Sixth Edition, 1987.

In equilibrium, the $N_j$ must be determined in such a way that $G$ remains stationary,

$$ \sum_{j=1}^{n} \mu_j \, dN_j = 0 . \tag{3.9.26} $$

If an amount $dM$ participates in the reaction, then $dN_j = \nu_j \, dM$. The condition of stationarity then requires

$$ \sum_{j=1}^{n} \mu_j \nu_j = 0 . \tag{3.9.27} $$

For every chemical reaction that is possible in the system, a relation of this type holds. It suffices for a fundamental understanding to determine the chemical equilibrium for a single reaction. The chemical potentials $\mu_j(T, P)$ depend not only on the pressure and the temperature, but also on the relative particle numbers (concentrations). The latter adjust themselves in chemical equilibrium in such a way that (3.9.27) is fulfilled. If substances which can react chemically are in thermal equilibrium, but not in chemical equilibrium, then from the change in the Gibbs free energy,

$$ \delta G = \sum_j \mu_j(T, P)\, \nu_j \, \delta M , \tag{3.9.25'} $$

we can determine the direction which the reaction will take. Since $G$ is a minimum at equilibrium, we must have $\delta G \le 0$; cf. Eq. (3.6.38b). The chemical composition is shifted in the direction of smaller free enthalpy or lower chemical potentials.

Remarks:

(i) The condition for chemical equilibrium (3.9.27) can be interpreted to mean that the chemical potential of a compound is equal to the sum of the chemical potentials of its constituents.

(ii) The equilibrium condition (3.9.27) for the reaction (3.9.24) holds also when the system consists of several phases which are in contact with each other and between which the reactants can pass. This is shown by the equality of the chemical potential of each component in all of the phases which are in equilibrium with each other.

(iii) Eq. (3.9.27) can also be used to determine the equilibrium distribution of elementary particles which are transformed into one another by reactions.


For example, the distribution of electrons and positrons which are subject to pair annihilation, $e^- + e^+ \rightleftharpoons \gamma$, can be found (see problem 3.31). These applications of statistical mechanics are important in cosmology, in the description of the early stages of the Universe, and for the equilibria of elementary-particle reactions in stars.

3.9.3.2 Mixtures of Ideal Gases

To continue the evaluation of the equilibrium condition (3.9.27), we require information about the chemical potentials. In the following, we consider reactions in (classical) ideal gases. In Sect. 5.2, we show that the chemical potential of particles of type $j$ in a mixture of ideal molecular gases can be written in the form

$$ \mu_j = f_j(T) + kT \log c_j P , \tag{3.9.28a} $$

where $c_j = N_j/N$ holds and $N$ is the total number of particles. The function $f_j(T)$ depends solely on temperature and contains the microscopic parameters of the gas of type $j$. From (3.9.27) and (3.9.28a), it follows that

$$ \prod_j e^{\nu_j \left[ f_j(T)/kT + \log(c_j P) \right]} = 1 . \tag{3.9.29} $$

According to Sect. 5.2, Eq. (5.2.4') is valid:

$$ f_j(T) = \varepsilon^0_{\mathrm{el},j} - c_{P,j}\, T \log kT - kT \zeta_j . \tag{3.9.28b} $$

Inserting (3.9.28b) into (3.9.29) yields the product of the powers of the concentrations:

$$ \prod_j c_j^{\nu_j} = K(T, P) \equiv e^{\sum_j \nu_j \left( \zeta_j - \varepsilon^0_{\mathrm{el},j}/kT \right)} \, (kT)^{\sum_j c_{P,j} \nu_j / k} \, P^{-\sum_j \nu_j} ; \tag{3.9.30} $$

where $\varepsilon^0_{\mathrm{el},j}$ is the electronic energy, $c_{P,j}$ the specific heat of component $j$ at constant pressure, and $\zeta_j$ is the chemical constant

$$ \zeta_j = \log \frac{2\, m_j^{3/2}}{k \Theta_{r,j} \, (2\pi\hbar^2)^{3/2}} . \tag{3.9.31} $$

Here, we have assumed that $\Theta_r \ll T \ll \Theta_v$, with $\Theta_r$ and $\Theta_v$ the characteristic temperatures for the rotational and vibrational degrees of freedom, Eqs. (5.1.11) and (5.1.17). Equation (3.9.30) is the law of mass action for the concentrations. The function $K(T, P)$ is also termed the mass action constant. The statement that $\prod_j c_j^{\nu_j}$ is a function of only $T$ and $P$ holds generally for ideal mixtures, for which

$$ \mu_j(T, P, \{c_i\}) = \mu_j(T, P, c_j = 1, c_i = 0 \ (i \neq j)) + kT \log c_j . $$


If, instead of the concentrations, we introduce the partial pressures (see remark (i) at the end of this section)

$$ P_j = c_j P , \tag{3.9.32} $$

then we obtain

$$ \prod_j P_j^{\nu_j} = K_P(T) \equiv e^{\sum_j \nu_j \left( \zeta_j - \varepsilon^0_{\mathrm{el},j}/kT \right)} \, (kT)^{\sum_j c_{P,j} \nu_j / k} , \tag{3.9.30'} $$

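the law of mass action of Guldberg and Waage[15] for the partial pressures, with $K_P(T)$ independent of $P$. We now find, e.g. for the hydrogen-oxygen reaction of Eq. (3.9.23),

$$ 2\mathrm{H}_2 + \mathrm{O}_2 - 2\mathrm{H}_2\mathrm{O} = 0 , \quad \text{with} \quad \nu_{\mathrm{H}_2} = 2 , \quad \nu_{\mathrm{O}_2} = 1 , \quad \nu_{\mathrm{H}_2\mathrm{O}} = -2 , \tag{3.9.33} $$

the relation

$$ K(T, P) = \frac{[\mathrm{H}_2]^2 [\mathrm{O}_2]}{[\mathrm{H}_2\mathrm{O}]^2} = \text{const.} \; e^{-q/kT} \, T^{\sum_j c_{P,j} \nu_j / k} \, P^{-1} . \tag{3.9.34} $$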

Here, the concentrations $c_j = [A_j]$ are represented by the corresponding chemical symbols in square brackets, and we have used

$$ q = 2\varepsilon^0_{\mathrm{H}_2} + \varepsilon^0_{\mathrm{O}_2} - 2\varepsilon^0_{\mathrm{H}_2\mathrm{O}} > 0 , $$

the heat of reaction at absolute zero, which is positive for the oxidation of hydrogen. The degree of dissociation $\alpha$ is defined in terms of the concentrations:

$$ [\mathrm{H}_2\mathrm{O}] = 1 - \alpha , \qquad [\mathrm{O}_2] = \frac{\alpha}{2} , \qquad [\mathrm{H}_2] = \alpha . $$

It then follows from (3.9.34) that

$$ \frac{\alpha^3}{2(1 - \alpha)^2} \sim e^{-q/kT} \, T^{\sum_j c_{P,j} \nu_j / k} \, P^{-1} , \tag{3.9.35} $$

from which we can calculate $\alpha$; $\alpha$ decreases exponentially with falling temperature.

[15] The law of mass action was stated by Guldberg and Waage in 1867 on the basis of statistical considerations of reaction probabilities, and was later proved thermodynamically for ideal gases by Gibbs, who made it more specific through the calculation of $K(T, P)$.
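As an illustration of how $\alpha$ can be obtained from a relation of the form (3.9.35), the following Python sketch solves $\alpha^3 / (2(1-\alpha)^2) = \text{rhs}$ by bisection. It is only a sketch under stated assumptions: the right-hand side is modeled as a pure Boltzmann factor, and both the prefactor (set to 1) and the value `q_over_k = 5.8e4` K are illustrative placeholders, not values taken from the text; the power-law factor in $T$ and the factor $P^{-1}$ are ignored.

```python
import math


def dissociation_degree(rhs, tol=1e-14):
    """Solve alpha**3 / (2 * (1 - alpha)**2) = rhs for alpha in (0, 1).

    The left-hand side rises monotonically from 0 to infinity on (0, 1),
    so bisection always brackets exactly one root.
    """
    if rhs <= 0.0:
        raise ValueError("rhs must be positive")
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid**3 / (2.0 * (1.0 - mid) ** 2) < rhs:
            lo = mid  # left-hand side too small: root lies above mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rhs_of_T(T, q_over_k=5.8e4, prefactor=1.0):
    """Hypothetical right-hand side of (3.9.35): prefactor * exp(-q/kT).

    q_over_k (in kelvin) and prefactor are illustrative assumptions.
    """
    return prefactor * math.exp(-q_over_k / T)


for T in (1500.0, 2000.0, 3000.0):
    print(T, dissociation_degree(rhs_of_T(T)))
```

For small $\alpha$ the solution behaves like $\alpha \approx (2\,\text{rhs})^{1/3}$, which makes the exponential decrease with falling temperature explicit.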


The law of mass action makes important statements about the conditions under which the desired reactions can take place with optimum yields. It may be necessary to employ a catalyst in order to shorten the reaction time; however, the equilibrium distribution of the reacting components is determined simply by the reaction equation and the chemical potentials of the constituents (components); in the case of ideal gases, by Eq. (3.9.30). The law of mass action has many applications in chemistry and technology. As just one example, we consider here the pressure dependence of the reaction equilibrium. From (3.9.30), it follows that the pressure derivative of $K(T, P)$ is given by

$$ \frac{1}{K} \frac{\partial K}{\partial P} = \frac{\partial \log K}{\partial P} = -\frac{1}{P} \sum_i \nu_i , \tag{3.9.36a} $$

where $\nu = \sum_i \nu_i$ is the so-called molar excess. From the equation of state of mixtures of ideal gases (Eq. (5.2.3)), $PV = kT \sum_i N_i$, we obtain for the changes $\Delta V$ and $\Delta N_i$ which accompany a reaction at constant $T$ and $P$:

$$ P \Delta V = kT \sum_i \Delta N_i . \tag{3.9.37a} $$

Let the number of individual reactions be $\Delta N$, i.e. $\Delta N_i = \nu_i \Delta N$; then it follows from (3.9.37a) that

$$ -\frac{1}{P} \sum_i \nu_i = -\frac{\Delta V}{kT \, \Delta N} . \tag{3.9.37b} $$

Taking $\Delta N = L$ (the Loschmidt/Avogadro number), $\nu_i$ moles of each component will react, and it follows from (3.9.36a) and (3.9.37b) with the gas constant $R$ that

$$ \frac{1}{K} \frac{\partial K}{\partial P} = -\frac{\Delta V}{RT} . \tag{3.9.36b} $$

Furthermore, $\Delta V = \sum_i \nu_i V_{\mathrm{mol}}$ is the volume change in the course of the reaction proceeding from right to left (for a reaction which is represented in the form (3.9.23)). (The value of the molar volume $V_{\mathrm{mol}}$ is the same for every ideal gas.) According to Eq. (3.9.36b) in connection with (3.9.30), a larger value of $K$ leads to an increase in the concentrations $c_j$ with positive $\nu_j$, i.e. of those substances which are on the left-hand side of the reaction equation. Therefore, from (3.9.36b), a pressure increase leads to a shift of the equilibrium towards the side of the reaction equation corresponding to the smaller volume. When $\Delta V = 0$, the position of the equilibrium depends only upon the temperature, e.g. in the hydrogen chloride reaction $\mathrm{H}_2 + \mathrm{Cl}_2 \rightleftharpoons 2\mathrm{HCl}$.
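The chain (3.9.36a) to (3.9.36b) can be checked numerically. The Python sketch below assumes only the ideal-gas form $K(T,P) = K_P(T)\,P^{-\nu}$ from (3.9.30) and (3.9.30'); the function names and the numerical inputs are illustrative choices, not part of the text. It compares a finite-difference derivative of $\log K$ with $-\nu/P$ and with $-\Delta V/RT$.

```python
import math

R_GAS = 8.314462618  # molar gas constant in J/(mol K)


def log_K(P, K_P, nu):
    """log K(T, P) at fixed T for K = K_P * P**(-nu), cf. (3.9.30)."""
    return math.log(K_P) - nu * math.log(P)


def dlogK_dP(P, K_P, nu, rel_step=1e-6):
    """Central finite-difference estimate of d(log K)/dP."""
    h = rel_step * P
    return (log_K(P + h, K_P, nu) - log_K(P - h, K_P, nu)) / (2.0 * h)


def reaction_volume(nu, T, P):
    """Volume change per mole of reaction (right to left), Delta V = nu * R * T / P."""
    return nu * R_GAS * T / P


# Illustrative inputs (assumptions): nu = 2 as for ammonia synthesis.
P, T, nu, K_P = 2.0e5, 700.0, 2, 3.0
print(dlogK_dP(P, K_P, nu), -nu / P, -reaction_volume(nu, T, P) / (R_GAS * T))
```

All three printed numbers should agree to within the finite-difference error, which is the content of (3.9.36a) and (3.9.36b).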


In a similar manner, one finds for the temperature dependence of $K(T, P)$ the result

$$ \frac{\partial \log K}{\partial T} = \frac{\sum_i \nu_i h_i}{RT^2} = \frac{\Delta h}{RT^2} . \tag{3.9.38} $$

Here, $h_i$ is the molar enthalpy of the substance $i$ and $\Delta h$ is the change of the overall molar enthalpy when the reaction runs its course one time from right to left in the reaction equation, cf. problem 3.26.

An interesting and technically important application is Haber's synthesis of ammonia from nitrogen and hydrogen gas: the chemical reaction

$$ \mathrm{N}_2 + 3\mathrm{H}_2 \rightleftharpoons 2\mathrm{NH}_3 \tag{3.9.39} $$

is characterized by $1\,\mathrm{N}_2 + 3\mathrm{H}_2 - 2\mathrm{NH}_3 \rightleftharpoons 0$ ($\nu = \sum_i \nu_i = 2$):

$$ \frac{c_{\mathrm{N}_2} \, c_{\mathrm{H}_2}^3}{c_{\mathrm{NH}_3}^2} = K(T, P) = K_P(T) \, P^{-2} . \tag{3.9.40} $$
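The pressure dependence contained in (3.9.40) can be explored with a short Python sketch. It assumes a stoichiometric feed of one part $\mathrm{N}_2$ to three parts $\mathrm{H}_2$ and parametrizes the composition by a conversion $\xi$ (a quantity introduced here for illustration, not used in the text); the value of $K_P$ is a made-up number in units of pressure squared, chosen only to show the trend.

```python
def concentration_ratio(xi):
    """c_N2 * c_H2**3 / c_NH3**2 for a stoichiometric feed at conversion xi.

    Mole numbers: N2 = 1 - xi, H2 = 3 (1 - xi), NH3 = 2 xi, total = 4 - 2 xi.
    """
    total = 4.0 - 2.0 * xi
    c_n2 = (1.0 - xi) / total
    c_h2 = 3.0 * (1.0 - xi) / total
    c_nh3 = 2.0 * xi / total
    return c_n2 * c_h2**3 / c_nh3**2


def ammonia_yield(K_P, P, tol=1e-12):
    """Conversion xi in (0, 1) solving concentration_ratio(xi) = K_P / P**2.

    The ratio decreases monotonically in xi, so bisection is sufficient.
    """
    target = K_P / P**2
    lo, hi = 1e-12, 1.0 - 1e-12
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if concentration_ratio(mid) > target:
            lo = mid  # too little ammonia: equilibrium lies at larger xi
        else:
            hi = mid
    return 0.5 * (lo + hi)


K_P = 1.0e4  # hypothetical value, in (pressure unit)**2
for P in (1.0, 10.0, 100.0, 1000.0):
    print(P, ammonia_yield(K_P, P))
```

Because the right-hand side of (3.9.40) falls off as $P^{-2}$, the computed conversion grows with pressure at fixed temperature.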

To obtain a high yield of $\mathrm{NH}_3$, the pressure must be made as high as possible. Sommerfeld:[16] “The extraordinary success with which this synthesis is now carried out in industry is due to the complete understanding of the conditions for thermodynamic equilibrium (Haber), to the mastery of the engineering problems connected with high pressure (Bosch), and, finally, to the successful selection of catalyzers which promote high reaction rates (Mittasch).”

Remarks:

(i) The partial pressures introduced in Eq. (3.9.32), $P_j = c_j P$, with $c_j = N_j/N$, in accord with the equation of state of a mixture of ideal gases, Eq. (5.2.3), obey the equations

$$ V P_j = N_j kT \quad \text{and} \quad P = \sum_i P_i . \tag{3.9.41} $$

(This fact is known as Dalton's Law: the non-interacting gases in the mixture produce partial pressures corresponding to their particle numbers, as if they each occupy the entire available volume.)

(ii) Frequently, the law of mass action is expressed in terms of the particle densities $\rho_i = N_i/V$:

$$ \prod_i \rho_i^{\nu_i} = K_\rho(T) \equiv (kT)^{-\sum_i \nu_i} K_P(T) . \tag{3.9.30''} $$

[16] A. Sommerfeld, Thermodynamics and Statistical Mechanics: Lectures on Theoretical Physics, Vol. V (Academic Press, New York, 1956), p. 86.


(iii) Now we turn to the direction which a reaction will take. If a mixture is initially present with arbitrary densities, the direction in which the reaction will proceed can be read off the law of mass action. Let $\nu_1, \nu_2, \ldots, \nu_s$ be positive and $\nu_{s+1}, \nu_{s+2}, \ldots, \nu_n$ negative, so that the reaction equation (3.9.24) takes on the form

$$ \sum_{i=1}^{s} \nu_i A_i \rightleftharpoons \sum_{i=s+1}^{n} |\nu_i| A_i . \tag{3.9.24'} $$

Assume that the product of the particle densities obeys the inequality

$$ \prod_i \rho_i^{\nu_i} \equiv \frac{\prod_{i=1}^{s} \rho_i^{\nu_i}}{\prod_{i=s+1}^{n} \rho_i^{|\nu_i|}} < K_\rho(T) , \tag{3.9.42} $$

i.e. the system is not in chemical equilibrium. If the chemical reaction proceeds from right to left, the densities on the left will increase, and the fraction in the inequality will become larger. Therefore, in the case (3.9.42), the reaction will proceed from right to left. If, in contrast, the inequality was initially reversed, with a > sign, then the reaction would proceed from left to right.

(iv) All chemical reactions exhibit a heat of reaction, i.e. they are accompanied either by heat release (exothermic reactions) or by taking up of heat (endothermic reactions). We recall that for isobaric processes, $\Delta Q = \Delta H$, and the heat of reaction is equal to the change in the enthalpy; see the comment following Eq. (3.1.12). The temperature dependence of the reaction equilibrium follows from Eq. (3.9.38). A temperature increase at constant pressure shifts the equilibrium towards the side of the reaction equation where the enthalpy is higher; or, expressed differently, it leads to a reaction in the direction in which heat is taken up. As a rule, the electronic contribution $O(\mathrm{eV})$ dominates. Thus, at low temperatures, the enthalpy-rich side is practically not present.
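The criterion of remark (iii) amounts to comparing the product of the densities with $K_\rho(T)$. A minimal Python sketch of this comparison follows; the function names, the tolerance, and the sample densities are illustrative assumptions, and the stoichiometric coefficients follow the sign convention of (3.9.24) (positive on the left, negative on the right).

```python
import math


def reaction_quotient(densities, nu):
    """Product of rho_i**nu_i over all components, the left side of (3.9.42)."""
    return math.prod(rho**n for rho, n in zip(densities, nu))


def reaction_direction(densities, nu, K_rho, rel_tol=1e-9):
    """Direction in which the reaction proceeds according to remark (iii)."""
    quotient = reaction_quotient(densities, nu)
    if math.isclose(quotient, K_rho, rel_tol=rel_tol):
        return "equilibrium"
    # Quotient below K_rho: the left-hand side must grow, cf. (3.9.42).
    return "right to left" if quotient < K_rho else "left to right"


# 2 H2 + O2 - 2 H2O = 0 with hypothetical densities in arbitrary units.
nu = (2, 1, -2)
densities = (1.0, 1.0, 2.0)
print(reaction_quotient(densities, nu))
print(reaction_direction(densities, nu, K_rho=1.0))
```

The same comparison can be phrased through the sign of $\sum_j \mu_j \nu_j$ in (3.9.25'), since for ideal gases the logarithm of the quotient differs from $\log K_\rho$ by exactly this sum divided by $kT$.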

∗3.9.4 Vapor-pressure Increase by Other Gases and by Surface Tension

3.9.4.1 The Evaporation of Water in Air

As discussed in detail in Sect. 3.8.1, a single-component system can evaporate only along its vapor-pressure curve $P_0(T)$, or, stated differently, only along the vapor-pressure curve are the gaseous and the liquid phases in equilibrium. If an additional gas is present, this means that there is one more degree of freedom in Gibbs' phase rule, so that a liquid can coexist with its vapor even outside of $P_0(T)$. Here, we wish to investigate evaporation in the presence of additional gases and in particular that of water under an air atmosphere. To this end we assume that the other gas is dissolved in the liquid phase to only a negligible extent. If the chemical potential of the liquid were independent of the pressure, then the other gas would have no influence at all on the chemical potential of the liquid; the partial pressure of the vapor would then have to be identical with the vapor pressure of the pure substance, a statement which is frequently made. In fact, the total pressure acts on the liquid, which changes its chemical potential. The resulting increase of the vapor pressure will be calculated here. To begin, we note that

$$ \left( \frac{\partial \mu_L}{\partial P} \right)_T = \frac{V}{N} \tag{3.9.43} $$

is small, owing to the small specific volume $v_L = V/N$ of the liquid. When the pressure is changed by $\Delta P$, the chemical potential of the liquid changes according to

$$ \mu_L(T, P + \Delta P) = \mu_L(T, P) + v_L \Delta P + O(\Delta P^2) . \tag{3.9.44} $$

From the Gibbs–Duhem relation, the chemical potential of the liquid is

$$ \mu_L = e_L - T s_L + P v_L . \tag{3.9.45} $$

Here, $e_L$ and $s_L$ refer to the internal energy and the entropy per particle. When we can neglect the temperature and pressure dependence of $e_L$, $s_L$, and $v_L$, then (3.9.44) is valid with no further corrections. The chemical potential of the vapor, assuming an ideal mixture[17], is

$$ \mu_{\mathrm{vapor}}(T, P) = \mu_0(T) + kT \log cP , \tag{3.9.46} $$

where $c$ is the concentration of the vapor in the gas phase, $c = N_{\mathrm{vapor}} / (N_{\mathrm{other}} + N_{\mathrm{vapor}})$. The vapor-pressure curve $P_0(T)$ without additional gases follows from

$$ \mu_L(T, P_0) = \mu_0(T) + kT \log P_0 . \tag{3.9.47} $$

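[17] See Sect. 5.2.

With an additional gas, the pressure is composed of the pressure of the other gas $P_{\mathrm{other}}$ and the partial pressure of the vapor, $P_{\mathrm{vapor}} = cP$; all together, $P = P_{\mathrm{other}} + P_{\mathrm{vapor}}$. Then the equality of the chemical potentials in the liquid and the gaseous phases is expressed by

$$ \mu_L(T, P_{\mathrm{other}} + P_{\mathrm{vapor}}) = \mu_0(T) + kT \log P_{\mathrm{vapor}} . $$

Subtracting (3.9.47) from this, we find, with $v_G = kT/P_0$ the specific volume of the pure vapor,

$$
\begin{aligned}
\mu_L(T, P_{\mathrm{other}} + P_{\mathrm{vapor}}) - \mu_L(T, P_0) &= kT \log \left( \frac{P_{\mathrm{vapor}} - P_0}{P_0} + 1 \right) \\
v_L \left( P_{\mathrm{other}} + P_{\mathrm{vapor}} - P_0 \right) &\approx kT \, \frac{P_{\mathrm{vapor}} - P_0}{P_0} \\
v_L P_{\mathrm{other}} &= \left( \frac{kT}{P_0} - v_L \right) \left( P_{\mathrm{vapor}} - P_0 \right) \\
P_{\mathrm{vapor}} - P_0 &= \frac{v_L P_{\mathrm{other}}}{v_G - v_L} = \frac{v_L}{v_G - v_L} \left( P - P_{\mathrm{vapor}} \right) .
\end{aligned}
\tag{3.9.48}
$$

From the last line of Eq. (3.9.48), it follows that the increase in vapor pressure is given approximately by $P_{\mathrm{vapor}} - P_0(T) \approx \frac{v_L}{v_G} P_{\mathrm{other}}$, and the exact expression is found to be

$$ P_{\mathrm{vapor}} = P_0(T) + \frac{v_L}{v_G} \left( P - P_0(T) \right) . \tag{3.9.49} $$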
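To get a feeling for the size of the effect in (3.9.49), the following Python sketch evaluates it for water under one atmosphere of air at 20 °C. The material data (saturation pressure of about 2339 Pa, liquid density of about 998 kg/m³, molar mass 0.018015 kg/mol) are approximate handbook values supplied here as assumptions, not numbers from the text, and $v_G = kT/P_0$ is taken for the specific volume of the vapor as in the derivation of (3.9.48).

```python
K_B = 1.380649e-23   # Boltzmann constant in J/K
N_A = 6.02214076e23  # Avogadro constant in 1/mol


def vapor_pressure_with_gas(P_total, P0, v_L, T):
    """Partial pressure of the vapor under total pressure P_total, Eq. (3.9.49)."""
    v_G = K_B * T / P0  # volume per particle of the pure ideal vapor
    return P0 + (v_L / v_G) * (P_total - P0)


# Approximate data for water at 20 degrees Celsius (assumed values).
T = 293.15            # K
P0 = 2339.0           # Pa, saturation vapor pressure
molar_mass = 0.018015  # kg/mol
density = 998.0       # kg/m^3
v_L = molar_mass / (density * N_A)  # volume per molecule in the liquid, m^3

P_atm = 101325.0      # Pa
increase = vapor_pressure_with_gas(P_atm, P0, v_L, T) - P0
print(increase, increase / P0)
```

With these inputs the shift comes out at the level of a pascal or two, a relative change below one part in a thousand, which is why the partial pressure is commonly identified with the vapor pressure of the pure substance.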

The partial pressure of the vapor is increased relative to the vapor-pressure curve by $\frac{v_L}{v_G} (P - P_0(T))$. Due to the smallness of the factor $v_L/v_G$, the partial pressure is still to a good approximation the same as the vapor pressure at the temperature $T$. The most important result of these considerations is the following: a pure liquid under the pressure $P$ at the temperature $T$ is in equilibrium with its vapor phase only for $P = P_0(T)$; for $P > P_0(T)$ (or at temperatures below its boiling point) it exists only in liquid form. In this region of $(P, T)$, however, it is also in equilibrium with its vapor when the vapor is dissolved in another gas.

We now discuss the evaporation of water or the sublimation of ice under an atmosphere of air, see Fig. 3.39. The atmosphere predetermines a particular pressure $P$. At each temperature $T$ below the evaporation temperature determined by this pressure ($P > P_0(T)$), just enough water evaporates to make its partial pressure equal to that given by (3.9.49) (recall $P_{\mathrm{vapor}} = cP$). The concentration of the water vapor is $c = \left( P_0(T) + \frac{v_L}{v_G} (P - P_0(T)) \right) / P$. In a free air atmosphere, the water vapor is transported away by diffusion or by convection (wind), and more and more water must evaporate (vaporize).[18] On increasing the temperature, the partial pressure of the water increases, until finally it is equal to $P$. The vaporization which then results is called boiling. For $P = P_0(T)$, the liquid is in equilibrium with its pure vapor. Evaporation then occurs not only at the liquid surface, but also within the liquid, in particular at the walls of its container. There, bubbles of vapor are formed, which then rise to the surface. Within these vapor bubbles, the vapor pressure is $P_0(T)$, corresponding to the temperature $T$. Since the vapor bubbles within the liquid are also subject to the hydrostatic pressure of the liquid, their temperature must in fact be somewhat higher than the boiling point under atmospheric pressure. If the liquid contains nucleation centers (such as the fat globules in milk), at which vapor bubbles can form more readily than in the pure liquid, then it will “boil over”. The increase in the vapor pressure by increased external pressure, or as one might say, by ‘pressing on it’, may seem surprising. The additional pressure causes an increase in the release of molecules from the liquid, i.e. an increase in the partial pressure.

Fig. 3.39. The vapor pressure $P_{\mathrm{vapor}}$ lies above the vapor-pressure curve $P_0(T)$ (dot-dashed curve)

[18] As already mentioned, the above considerations are also applicable to sublimation. When one cools water at 1 atm below 0 °C, it freezes to ice. This ice, at e.g. −10 °C, is as a single-component system not in equilibrium with the gas phase, but rather with the water vapor in the atmosphere at a partial pressure of about $P_0(-10\,°\mathrm{C})$, where $P_0(T)$ represents the sublimation curve. For this reason, frozen laundry dries, because ice sublimes in the atmosphere.

3.9.4.2 Vapor-Pressure Increase by Surface Tension of Droplets

A further additional pressure is due to the surface tension and plays a role in the evaporation of liquid droplets. We consider a liquid droplet of radius $r$. When the radius is increased isothermally by an amount $dr$, the surface area increases by $8\pi r \, dr$, which leads to an energy increase of $\sigma 8\pi r \, dr$, where $\sigma$ is the surface tension. Owing to the pressure difference $p$ between the pressure within the droplet and the pressure of the surrounding atmosphere, there is a force $p \, 4\pi r^2$ which acts outwardly on the surface. The total change of the free energy is therefore

$$ dF = \delta A = \sigma 8\pi r \, dr - p \, 4\pi r^2 \, dr . \tag{3.9.50} $$

In equilibrium, the free energy of the droplet must be stationary, so that for the pressure difference we find the following dependence on the radius:

$$ p = \frac{2\sigma}{r} . \tag{3.9.51} $$

Thus, small droplets have a higher vapor pressure than larger ones. The vapor-pressure increase due to the surface tension is, from Eq. (3.9.48), now seen to be

$$ P_{\mathrm{vapor}} - P_0(T) = \frac{2\sigma}{r} \, \frac{v_L}{v_G - v_L} , \tag{3.9.52} $$

inversely proportional to the radius of the droplet. In a mixture of small and large droplets, the smaller ones are therefore consumed by the larger ones.


Remarks:

(i) Small droplets evaporate more readily than liquids with a flat surface, and conversely condensation occurs less easily on small droplets. This is the reason why extended solid cooled surfaces promote the condensation of water vapor more readily than small droplets do. The temperature at which the condensation of water from the atmosphere onto extended surfaces (dew formation) takes place is called the dew point. It depends on the partial pressure of water vapor in the air, i.e. its degree of saturation, and can be used to determine the amount of moisture in the air.

(ii) We consider the homogeneous condensation of a gas in free space without surfaces. The temperature of the gas is taken to be $T$ and the vapor pressure at this temperature to be $P_0(T)$. We assume that the pressure $P$ of the gas is greater than the vapor pressure; it is then referred to as supersaturated vapor. For each degree of supersaturation, then, a critical radius can be defined from (3.9.52):

$$ r_{\mathrm{cr}} = \frac{2\sigma \, v_L}{v_G \left( P - P_0(T) \right)} . $$

For droplets whose radius is smaller than $r_{\mathrm{cr}}$ the vapor is not supersaturated. Condensation can therefore not take place through the formation of very small droplets, since their vapor pressures would be higher than $P$. Some critical droplets must be formed through fluctuations in order that condensation can be initiated. Condensation is favored by additional attractive forces; for example, in the air, there are always electrically-charged dust particles and other impurities present, which as a result of their electrical forces promote condensation, i.e. they act as nucleation centers for condensation.
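The order of magnitude of the critical radius can be estimated with a few lines of Python. The sketch below evaluates the expression for $r_{\mathrm{cr}}$ given above for water vapor at 20 °C; the surface tension (about 0.0728 N/m), saturation pressure, density, and molar mass are approximate handbook values introduced here as assumptions, and $v_G = kT/P_0$ is again taken for the vapor.

```python
K_B = 1.380649e-23   # Boltzmann constant in J/K
N_A = 6.02214076e23  # Avogadro constant in 1/mol


def critical_radius(sigma, v_L, T, P, P0):
    """Critical droplet radius r_cr = 2 sigma v_L / (v_G (P - P0))."""
    if P <= P0:
        raise ValueError("the vapor must be supersaturated (P > P0)")
    v_G = K_B * T / P0  # volume per particle of the ideal vapor
    return 2.0 * sigma * v_L / (v_G * (P - P0))


# Approximate data for water at 20 degrees Celsius (assumed values).
T = 293.15                          # K
P0 = 2339.0                         # Pa
sigma = 0.0728                      # N/m
v_L = 0.018015 / (998.0 * N_A)      # m^3 per molecule

for supersaturation in (1.01, 1.1, 2.0):
    print(supersaturation, critical_radius(sigma, v_L, T, supersaturation * P0, P0))
```

For a supersaturation of ten percent the radius comes out in the range of roughly ten nanometres, and it shrinks as the supersaturation grows, in line with the statement that very small droplets cannot initiate condensation.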

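Problems for Chapter 3

3.1 Read off the partial derivatives of the internal energy $E$ with respect to its natural variables from Eq. (3.1.3).

3.2 Show that

$$ \delta g = \alpha \, dx + \beta \frac{x}{y} \, dy $$

is not an exact differential: a) using the integrability conditions and b) by integration from $P_1$ to $P_2$ along the paths $C_1$ and $C_2$. Show that $1/x$ is an integrating factor, $df = \delta g / x$.

3.3 Prove the chain rule (3.2.13) for Jacobians.

3.4 Derive the following relations:

$$ \frac{C_P}{C_V} = \frac{\kappa_T}{\kappa_S} , \qquad \left( \frac{\partial T}{\partial V} \right)_S = -\frac{T}{C_V} \left( \frac{\partial P}{\partial T} \right)_V \qquad \text{and} \qquad \left( \frac{\partial T}{\partial P} \right)_S = \frac{T}{C_P} \left( \frac{\partial V}{\partial T} \right)_P . $$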


Fig. 3.40. Paths in the x-y diagram

3.5 Determine the work performed by an ideal gas, $W(V) = \int_{V_1}^{V} dV \, P$, during a reversible adiabatic expansion. From $\delta Q = 0$, it follows that $dE = -P \, dV$, and from this the adiabatic equations for an ideal gas can be obtained:

$$ T = T_1 \left( \frac{V_1}{V} \right)^{2/3} \qquad \text{and} \qquad P = N k T_1 \frac{V_1^{2/3}}{V^{5/3}} . $$

They can be used to determine the work performed.

3.6 Show that the stability conditions (3.6.48a,b) follow from the maximization of the entropy.

3.7 One liter of an ideal gas expands reversibly and isothermally (at 20 °C) from an initial pressure of 20 atm to 1 atm. How large is the work performed in Joules? What quantity of heat $Q$ in calories must be transferred to the gas?

3.8 Show that the ratio of the entropy increase on heating of an ideal gas from T1 to T2 at constant pressure to that at constant volume is given by the ratio of the speciﬁc heats.

3.9 A thermally insulated system consists of two subsystems $(T_A, V_A, P)$ and $(T_B, V_B, P)$, which are separated by a movable, diathermal piston (Fig. 3.41(a)). The gases are ideal. (a) Calculate the entropy change accompanying equalization of the temperatures (irreversible process). (b) Calculate the work performed in a quasistatic temperature equalization; cf. Fig. 3.41(b).

Fig. 3.41. For problem 3.9 (panels (a) and (b))

3.10 Calculate the work obtained, $W = \oint P \, dV$, in a Carnot cycle using an ideal gas, by evaluating the ring integral.

3.11 Compare the cooling eﬃciency of a Carnot cycle between the temperatures T1 and T2 with that of two Carnot cycles operating between T1 and T3 and between T3 and T2 (T1 < T3 < T2 ). Show that it is more favorable to decompose a cooling process into several smaller steps.


3.12 Discuss a Carnot cycle in which the working ‘substance’ is thermal radiation. For this case, the following relations hold: $E = \sigma V T^4$, $pV = \frac{1}{3} E$, $\sigma > 0$. (a) Derive the adiabatic equation. (b) Compute $C_V$ and $C_P$.

3.13 Calculate the efficiency of the Joule cycle (see Fig. 3.42). Result: $\eta = 1 - (P_2/P_1)^{(\kappa - 1)/\kappa}$. Compare this efficiency with that of the Carnot cycle (drawn in dashed lines), using an ideal gas as working substance.

Fig. 3.42. The Joule cycle

3.14 Calculate the efficiency of the Diesel cycle (Fig. 3.43). Result:

$$ \eta = 1 - \frac{1}{\kappa} \, \frac{(V_2/V_1)^{\kappa} - (V_3/V_1)^{\kappa}}{(V_2/V_1) - (V_3/V_1)} . $$

Fig. 3.43. The Diesel cycle

3.15 Calculate for an ideal gas the change in the internal energy, the work performed, and the quantity of heat transferred for the quasistatic processes along the following paths from 1 to 2 (see Fig. 3.44) (a) 1-A-2 (b) 1-B-2 (c) 1-C-2. What is the shape of the E(P, V ) surface?

Fig. 3.44. For problem 3.15


3.16 Consider the so-called Stirling cycle, where a heat engine (with an ideal gas as working substance) performs work according to the following quasistatic cycle: (a) isothermal expansion at the temperature $T_1$ from a volume $V_1$ to a volume $V_2$; (b) cooling at constant volume $V_2$ from $T_1$ to $T_2$; (c) isothermal compression at the temperature $T_2$ from $V_2$ to $V_1$; (d) heating at constant volume from $T_2$ to $T_1$. Determine the thermal efficiency $\eta$ of this process.

3.17 The ratio of the specific volume of water to that of ice is 1.000:1.091 at 0 °C and 1 atm. The heat of melting is 80 cal/g. Calculate the slope of the melting curve.

3.18 Integrate the Clausius–Clapeyron differential equation for the liquid-gas transition, making the simplifying assumptions that the heat of transition is constant, that $V_{\mathrm{liquid}}$ can be neglected in comparison to $V_{\mathrm{gas}}$, and that the equation of state for ideal gases is applicable to the gas phase.

3.19 Consider the neighborhood of the triple point in a region where the limiting curves can be approximated as straight lines. Show that $\alpha < \pi$ holds (see Fig. 3.45). Hint: Use $dP/dT = \Delta S / \Delta V$, and the fact that the slope of line 2 is greater than that of line 3.

Fig. 3.45. The vicinity of a triple point

3.20 The latent heat of ice per unit mass is $Q_L$. A container holds a mixture of water and ice at the freezing point (absolute temperature $T_0$). An additional amount of the water in the container (of mass $m$) is to be frozen using a cooling apparatus. The heat output from the cooling apparatus is used to heat a body of heat capacity $C$ and initial temperature $T_0$. What is the minimum quantity of heat energy transferred from the apparatus to the body? (Assume $C$ to be temperature independent.)

3.21 (a) Discuss the pressure dependence of the reaction $\mathrm{N}_2 + 3\mathrm{H}_2 \rightleftharpoons 2\mathrm{NH}_3$ (ammonia synthesis). At what pressure is the yield of ammonia greatest? (b) Discuss the thermal dissociation $2\mathrm{H}_2\mathrm{O} \rightleftharpoons 2\mathrm{H}_2 + \mathrm{O}_2$. Show that an increase in pressure works against the dissociation.

3.22 Give the details of the derivation of Eqs. (3.9.36a) and (3.9.36b).

3.23 Discuss the pressure and temperature dependence of the reaction

$$ \mathrm{CO} + 3\mathrm{H}_2 \rightleftharpoons \mathrm{CH}_4 + \mathrm{H}_2\mathrm{O} . $$

3.24 Apply the law of mass action to the reaction $\mathrm{H}_2 + \mathrm{Cl}_2 \rightleftharpoons 2\mathrm{HCl}$.


3.25 Derive the law of mass action for the particle densities $\rho_j = N_j/V$ (Eq. (3.9.30'')).

3.26 Prove Eq. (3.9.38) for the temperature dependence of the mass-action constant. Hint: Show that

$$ H = G - T \frac{\partial G}{\partial T} = -T^2 \frac{\partial}{\partial T} \left( \frac{G}{T} \right) $$

and express the change in the free enthalpy,

$$ \Delta G = \sum_i \mu_i \nu_i , $$

using Eq. (3.9.28), then insert the law of mass action (3.9.30) or (3.9.30').

3.27 The Pomeranchuk eﬀect. The entropy diagram for solid and liquid He3 has the shape shown below 3 K. Note that the speciﬁc volumes of both phases do not change within this temperature range. Draw P (T ) for the coexistence curves of the phases.

Fig. 3.46. The Pomeranchuk eﬀect

3.28 The (speciﬁc) free energies fα and fβ of two solid phases α and β with a miscibility gap and the (speciﬁc) free energy fL of the liquid mixture are shown as functions of the concentration c in Fig. 3.47. Discuss the meaning of the dashed and solid double tangents. On lowering the temperature, the free energy of the liquid phase is increased, i.e. fL is shifted upwards relative to the two ﬁxed branches of the free energy. Derive from this the shape of the eutectic phase diagram.

Fig. 3.47. Liquid mixture


3.29 A typical shape for the phase diagram of liquid and gaseous mixtures is shown in Fig. 3.48. The components A and B are completely miscible in both the gas phase and the liquid phase. B has a higher boiling point than A. At a temperature in the interval TA < T < TB , the gas phase is therefore richer in A than the liquid phase. Discuss the boiling process for the initial concentration c0 (a) in the case that the liquid remains in contact with the gas phase: show that vaporization takes place in the temperature interval T0 to Te . (b) in the case that the vapor is pumped oﬀ: show that the vaporization takes place in the interval T0 to TB .

Fig. 3.48. Bubble point and dew point lines

Remark: The boiling curve (evaporation limit) and the condensation curve together form the bubble point and dew point lines, a lens-shaped closed curve. Its shape is of decisive importance for the efficiency of distillation processes. This ‘boiling lens’ can also take on much more complex shapes than in Fig. 3.48, such as that shown in Fig. 3.49. A mixture with the concentration $c_a$ is called azeotropic. For this concentration, the evaporation of the mixture occurs exactly at the temperature $T_a$ and not in a temperature interval. The eutectic concentration is also special in this sense. Such a point occurs in an alcohol-water mixture at 96%, which limits the distillation of alcohol.[19]

Fig. 3.49. Bubble point and dew point lines

[19] Detailed information about phase diagrams of mixtures can be found in M. Hansen, Constitution of Binary Alloys, McGraw Hill, 1983 and its supplements. Further detailed discussions of the shape of phase diagrams are to be found in L. D. Landau and E. M. Lifshitz, Course of Theoretical Physics, Vol. V, Statistical Physics, Pergamon Press 1980.


3.30 The free energy of the liquid phase, fL , is drawn in Fig. (3.50) as a function of the concentration, as well as that of the gas phase, fG . It is assumed that fL is temperature independent and fG shifts upwards with decreasing temperature (Fig. 3.50). Explain the occurrence of the ‘boiling lens’ in problem 3.29.

Fig. 3.50. Free energy

3.31 Consider the production of electron-positron pairs, $e^+ + e^- \rightleftharpoons \gamma$. Assume for simplicity that the chemical potential of the electrons and positrons is given in the nonrelativistic limit, taking the rest energy into account, by $\mu = mc^2 + kT \log \frac{\lambda^3 N}{V}$. Show that the particle number densities $n_\pm$ of $e^\pm$ obey

$$ n_+ n_- = \lambda^{-6} \, e^{-2mc^2/kT} $$

and discuss the consequences.

3.32 Consider the boiling and condensation curves of a two-component liquid mixture. Take the concentrations in the gaseous and liquid phases to be $c_G$ and $c_L$. Show that at the points where $c_G = c_L$ (the azeotropic mixture), i.e. where the boiling and condensation curves come together, the following relation holds for a fixed pressure $P$:

$$ \frac{dT}{dc} = 0 , $$

and for fixed $T$

$$ \frac{dP}{dc} = 0 ; $$

thus the slopes are horizontal. Method: Start from the differential Gibbs-Duhem relations for the gas and the liquid phases along the limiting curves.


3.33 Determine the temperature of the atmosphere as a function of altitude. How much does the temperature decrease per km of altitude? Compare your result for the pressure $P(z)$ with the barometric formula (see problem 2.15). Method: Start with the force balance on a small volume of air. That gives

$$ \frac{dP(z)}{dz} = -mg \, \frac{P(z)}{k \, T(z)} . $$

Assume that the temperature changes depend on the pressure changes of the air (ideal gas) adiabatically,

$$ \frac{dT(z)}{T(z)} = \frac{\gamma - 1}{\gamma} \, \frac{dP(z)}{P(z)} . $$

From this, one finds $dT(z)/dz$. Numerical values: $m = 29$ g/mole, $\gamma = 1.41$.

3.34 In meteorology, the concept of a “homogeneous atmosphere” is used, where ρ is taken to be constant. Determine the pressure and the temperature in such an atmosphere as functions of the altitude. Calculate the entropy of the homogeneous atmosphere and compare it with that of an isothermal atmosphere with the same energy content. Could such a homogeneous atmosphere be stable?

4. Ideal Quantum Gases

In this chapter, we want to derive the thermodynamic properties of ideal quantum gases, i.e. non-interacting particles, on the basis of quantum statistics. This includes nonrelativistic fermions and bosons whose interactions may be neglected, quasiparticles in condensed matter, and relativistic quanta, in particular photons.

4.1 The Grand Potential

The calculation of the grand potential is found to be the most expedient way to proceed. In order to have a concrete system in mind, we start from the Hamiltonian for N non-interacting, nonrelativistic particles,

$$H = \sum_{i=1}^{N} \frac{1}{2m}\,\mathbf{p}_i^2 . \qquad (4.1.1)$$

We assume the particles to be enclosed in a cube of edge length L and volume $V = L^3$, and apply periodic boundary conditions. The single-particle eigenfunctions of the Hamiltonian are then the momentum eigenstates $|\mathbf{p}\rangle$ and are given in real space by

$$\varphi_{\mathbf{p}}(\mathbf{x}) = \langle \mathbf{x}|\mathbf{p}\rangle = \frac{1}{\sqrt{V}}\, e^{i\mathbf{p}\cdot\mathbf{x}/\hbar} , \qquad (4.1.2a)$$

where the momentum quantum numbers can take on the values

$$\mathbf{p} = \frac{2\pi\hbar}{L}\,(\nu_1, \nu_2, \nu_3) , \qquad \nu_\alpha = 0, \pm 1, \ldots , \qquad (4.1.2b)$$

and the single-particle kinetic energy is given by

$$\varepsilon_{\mathbf{p}} = \frac{\mathbf{p}^2}{2m} . \qquad (4.1.2c)$$

For the complete characterization of the single-particle states, we must still take the spin s into account. It is integral for bosons and half-integral for fermions. The quantum number $m_s$ for the z-component of the spin has 2s+1 possible values. We combine the two quantum numbers into one symbol, $p \equiv (\mathbf{p}, m_s)$, and find for the complete energy eigenstates

$$|p\rangle \equiv |\mathbf{p}\rangle\, |m_s\rangle . \qquad (4.1.2d)$$

In the treatment which follows, we could start from arbitrary noninteracting Hamiltonians, which can also contain a potential and can depend on the spin, as is the case for electrons in a magnetic field. We then still denote the single-particle quantum numbers by p and the eigenvalue belonging to the energy eigenstate $|p\rangle$ by $\varepsilon_p$, but it need no longer be the same as (4.1.2c). These states form the basis of the N-particle states for bosons and fermions:

$$|p_1, p_2, \ldots, p_N\rangle = \mathcal{N} \sum_P (\pm 1)^P\, P\, |p_1\rangle \ldots |p_N\rangle . \qquad (4.1.3)$$

Here, the sum runs over all the permutations P of the numbers 1 to N. The upper sign holds for bosons, $(+1)^P = 1$, the lower sign for fermions. $(-1)^P$ is equal to 1 for even permutations and −1 for odd permutations. The bosonic states are completely symmetric, the fermionic states are completely antisymmetric. As a result of the symmetrization operation, the state (4.1.3) is completely characterized by its occupation numbers $n_p$, which indicate how many of the N particles are in the state $|p\rangle$. For bosons, $n_p = 0, 1, 2, \ldots$ can assume all non-negative integer values. These particles are said to obey Bose–Einstein statistics. For fermions, each single-particle state can be occupied at most once, $n_p = 0, 1$ (identical quantum numbers would yield zero due to the antisymmetrization on the right-hand side of (4.1.3)). Such particles are said to obey Fermi–Dirac statistics. The normalization factor in (4.1.3) is $\mathcal{N} = 1/\sqrt{N!}$ for fermions and $\mathcal{N} = (N!\, n_{p_1}!\, n_{p_2}! \ldots)^{-1/2}$ for bosons.¹ For an N-particle state, the sum of all the $n_p$ obeys

$$N = \sum_p n_p , \qquad (4.1.4)$$

and the energy eigenvalue of this N-particle state is

$$E(\{n_p\}) = \sum_p n_p\, \varepsilon_p . \qquad (4.1.5)$$

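We can now readily calculate the grand partition function (Sect. 2.7.2):

$$\begin{aligned}
Z_G &\equiv \sum_{N=0}^{\infty}\ \sum_{\{n_p\},\ \sum_p n_p = N} e^{-\beta(E(\{n_p\}) - \mu N)} = \sum_{\{n_p\}} e^{-\beta \sum_p (\varepsilon_p - \mu)\, n_p} \\
&= \prod_p \sum_{n_p} e^{-\beta(\varepsilon_p - \mu)\, n_p} =
\begin{cases}
\displaystyle\prod_p \frac{1}{1 - e^{-\beta(\varepsilon_p - \mu)}} & \text{for bosons} \\[2ex]
\displaystyle\prod_p \left(1 + e^{-\beta(\varepsilon_p - \mu)}\right) & \text{for fermions.}
\end{cases}
\end{aligned} \qquad (4.1.6)$$

¹ Note: for bosons, the state (4.1.3) can also be written in the form $(N!/n_{p_1}!\, n_{p_2}! \ldots)^{-1/2} \sum_{P'} P'\, |p_1\rangle \ldots |p_N\rangle$, where the sum includes only those permutations $P'$ which lead to different terms.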

We give here some explanations relevant to (4.1.6). Here, $\sum_{\{n_p\}} \ldots \equiv \prod_p \sum_{n_p} \ldots$ refers to the multiple sum over all occupation numbers, whereby each occupation number $n_p$ takes on the allowed values (0, 1 for fermions and 0, 1, 2, ... for bosons). In this expression, $p \equiv (\mathbf{p}, m_s)$ runs over all values of $\mathbf{p}$ and $m_s$. The calculation of the grand partition function requires that one first sum over all the states allowed by a particular value of the particle number N, and then over all particle numbers, N = 0, 1, 2, .... In the definition of $Z_G$, $\sum_{\{n_p\}}$ therefore enters with the constraint $\sum_p n_p = N$. Since, however, in the end we must sum over all N, the expression after the second equals sign is obtained; in it, the sum runs over all $n_p$ independently of one another. Here, we see that the grand partition function is more straightforward to calculate than the partition functions of the other ensembles. For bosons, a product of geometric series is obtained in (4.1.6); the condition for their convergence requires that $\mu < \varepsilon_p$ for all p. The grand potential follows from (4.1.6):

$$\Phi = -\beta^{-1} \log Z_G = \pm \beta^{-1} \sum_p \log\left(1 \mp e^{-\beta(\varepsilon_p - \mu)}\right) , \qquad (4.1.7)$$

from which we can derive all the thermodynamic quantities of interest. Here, and in what follows, the upper (lower) signs refer to bosons (fermions). For the average particle number, we therefore find

$$N \equiv -\left(\frac{\partial \Phi}{\partial \mu}\right)_{\beta} = \sum_p n(\varepsilon_p) , \qquad (4.1.8)$$

where we have introduced

$$n(\varepsilon_p) \equiv \frac{1}{e^{\beta(\varepsilon_p - \mu)} \mp 1} ; \qquad (4.1.9)$$

these are also referred to as the Bose or the Fermi distribution functions. We now wish to show that $n(\varepsilon_q)$ is the average occupation number of the state $|q\rangle$. To this end, we calculate the average value of $n_q$:

$$\begin{aligned}
\langle n_q \rangle = \mathrm{Tr}(\rho_G\, n_q) &= \frac{\sum_{\{n_p\}} e^{-\beta \sum_p n_p (\varepsilon_p - \mu)}\, n_q}{\sum_{\{n_p\}} e^{-\beta \sum_p n_p (\varepsilon_p - \mu)}} = \frac{\sum_{n_q} n_q\, e^{-\beta n_q (\varepsilon_q - \mu)}}{\sum_{n_q} e^{-\beta n_q (\varepsilon_q - \mu)}} \\
&= -\frac{\partial}{\partial x} \log \sum_n e^{-xn}\, \bigg|_{x = \beta(\varepsilon_q - \mu)} = n(\varepsilon_q) ,
\end{aligned}$$

which demonstrates the correctness of our assertion. We now return to the calculation of the thermodynamic quantities. For the internal energy, we find from (4.1.7)

$$E = \left(\frac{\partial(\Phi\beta)}{\partial\beta}\right)_{\beta\mu} = \sum_p \varepsilon_p\, n(\varepsilon_p) , \qquad (4.1.10)$$

where in taking the derivative, the product βµ is held constant.

Remarks:

(i) In order to ensure that $n(\varepsilon_p) \geq 0$ for every value of p, we require for bosons with the free-particle spectrum (4.1.2c) that µ < 0, and for an arbitrary energy spectrum that $\mu < \min(\varepsilon_p)$.

(ii) For $e^{-\beta(\varepsilon_p - \mu)} \ll 1$ and s = 0, we obtain from (4.1.7)

$$\Phi = -\beta^{-1} \sum_p e^{-\beta(\varepsilon_p - \mu)} = -\frac{z}{\beta}\, \frac{V}{(2\pi\hbar)^3} \int d^3p\; e^{-\beta p^2/2m} = -\frac{zV}{\beta\lambda^3} , \qquad (4.1.11)$$

which is identical to the grand potential of a classical ideal gas, Eq. (2.7.23). Here, the dispersion relation $\varepsilon_{\mathbf{p}} = \mathbf{p}^2/2m$ from Eq. (4.1.2c) was used for the right-hand side of (4.1.11). With

$$z = e^{\beta\mu} , \qquad (4.1.12)$$

we have introduced the fugacity, and $\lambda = h/\sqrt{2\pi m k T}$ (Eq. (2.7.20)) denotes the thermal wavelength. For s ≠ 0, an additional factor of (2s + 1) would occur after the second and third equals signs in Eq. (4.1.11).

(iii) The calculation of the grand partition function becomes even simpler if we make use of the second-quantization formalism,

$$Z_G = \mathrm{Tr} \exp\left(-\beta(H - \mu\hat{N})\right) , \qquad (4.1.13a)$$

where the Hamiltonian and the particle number operator in second quantization² have the form

$$H = \sum_p \varepsilon_p\, a_p^\dagger a_p \qquad (4.1.13b)$$

and

$$\hat{N} = \sum_p a_p^\dagger a_p . \qquad (4.1.13c)$$

It then follows that

$$Z_G = \mathrm{Tr} \prod_p e^{-\beta(\varepsilon_p - \mu)\, a_p^\dagger a_p} = \prod_p \sum_{n_p} e^{-\beta(\varepsilon_p - \mu)\, n_p} , \qquad (4.1.13d)$$

and thus we once again obtain (4.1.6).

² See e.g. F. Schwabl, Advanced Quantum Mechanics, 3rd ed. (QM II), Springer, 2005, Chapter 1.
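The factorization in (4.1.6) and the identification of (4.1.9) with the mean occupation number can be checked by brute force for a very small system. The following is only an illustrative sketch: the three single-particle energies, the values of β and µ, and the cutoff on the boson occupation numbers are hypothetical choices, not taken from the text.

```python
import itertools
import math


def brute_force(levels, beta, mu, allowed):
    """Sum over all occupation-number sets {n_p}, as in the middle of (4.1.6).

    Returns Z_G and the list of mean occupation numbers <n_q>.
    """
    z_g = 0.0
    n_sum = [0.0] * len(levels)
    for occ in itertools.product(allowed, repeat=len(levels)):
        weight = math.exp(-beta * sum((eps - mu) * n for eps, n in zip(levels, occ)))
        z_g += weight
        for q, n in enumerate(occ):
            n_sum[q] += n * weight
    return z_g, [s / z_g for s in n_sum]


def closed_form(levels, beta, mu, bosons):
    """Product formula (4.1.6) and distribution function (4.1.9)."""
    z_g = 1.0
    n_avg = []
    for eps in levels:
        x = beta * (eps - mu)
        if bosons:
            z_g *= 1.0 / (1.0 - math.exp(-x))
            n_avg.append(1.0 / (math.exp(x) - 1.0))
        else:
            z_g *= 1.0 + math.exp(-x)
            n_avg.append(1.0 / (math.exp(x) + 1.0))
    return z_g, n_avg


LEVELS = [0.3, 0.7, 1.1]  # hypothetical single-particle energies
BETA, MU = 1.3, -0.2      # hypothetical values, chosen with mu < min(LEVELS)

fermi_brute = brute_force(LEVELS, BETA, MU, range(2))   # n_p = 0, 1
fermi_exact = closed_form(LEVELS, BETA, MU, bosons=False)
bose_brute = brute_force(LEVELS, BETA, MU, range(50))   # n_p = 0..49 (truncated)
bose_exact = closed_form(LEVELS, BETA, MU, bosons=True)

print("fermions:", fermi_brute, fermi_exact)
print("bosons:  ", bose_brute, bose_exact)
```

For bosons the explicit sum has to be cut off; with the values above the neglected terms of the geometric series are far below the printed precision, which is the numerical counterpart of the convergence condition $\mu < \varepsilon_p$.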

According to Eq. (4.1.2b), we may associate with each of the discrete $\mathbf{p}$ values a volume element of size $\Delta = (2\pi\hbar/L)^3$. Hence, sums over $\mathbf{p}$ may be replaced by integrals in the limit of large V. For the Hamiltonian of free particles (4.1.1), this implies in (4.1.7) and (4.1.8)

$$\sum_p \ldots = g \sum_{\mathbf{p}} \ldots = g\, \frac{1}{\Delta} \sum_{\mathbf{p}} \Delta \ldots = g\, \frac{V}{(2\pi\hbar)^3} \int d^3p\, \ldots \qquad (4.1.14a)$$

with the degeneracy factor

$$g = 2s + 1 , \qquad (4.1.14b)$$

as a result of the spin-independence of the single-particle energy $\varepsilon_{\mathbf{p}}$. For the average particle number, we then find from (4.1.8)³

$$N = \frac{gV}{(2\pi\hbar)^3} \int d^3p\; n(\varepsilon_{\mathbf{p}}) = \frac{gV}{2\pi^2\hbar^3} \int_0^\infty dp\; p^2\, n(\varepsilon_{\mathbf{p}}) = \frac{gV m^{3/2}}{2^{1/2}\pi^2\hbar^3} \int_0^\infty \frac{d\varepsilon\, \sqrt{\varepsilon}}{e^{\beta(\varepsilon - \mu)} \mp 1} , \qquad (4.1.15)$$

where we have introduced $\varepsilon = p^2/2m$ as integration variable. We also define the specific volume

$$v = V/N \qquad (4.1.16)$$

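and substitute x = βε, finally obtaining from (4.1.15)

$$\frac{1}{v} = \frac{1}{\lambda^3}\, \frac{2g}{\sqrt{\pi}} \int_0^\infty dx\, \frac{x^{1/2}}{e^x z^{-1} \mp 1} = \frac{g}{\lambda^3}
\begin{cases}
g_{3/2}(z) & \text{for bosons} \\
f_{3/2}(z) & \text{for fermions.}
\end{cases} \qquad (4.1.17)$$

In this expression, we have introduced the generalized ζ-functions, which are defined by⁴

$$\left.\begin{matrix} g_\nu(z) \\ f_\nu(z) \end{matrix}\right\} \equiv \frac{1}{\Gamma(\nu)} \int_0^\infty dx\, \frac{x^{\nu - 1}}{e^x z^{-1} \mp 1} . \qquad (4.1.18)$$

³ For bosons, we shall see in Sect. 4.4 that in a temperature range where µ → 0, the term with p = 0 must be treated separately in making the transition from the sum over momenta to the integral.

⁴ The gamma function is defined as $\Gamma(\nu) = \int_0^\infty dt\; e^{-t}\, t^{\nu - 1}$ [Re ν > 0]. It obeys the relation $\Gamma(\nu + 1) = \nu\, \Gamma(\nu)$.

Similarly, from (4.1.7), we find

$$\Phi = \pm \frac{gV}{(2\pi\hbar)^3 \beta} \int d^3p\; \log\left(1 \mp e^{-\beta(\varepsilon_{\mathbf{p}} - \mu)}\right) = \pm \frac{gV m^{3/2}}{2^{1/2}\pi^2\hbar^3 \beta} \int_0^\infty d\varepsilon\, \sqrt{\varepsilon}\, \log\left(1 \mp e^{-\beta(\varepsilon - \mu)}\right) , \qquad (4.1.19)$$

which, after integration by parts, leads to

$$\Phi = -PV = -\frac{2}{3}\, \frac{gV m^{3/2}}{2^{1/2}\pi^2\hbar^3} \int_0^\infty \frac{d\varepsilon\; \varepsilon^{3/2}}{e^{\beta(\varepsilon - \mu)} \mp 1} = -\frac{gVkT}{\lambda^3}
\begin{cases}
g_{5/2}(z) \\
f_{5/2}(z) ,
\end{cases} \qquad (4.1.19')$$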

where the upper line holds for bosons and the lower line for fermions. The expression (3.1.26), Φ = −PV, which is valid for homogeneous systems, was also used here. From (4.1.10) we obtain for the internal energy

$$E = \frac{gV}{(2\pi\hbar)^3} \int d^3p\; \varepsilon_{\mathbf{p}}\, n(\varepsilon_{\mathbf{p}}) = \frac{gV m^{3/2}}{2^{1/2}\pi^2\hbar^3} \int_0^\infty \frac{d\varepsilon\; \varepsilon^{3/2}}{e^{\beta(\varepsilon - \mu)} \mp 1} . \qquad (4.1.20)$$

Comparison with (4.1.19') yields, remarkably, the same relation

$$PV = \frac{2}{3}\, E \qquad (4.1.21)$$

as for the classical ideal gas. Additional general relations follow from the homogeneity of Φ in T and µ. From (4.1.19'), (4.1.15), and (3.1.18), we obtain

$$P = -\frac{\Phi}{V} = -T^{5/2}\, \varphi\!\left(\frac{\mu}{T}\right) , \qquad N = V T^{3/2}\, n\!\left(\frac{\mu}{T}\right) , \qquad (4.1.22a,b)$$

$$S = -\left(\frac{\partial \Phi}{\partial T}\right)_{V,\mu} = V T^{3/2}\, s\!\left(\frac{\mu}{T}\right) , \quad \text{and} \quad \frac{S}{N} = \frac{s(\mu/T)}{n(\mu/T)} . \qquad (4.1.22c,d)$$

Using these results, we can readily derive the adiabatic equation. The conditions S = const. and N = const., together with (4.1.22d), (4.1.22b) and (4.1.22a), yield µ/T = const., $V T^{3/2}$ = const., $P T^{-5/2}$ = const., and finally

$$P V^{5/3} = \text{const.} \qquad (4.1.23)$$

The adiabatic equation has the same form as that for the classical ideal gas, although most of the other thermodynamic quantities show different behavior; for example, $c_P/c_V \neq 5/3$. Following these preliminary general considerations, we wish to derive the equation of state from (4.1.22a). To this end, we need to eliminate µ/T from (4.1.22a) and replace it by the density N/V using (4.1.22b). The explicit computation is carried out in Sect. 4.2 for the classical limit, and in Sects. 4.3 and 4.4 for low temperatures, where quantum effects predominate.

4.2 The Classical Limit $z = e^{\mu/kT} \ll 1$

We first formulate the equation of state in the nearly-classical limit. To do this, we expand the generalized ζ-functions g and f defined in (4.1.18) as power series in z:

$$\left.\begin{matrix} g_\nu(z) \\ f_\nu(z) \end{matrix}\right\} = \frac{1}{\Gamma(\nu)} \int_0^\infty dx\; x^{\nu - 1}\, e^{-x} z \sum_{k'=0}^{\infty} (\pm 1)^{k'} e^{-xk'} z^{k'} = \sum_{k=1}^{\infty} \frac{(\pm 1)^{k+1} z^k}{k^\nu} , \qquad (4.2.1)$$

where the upper lines (signs) hold for bosons and the lower for fermions. Then Eq. (4.1.17) takes on the form

$$\frac{\lambda^3}{v} = g \sum_{k=1}^{\infty} \frac{(\pm 1)^{k+1} z^k}{k^{3/2}} = g \left( z \pm \frac{z^2}{2^{3/2}} + O(z^3) \right) . \qquad (4.2.2)$$

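This equation can be solved iteratively for z:

$$z = \frac{\lambda^3}{vg} \mp \frac{1}{2^{3/2}} \left(\frac{\lambda^3}{vg}\right)^2 + O\!\left(\left(\frac{\lambda^3}{v}\right)^3\right) . \qquad (4.2.3)$$

Inserting this in the series for Φ which follows from (4.1.19') and (4.2.1),

$$\Phi = -\frac{gVkT}{\lambda^3} \left( z \pm \frac{z^2}{2^{5/2}} + O(z^3) \right) , \qquad (4.2.4)$$

we can eliminate µ in favor of N and obtain the equation of state

$$PV = -\Phi = NkT \left( 1 \mp \frac{\lambda^3}{2^{5/2}\, gv} + O\!\left(\left(\frac{\lambda^3}{v}\right)^2\right) \right) . \qquad (4.2.5)$$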

The symmetrization (antisymmetrization) of the wavefunctions causes a reduction (increase) in the pressure in comparison to the classical ideal gas. This acts like an attraction (repulsion) between the particles, which in fact are non-interacting (formation of clusters in the case of bosons, exclusion principle for fermions). For the chemical potential, we find from (4.1.12) and (4.2.3), making use of $\lambda^3/vg \ll 1$, the following expansion:

$$\mu = kT \log z = kT \left[ \log \frac{\lambda^3}{gv} \mp \frac{1}{2^{3/2}}\, \frac{\lambda^3}{gv} \ldots \right] , \qquad (4.2.6)$$

i.e. µ < 0. Furthermore, for the free energy F = Φ + µN, we find from (4.2.5) and (4.2.6)

$$F = F_{\rm class} \mp kT\, \frac{N\lambda^3}{2^{5/2}\, gv} , \qquad (4.2.7a)$$

where

$$F_{\rm class} = NkT \left( -1 + \log \frac{\lambda^3}{gv} \right) \qquad (4.2.7b)$$

is the free energy of the classical ideal gas.

Remarks:

(i) The quantum corrections are proportional to $\hbar^3$, since λ is proportional to ℏ. These corrections are also called exchange corrections, as they depend only on the symmetry behavior of the wavefunctions (see also Appendix B).

(ii) The exchange corrections to the classical results at finite temperatures are of the order of $\lambda^3/v$. The classical equation of state holds for $z \ll 1$ or $\lambda \ll v^{1/3}$, i.e. in the extremely dilute limit. This limit is the more readily reached, the higher the temperature and the lower the density. The occupation number in the classical limit is given by (cf. Fig. 4.1)

$$n(\varepsilon_p) \approx e^{-\beta\varepsilon_p}\, e^{\beta\mu} = e^{-\beta\varepsilon_p}\, \frac{\lambda^3}{gv} \ll 1 . \qquad (4.2.8)$$

This classical limit (4.2.8) is equally valid for bosons and fermions. For comparison, the Fermi distribution at T = 0 is also shown in Fig. 4.1; its significance, as well as that of $\varepsilon_F$, will be discussed in Sect. 4.3.

(iii) Corresponding to the symmetry-dependent pressure change in (4.2.5), the exchange effects lead to a modification of the free energy (4.2.7a).

Fig. 4.1. The occupation number n(ε) in the classical limit (shaded). For comparison, the occupation of a degenerate Fermi gas is also indicated
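The first-order equation of state (4.2.5) can be compared with a direct numerical solution of (4.1.17). The sketch below sums the power series (4.2.1), solves $\lambda^3/(gv) = g_{3/2}(z)$ (bosons) or $f_{3/2}(z)$ (fermions) for z by bisection, and then evaluates $PV/NkT = g_{5/2}(z)/g_{3/2}(z)$ (or the corresponding ratio of f functions), which follows from (4.1.19') and (4.1.17). The function names, the bracket for z, and the sample value $\lambda^3/(gv) = 0.1$ are assumptions made for illustration only.

```python
import math


def series(nu, z, bosons, tol=1e-16, max_terms=100000):
    """g_nu(z) (bosons) or f_nu(z) (fermions) from the power series (4.2.1)."""
    total = 0.0
    for k in range(1, max_terms + 1):
        term = z ** k / k ** nu
        sign = 1.0 if (bosons or k % 2 == 1) else -1.0
        total += sign * term
        if term < tol:
            break
    return total


def fugacity(a, bosons):
    """Solve a = lambda^3/(g v) = g_{3/2}(z) or f_{3/2}(z) for z by bisection.

    The bracket 0 <= z <= 0.9 is an assumption that is adequate for small a.
    """
    lo, hi = 0.0, 0.9
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if series(1.5, mid, bosons) < a:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def pv_over_nkt(a, bosons):
    """PV/NkT from the full series, using (4.1.19') and (4.1.17)."""
    z = fugacity(a, bosons)
    return series(2.5, z, bosons) / series(1.5, z, bosons)


def pv_first_order(a, bosons):
    """First-order exchange correction of Eq. (4.2.5)."""
    return 1.0 - a / 2 ** 2.5 if bosons else 1.0 + a / 2 ** 2.5


A = 0.1  # hypothetical value of lambda^3/(g v), i.e. a dilute gas
for is_boson in (True, False):
    label = "bosons" if is_boson else "fermions"
    print(label, pv_over_nkt(A, is_boson), pv_first_order(A, is_boson))
```

The two numbers in each row should differ only at second order in $\lambda^3/v$, and the boson pressure should lie below, the fermion pressure above, the classical value.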

4.3 The Nearly-degenerate Ideal Fermi Gas

In this and the following section, we consider the opposite limit, in which quantum effects are predominant. Here, we must treat fermions and bosons separately; bosons are dealt with in Sect. 4.4. We first recall the properties of the ground state of fermions, independently of their statistical mechanics.

4.3.1 Ground State, T = 0 (Degeneracy)

We first deal with the ground state of a system of N fermions. It is obtained at a temperature of zero Kelvin. In the ground state, the N lowest single-particle states $|p\rangle$ are each singly occupied. If the energy depends only on the momentum $\mathbf{p}$, every value of $\mathbf{p}$ occurs g-fold. For the dispersion relation (4.1.2c), all the momenta within a sphere (the Fermi sphere), whose radius is called the Fermi momentum $p_F$ (Fig. 4.2), are thus occupied. The particle number is related to $p_F$ as follows:

$$N = g \sum_{p \leq p_F} 1 = g\, \frac{V}{(2\pi\hbar)^3} \int d^3p\; \Theta(p_F - p) = \frac{gV p_F^3}{6\pi^2\hbar^3} . \qquad (4.3.1)$$

Fig. 4.2. The occupation of the momentum states within the Fermi sphere

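From (4.3.1), we find the following relation between the particle density $n = N/V$ and the Fermi momentum:

$$p_F = \hbar \left(\frac{6\pi^2}{g}\right)^{1/3} n^{1/3} . \qquad (4.3.2)$$

The single-particle energy corresponding to the Fermi momentum is called the Fermi energy:

$$\varepsilon_F = \frac{p_F^2}{2m} = \left(\frac{6\pi^2}{g}\right)^{2/3} \frac{\hbar^2}{2m}\, n^{2/3} . \qquad (4.3.3)$$

For the ground-state energy, we find

$$E = \frac{gV}{(2\pi\hbar)^3} \int d^3p\; \frac{p^2}{2m}\, \Theta(p_F - p) = \frac{gV p_F^5}{20\pi^2\hbar^3 m} = \frac{3}{5}\, \varepsilon_F N . \qquad (4.3.4)$$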

From (4.1.21) and (4.3.4), the pressure of fermions at T = 0 is found to be

$$P = \frac{2}{5}\, \varepsilon_F\, n = \frac{1}{5} \left(\frac{6\pi^2}{g}\right)^{2/3} \frac{\hbar^2}{m}\, n^{5/3} . \qquad (4.3.5)$$

The degeneracy of the ground state is sufficiently small that the entropy and the product TS vanish at T = 0 (see also (4.3.19)). From this, and using (4.3.4) and (4.3.5), we obtain for the chemical potential using the Gibbs–Duhem relation $\mu = \frac{1}{N}(E + PV - TS)$:

$$\mu = \varepsilon_F . \qquad (4.3.6)$$

This result is also evident from the form of the ground state, which implies the occupation of all the levels up to the Fermi energy, from which it follows that the Fermi distribution of a system of N fermions at T = 0 becomes $n(\varepsilon) = \Theta(\varepsilon_F - \varepsilon)$. Clearly, one requires precisely the energy $\varepsilon_F$ in order to put one additional fermion into the system. The existence of the Fermi energy is a result of the Pauli principle and is thus a quantum effect.

4.3.2 The Limit of Complete Degeneracy

We now calculate the thermodynamic properties in the limit of large µ/kT. In Fig. 4.3, the Fermi distribution function

$$n(\varepsilon) = \frac{1}{e^{(\varepsilon - \mu)/kT} + 1} \qquad (4.3.7)$$

is shown for low temperatures. In comparison to a step function at the position µ, it is broadened within a region of width kT. We shall see below that µ is equal to $\varepsilon_F$ only at T = 0. For T = 0, the Fermi distribution function degenerates into a step function, so that one then speaks of a degenerate Fermi gas; at low T one refers to a nearly-degenerate Fermi gas. It is expedient to replace the prefactors in (4.1.19') and (4.1.15) with the Fermi energy (4.3.3)⁵; for the grand potential, one then obtains

$$\Phi = -N \varepsilon_F^{-3/2} \int_0^\infty d\varepsilon\; \varepsilon^{3/2}\, n(\varepsilon) , \qquad (4.3.8)$$

and the formula for N becomes

$$1 = \frac{3}{2}\, \varepsilon_F^{-3/2} \int_0^\infty d\varepsilon\; \varepsilon^{1/2}\, n(\varepsilon) . \qquad (4.3.9)$$

⁵ In (4.3.8) and (4.3.14), Φ is expressed as usual in terms of its natural variables T, V and µ, since $N \varepsilon_F^{-3/2} \propto V$. In (4.3.14'), the dependence on µ has been substituted by T and N/V, using (4.3.13).


Fig. 4.3. The Fermi distribution function n(ε) for low temperatures, compared with the step function Θ(µ − ε).

Fig. 4.4. The Fermi distribution function n(ε), and n(ε) − Θ(µ − ε).

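There thus still remain integrals of the type

$$I = \int_0^\infty d\varepsilon\; f(\varepsilon)\, n(\varepsilon) \qquad (4.3.10)$$

to be computed. The method of evaluation at low temperatures was given by Sommerfeld; I can be decomposed in the following manner:

$$\begin{aligned}
I &= \int_0^\mu d\varepsilon\; f(\varepsilon) + \int_0^\infty d\varepsilon\; f(\varepsilon) \left[ n(\varepsilon) - \Theta(\mu - \varepsilon) \right] \\
&\approx \int_0^\mu d\varepsilon\; f(\varepsilon) + \int_{-\infty}^\infty d\varepsilon\; f(\varepsilon) \left[ n(\varepsilon) - \Theta(\mu - \varepsilon) \right] ,
\end{aligned} \qquad (4.3.11)$$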

and for T → 0, the limit of integration in the second term can be extended to −∞ to a good approximation, since for negative ε, $n(\varepsilon) = 1 + O(e^{-(\mu - \varepsilon)/kT})$.⁶ One can see immediately from Fig. 4.4 that $n(\varepsilon) - \Theta(\mu - \varepsilon)$ differs from zero only in the neighborhood of ε = µ and is antisymmetric around µ.⁷ Therefore, we expand f(ε) around the value µ in a Taylor series and introduce a new integration variable, x = (ε − µ)/kT:

$$\begin{aligned}
I &= \int_0^\mu d\varepsilon\; f(\varepsilon) + \int_{-\infty}^\infty dx \left[ \frac{1}{e^x + 1} - \Theta(-x) \right] \left[ f'(\mu)\, (kT)^2\, x + \frac{f'''(\mu)}{3!}\, (kT)^4\, x^3 + \ldots \right] \\
&= \int_0^\mu d\varepsilon\; f(\varepsilon) + 2\, (kT)^2 f'(\mu) \int_0^\infty dx\, \frac{x}{e^x + 1} + \frac{2\, (kT)^4}{3!}\, f'''(\mu) \int_0^\infty dx\, \frac{x^3}{e^x + 1} + \ldots
\end{aligned}$$

(since $\frac{1}{e^x + 1} - \Theta(-x)$ is antisymmetric and equal to $\frac{1}{e^x + 1}$ for x > 0). From this, the general expansion in terms of the temperature follows, making use of the integrals computed in Appendix D, Eq. (D.7):⁸

$$I = \int_0^\mu d\varepsilon\; f(\varepsilon) + \frac{\pi^2}{6}\, (kT)^2 f'(\mu) + \frac{7\pi^4}{360}\, (kT)^4 f'''(\mu) + \ldots . \qquad (4.3.12)$$

Applying this expansion to Eq. (4.3.9), we find

$$1 = \left(\frac{\mu}{\varepsilon_F}\right)^{3/2} \left\{ 1 + \frac{\pi^2}{8} \left(\frac{kT}{\mu}\right)^2 + O(T^4) \right\} .$$

This equation can be solved iteratively for µ, yielding the chemical potential as a function of T and N/V:

$$\mu = \varepsilon_F \left\{ 1 - \frac{\pi^2}{12} \left(\frac{kT}{\varepsilon_F}\right)^2 + O(T^4) \right\} , \qquad (4.3.13)$$

where $\varepsilon_F$ is given by (4.3.3). The chemical potential decreases with increasing temperature, since then no longer all the states within the Fermi sphere are occupied. In a similar way, we find for (4.3.8)

$$\Phi = -N \varepsilon_F^{-3/2} \left\{ \frac{2}{5}\, \mu^{5/2} + \frac{\pi^2}{6}\, (kT)^2\, \frac{3}{2}\, \mu^{1/2} + \ldots \right\} , \qquad (4.3.14)$$

from which, inserting (4.3.13),⁹

$$\Phi = -\frac{2}{5}\, N \varepsilon_F \left\{ 1 + \frac{5\pi^2}{12} \left(\frac{kT}{\varepsilon_F}\right)^2 + O(T^4) \right\} \qquad (4.3.14')$$

or, using P = −Φ/V, we obtain the equation of state. From (4.1.21), we find immediately the internal energy

$$E = \frac{3}{2}\, PV = \frac{3}{5}\, N \varepsilon_F \left\{ 1 + \frac{5\pi^2}{12} \left(\frac{kT}{\varepsilon_F}\right)^2 + O(T^4) \right\} . \qquad (4.3.15)$$

⁶ If f(ε) is in principle defined only for positive ε, one can e.g. define f(−ε) = f(ε); the result depends on f(ε) only for positive ε.

⁷ $\frac{1}{e^x + 1} - \Theta(-x) = 1 - \frac{1}{e^{-x} + 1} - \Theta(-x) = -\left[ \frac{1}{e^{-x} + 1} - \Theta(x) \right]$.

⁸ This series is an asymptotic expansion in T. An asymptotic series for a function I(λ), $I(\lambda) = \sum_{k=0}^{m} a_k \lambda^k + R_m(\lambda)$, is characterized by the following behavior of the remainder: $\lim_{\lambda \to 0} R_m(\lambda)/\lambda^m = 0$, $\lim_{m \to \infty} R_m(\lambda) = \infty$. For small values of λ, the function can be represented very accurately by a finite number of terms in the series. The fact that the integral in (4.3.10) for functions $f(\varepsilon) \sim \varepsilon^{1/2}$ etc. cannot be expanded in a Taylor series can be immediately recognized, since I diverges for T < 0.

From this, we calculate the heat capacity at constant V and N:

$$C_V = Nk\, \frac{\pi^2}{2}\, \frac{T}{T_F} , \qquad (4.3.16)$$

where we have introduced the Fermi temperature

$$T_F = \varepsilon_F / k . \qquad (4.3.17)$$

At low temperatures ($T \ll T_F$), the heat capacity is a linear function of the temperature (Fig. 4.5). This behavior can be qualitatively understood in a simple way: if one increases the temperature from zero to T, the energy of a portion of the particles increases by kT. The number of particles which are excited in this manner is limited to a shell of thickness kT around the Fermi sphere, i.e. it is given by $N kT/\varepsilon_F$. All together, the energy increase is

$$\delta E \sim kT\, N\, \frac{kT}{\varepsilon_F} , \qquad (4.3.16')$$

from which, as in (4.3.16), we obtain $C_V \sim kN\, T/T_F$. According to (4.3.14'), the pressure is given by

$$P = \frac{2}{5} \left(\frac{6\pi^2}{g}\right)^{2/3} \frac{\hbar^2}{2m} \left(\frac{N}{V}\right)^{5/3} \left[ 1 + \frac{5\pi^2}{12} \left(\frac{kT}{\varepsilon_F}\right)^2 + \ldots \right] . \qquad (4.3.14'')$$

Due to the Pauli exclusion principle, there is a pressure increase at T = 0 relative to a classical ideal gas, as can be seen in Fig. 4.6. The isothermal compressibility is then

$$\kappa_T = -\frac{1}{V} \left(\frac{\partial V}{\partial P}\right)_T = \frac{3\,(V/N)}{2\varepsilon_F} \left[ 1 - \frac{\pi^2}{12} \left(\frac{kT}{\varepsilon_F}\right)^2 + \ldots \right] . \qquad (4.3.18)$$

⁹ If one requires the grand potential as a function of its natural variables, it is necessary to substitute $N \varepsilon_F^{-3/2} = V g\, (2m)^{3/2}/6\pi^2\hbar^3$ in (4.3.14). For the calculation of $C_V$ and the equation of state, it is however expedient to employ T, V, and N as variables.

Fig. 4.5. The specific heat (heat capacity) of the ideal Fermi gas

Fig. 4.6. The pressure as a function of the temperature for the ideal Fermi gas (solid curve) and the ideal classical gas (dashed)

For the entropy, we find for $T \ll T_F$

$$S = kN\, \frac{\pi^2}{2}\, \frac{T}{T_F} \qquad (4.3.19)$$

with TS = E + PV − µN from (4.3.15), (4.3.14') and (4.3.13) (cf. Appendix A.1, ‘Third Law’). The chemical potential of an ideal Fermi gas with a fixed density can be found from Eq. (4.3.9) and is shown in Fig. 4.7 as a function of the temperature.

Fig. 4.7. The chemical potential of the ideal Fermi gas at fixed density as a function of the temperature (axes: $\mu/\varepsilon_F$ versus $kT/\varepsilon_F$).
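The curve µ(T) at fixed density can be obtained numerically from the condition (4.3.9) and compared with the Sommerfeld result (4.3.13). The following sketch works in units $\varepsilon_F = 1$ with $t = kT/\varepsilon_F$; the substitution $\varepsilon = u^2$, the Simpson rule, the integration cutoff and the bisection bracket are numerical choices made here, not part of the text.

```python
import math


def fermi(x):
    """Fermi function 1/(e^x + 1), guarded against overflow for large x."""
    if x > 700.0:
        return 0.0
    return 1.0 / (math.exp(x) + 1.0)


def density_integral(mu, t, steps=2000):
    """Right-hand side of (4.3.9) in units eps_F = 1, with t = kT/eps_F.

    With eps = u^2 the integrand sqrt(eps) n(eps) d(eps) becomes
    2 u^2 n(u^2) du, which is integrated by Simpson's rule up to a cutoff.
    """
    u_max = math.sqrt(max(mu, 0.0) + 40.0 * t)
    h = u_max / steps
    total = 0.0
    for i in range(steps + 1):
        u = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * 2.0 * u * u * fermi((u * u - mu) / t)
    return 1.5 * total * h / 3.0


def chemical_potential(t):
    """Solve density_integral(mu, t) = 1 for mu/eps_F by bisection."""
    lo, hi = -30.0, 5.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if density_integral(mid, t) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def mu_sommerfeld(t):
    """Leading low-temperature result, Eq. (4.3.13)."""
    return 1.0 - math.pi ** 2 / 12.0 * t * t


for t in (0.05, 0.1, 0.5, 1.0, 1.5):
    print(t, chemical_potential(t), mu_sommerfeld(t))
```

At small t the two columns should agree up to terms of order $t^4$, while at the higher temperatures the numerical µ becomes negative, as in the classical limit of Sect. 4.2, and the low-temperature expansion is no longer applicable.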

Addenda:

(i) The Fermi temperature, also known as the degeneracy temperature,

$$T_F\,[\mathrm{K}] = \frac{\varepsilon_F}{k} = 3.85 \times 10^{-38}\, \frac{1}{m[\mathrm{g}]} \left(\frac{N}{V[\mathrm{cm}^3]}\right)^{2/3} \qquad (4.3.20)$$

characterizes the thermodynamic behavior of fermions (see Table 4.1). For $T \ll T_F$, the system is nearly degenerate, while for $T \gg T_F$, the classical limit applies. Fermi energies are usually quoted in electron volts (eV). Conversion to Kelvins is accomplished using 1 eV ≙ 11605 K.

(ii) The density of states is defined as

$$\nu(\varepsilon) = \frac{Vg}{(2\pi\hbar)^3} \int d^3p\; \delta(\varepsilon - \varepsilon_{\mathbf{p}}) . \qquad (4.3.21)$$

We note that ν(ε) is determined merely by the dispersion relation and not by statistics. The thermodynamic quantities do not depend on the details of the momentum dependence of the energy levels, but only on their distribution, i.e. on the density of states. Integrals over momentum space, whose integrands depend only on $\varepsilon_{\mathbf{p}}$, can be rearranged as follows:

$$\int d^3p\; f(\varepsilon_{\mathbf{p}}) = \int d\varepsilon \int d^3p\; f(\varepsilon)\, \delta(\varepsilon - \varepsilon_{\mathbf{p}}) = \frac{(2\pi\hbar)^3}{Vg} \int d\varepsilon\; \nu(\varepsilon)\, f(\varepsilon) .$$

For example, the particle number can be expressed in terms of the density of states in the form

$$N = \int_{-\infty}^{\infty} d\varepsilon\; \nu(\varepsilon)\, n(\varepsilon) . \qquad (4.3.22)$$

For free electrons, we find from (4.3.21)

$$\nu(\varepsilon) = \frac{gV}{4\pi^2} \left(\frac{2m}{\hbar^2}\right)^{3/2} \varepsilon^{1/2} = \frac{3}{2}\, N\, \frac{\varepsilon^{1/2}}{\varepsilon_F^{3/2}} . \qquad (4.3.23)$$

The dependence on $\varepsilon^{1/2}$ shown in Fig. 4.8 is characteristic of nonrelativistic, noninteracting material particles.

Fig. 4.8. The density of states for free electrons in three dimensions

The derivations of the specific heat and the compressibility given above can be generalized to the case of arbitrary densities of states ν(ε) by evaluating (4.3.9) and (4.3.8) in terms of a general ν(ε). The results are

$$C_V = \frac{1}{3}\, \pi^2\, \nu(\varepsilon_F)\, k^2 T + O\!\left((T/T_F)^3\right) \qquad (4.3.24a)$$

and

$$\kappa_T = \frac{V}{N^2}\, \nu(\varepsilon_F) + O\!\left((T/T_F)^2\right) . \qquad (4.3.24b)$$

The fact that only the value of the density of states at the Fermi energy is of importance for the low-temperature behavior of the system was to be expected after the discussion following equation (4.3.17). For (4.3.23), we find from (4.3.24a,b) once again the results (4.3.16) and (4.3.18).

(iii) Degenerate Fermi liquids: physical examples of degenerate Fermi liquids are listed in Table 4.1.

Table 4.1. Degenerate Fermi liquids: mass, density, Fermi energy, Fermi temperature

| Particles | m [g] | N/V [cm⁻³] | εF [eV] | TF [K] |
| Metal electrons | 0.91 × 10⁻²⁷ | 10²⁴ | < 10 | 10⁵ |
| ³He, P = 0–30 bar | 5.01 × 10⁻²⁴ (m*/m = 2.8–5.5) | (1.6–2.3) × 10²² | (1.5–0.9) × 10⁻⁴ | 1.7–1.1 |
| Neutrons in the nucleus | 1.67 × 10⁻²⁴ | 0.11 × 10³⁹ × (A−Z)/A | 46 × ((A−Z)/A)^(2/3) × 10⁶ | 5.3 × 10¹¹ × ((A−Z)/A)^(2/3) |
| Protons in the nucleus | 1.67 × 10⁻²⁴ | 0.11 × 10³⁹ × Z/A | 46 × (Z/A)^(2/3) × 10⁶ | 5.3 × 10¹¹ × (Z/A)^(2/3) |
| Electrons in white dwarf stars | 0.91 × 10⁻²⁷ | 10³⁰ | 3 × 10⁵ | 3 × 10⁹ |

(iv) Coulomb interaction: electrons in metals are not free, but rather they repel each other as a result of their Coulomb interactions,

$$H = \sum_i \frac{\mathbf{p}_i^2}{2m} + \frac{1}{2} \sum_{i \neq j} \frac{e^2}{r_{ij}} . \qquad (4.3.25)$$

The following scaling of the Hamiltonian shows that the approximation of free electrons is particularly reasonable for high densities. To see this, we carry out the canonical transformation $\mathbf{r}' = \mathbf{r}/r_0$, $\mathbf{p}' = \mathbf{p}\, r_0$. The characteristic length $r_0$ is defined by $\frac{4\pi}{3} r_0^3 N = V$, i.e. $r_0 = \left(\frac{3V}{4\pi N}\right)^{1/3}$. In terms of these new variables, the Hamiltonian is

$$H = \frac{1}{r_0^2} \left( \sum_i \frac{\mathbf{p}_i'^2}{2m} + r_0\, \frac{1}{2} \sum_{i \neq j} \frac{e^2}{r_{ij}'} \right) . \qquad (4.3.25')$$

The Coulomb interaction becomes less and less important relative to the kinetic energy the smaller $r_0$ is, i.e. the more dense the gas becomes.
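The orders of magnitude quoted in these addenda can be reproduced directly from (4.3.3) and (4.3.17). The sketch below uses CGS units and g = 2; the numerical values of ℏ, k and the electron-volt are standard constants supplied here (they are not given in the text), and the function names are hypothetical. The densities and masses are those quoted in the text for Na conduction electrons, liquid ³He at P = 0 (bare mass), and electrons in a white dwarf.

```python
import math

HBAR = 1.054571817e-27  # erg s   (standard value, supplied here)
K_B = 1.380649e-16      # erg / K (standard value, supplied here)
EV = 1.602176634e-12    # erg     (standard value, supplied here)


def fermi_energy_ev(n_cm3, m_g, g=2):
    """Fermi energy (4.3.3) in eV for density n [cm^-3] and mass m [g]."""
    prefactor = (6.0 * math.pi ** 2 / g) ** (2.0 / 3.0)
    return prefactor * HBAR ** 2 / (2.0 * m_g) * n_cm3 ** (2.0 / 3.0) / EV


def fermi_temperature_k(n_cm3, m_g, g=2):
    """Fermi temperature (4.3.17), T_F = eps_F / k, in Kelvin."""
    return fermi_energy_ev(n_cm3, m_g, g) * EV / K_B


# Prefactor of Eq. (4.3.20) for g = 2, in the units used there.
coefficient = (3.0 * math.pi ** 2) ** (2.0 / 3.0) * HBAR ** 2 / (2.0 * K_B)

examples = {
    "Na conduction electrons": (2.5e22, 0.91e-27),
    "liquid 3He, P = 0 (bare mass)": (1.6e22, 5.01e-24),
    "white-dwarf electrons": (1.0e30, 0.91e-27),
}
print("coefficient in (4.3.20):", coefficient)
for name, (n, m) in examples.items():
    print(name, fermi_energy_ev(n, m), "eV", fermi_temperature_k(n, m), "K")
```

The results should come out close to the tabulated entries (about 3 eV for sodium, about 5 K for non-interacting ³He, a few 10⁵ eV for white-dwarf electrons); small differences from the tables are to be expected, since the table values are rounded and the interacting systems involve effective masses.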

∗4.3.3 Real Fermions

In this section, we will consider real fermionic many-body systems: the conduction electrons in metals, liquid ³He, protons and neutrons in atomic nuclei, electrons in white dwarf stars, neutrons in neutron stars. All of these fermions interact; however, one can understand many of their properties without taking their interactions into account. In the following, we will deal with the parameters mass, Fermi energy, and temperature and discuss the modifications which must be made as a result of the interactions (see also Table 4.1).

a) The Electron Gas in Solids

The alkali metals Li, Na, K, Rb, and Cs are monovalent (with a body-centered cubic crystal structure); e.g. Na has a single 3s¹ electron (Table 4.2). The noble metals (face-centered cubic crystal structure) are

Copper Cu 4s¹ 3d¹⁰
Silver Ag 5s¹ 4d¹⁰
Gold Au 6s¹ 5d¹⁰ .

All of these elements have one valence electron per atom, which becomes a conduction electron in the metal. The number of these quasi-free electrons is equal to the number of atoms. The energy-momentum relation is to a good approximation parabolic, $\varepsilon_{\mathbf{p}} = \frac{\mathbf{p}^2}{2m}$.¹⁰

Table 4.2. Electrons in Metals; Element, Density, Fermi Energy, Fermi Temperature, γ/γtheor., Effective Mass

| Element | N/V [10²² cm⁻³] | εF [eV] | TF [10⁴ K] | γ/γtheor. | m*/m |
| Li | 4.6 | 4.7 | 5.5 | 2.17 | 2.3 |
| Na | 2.5 | 3.1 | 3.7 | 1.21 | 1.3 |
| K | 1.34 | 2.1 | 2.4 | 1.23 | 1.2 |
| Rb | 1.08 | 1.8 | 2.1 | 1.22 | 1.3 |
| Cs | 0.86 | 1.5 | 1.8 | 1.35 | 1.5 |
| Cu | 8.5 | 7 | 8.2 | 1.39 | 1.3 |
| Ag | 5.76 | 5.5 | 6.4 | 1.00 | 1.1 |
| Au | 5.9 | 5.5 | 6.4 | 1.13 | 1.1 |

¹⁰ Remark concerning solid-state physics applications: for Na, we have $\frac{4\pi}{3} \left(\frac{p_F}{\hbar}\right)^3 = 4\pi^3\, \frac{N}{V} = \frac{1}{2}\, V_{\rm Brill.}$, where $V_{\rm Brill.}$ is the volume of the first Brillouin zone. The Fermi sphere always lies within the Brillouin zone and thus never crosses the zone boundary, where there are energy gaps and deformations of the Fermi surface. The Fermi surface is therefore in practice spherical, $\Delta p_F/p_F \approx 10^{-3}$. Even in copper, where the 4s Fermi surface intersects the Brillouin zone of the fcc lattice, the Fermi surface remains in most regions spherical to a good approximation.


Fig. 4.9. The experimental determination of γ from the speciﬁc heat of gold (D. L. Martin, Phys. Rev. 141, 576 (1966); ibid. 170, 650 (1968))

Taking account of the electron-electron interactions requires many-body methods, which are not at our disposal here. The interaction of two electrons is weakened by screening from the other electrons; in this sense, it is understandable that the interactions can be neglected to a first approximation in treating many phenomena (e.g. Pauli paramagnetism; but not ferromagnetism). The total specific heat of a metal is composed of a contribution from the electrons (Fig. 4.9) and from the phonons (lattice vibrations, see Sect. 4.6):

$$\frac{C_V}{N} = \gamma T + D T^3 .$$

Plotting $\frac{C_V}{NT} = \gamma + D T^2$ vs. $T^2$, we can read γ off the ordinate. From (4.3.16), the theoretical value of γ is $\gamma_{\rm theor} = \frac{\pi^2 k^2}{2\varepsilon_F}$. The deviations between theory and experiment can be attributed to the fact that the electrons move in the potential of the ions in the crystal and are subject to the influence of the electron-electron interaction. The potential and the electron-electron interaction lead among other things to an effective mass m* for the electrons, i.e. the dispersion relation is approximately given by $\varepsilon_{\mathbf{p}} = \frac{\mathbf{p}^2}{2m^*}$. This effective mass can be larger or smaller than the mass of free electrons.
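As a small worked example of $\gamma_{\rm theor} = \pi^2 k^2/2\varepsilon_F$, the sketch below converts it to a molar quantity, $\pi^2 R/(2 T_F)$, assuming one conduction electron per atom as stated above for the alkali and noble metals. The gas constant R is a standard value supplied here, the Fermi temperatures and ratios $\gamma/\gamma_{\rm theor}$ are the rounded entries of Table 4.2, and the "implied" experimental values are simply the product of the two, not independent data.

```python
import math

R_GAS = 8.314462618  # J / (mol K), standard value supplied here


def gamma_theor_molar(t_fermi_k):
    """Molar gamma_theor = pi^2 R / (2 T_F), in J / (mol K^2).

    Follows from gamma_theor = pi^2 k^2 / (2 eps_F) per electron, assuming
    one conduction electron per atom.
    """
    return math.pi ** 2 * R_GAS / (2.0 * t_fermi_k)


# (T_F [K], gamma / gamma_theor), rounded values read from Table 4.2
table_4_2 = {
    "Na": (3.7e4, 1.21),
    "Cu": (8.2e4, 1.39),
    "Au": (6.4e4, 1.13),
}

for element, (t_f, ratio) in table_4_2.items():
    gamma_th = gamma_theor_molar(t_f)
    print(element, "gamma_theor =", gamma_th, "implied gamma =", ratio * gamma_th)
```

The values come out at somewhat below one millijoule per mole and Kelvin squared, which shows why the electronic term γT is visible next to the phonon term only at low temperatures.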

b) The Fermi Liquid ³He

³He has a nuclear spin of I = 1/2, a mass m = 5.01 × 10⁻²⁴ g, a particle density of n = 1.6 × 10²² cm⁻³ at P = 0, and a mass density of 0.081 g cm⁻³. It follows that $\varepsilon_F$ = 4.2 × 10⁻⁴ eV and $T_F$ = 4.9 K. The interactions of the ³He atoms lead to an effective mass which at the pressures P = 0 and P = 30 bar is given by m* = 2.8 m and m* = 5.5 m. Hence the Fermi temperature for P = 30 bar, $T_F$ ≈ 1 K, is reduced relative to a fictitious non-interacting ³He gas. The particle densities at these pressures are n = 1.6 × 10²² cm⁻³ and n = 2.3 × 10²² cm⁻³. The interaction between the helium atoms is short-ranged, in contrast to the electron-electron interaction. The small mass of the helium atoms leads to large zero-point oscillations; for this reason, ³He, like ⁴He, remains a liquid at pressures below ∼ 30 bar, even at T → 0. ³He and ⁴He are termed quantum liquids. At 10⁻³ K, a phase transition into the superfluid state takes place (l = 1, s = 1) with formation of BCS pairs.¹¹ In the superconductivity of metals, the Cooper pairs formed by the electrons have l = 0 and s = 0. The relatively complex phase diagram of ³He is shown in Fig. 4.10.¹¹

Fig. 4.10. The phase diagram of ³He

c) Nuclear Matter

A further example of many-body systems containing fermions are the neutrons and protons in the nucleus, which both have masses of about m = 1.67 × 10⁻²⁴ g. The nuclear radius depends on the nucleon number A via $R = 1.3 \times 10^{-13} A^{1/3}$ cm. The nuclear volume is $V = \frac{4\pi}{3} R^3 = \frac{4\pi}{3} (1.3)^3 \times 10^{-39} A\ \mathrm{cm}^3 = 9.2 \times 10^{-39} A\ \mathrm{cm}^3$. A is the overall number of nucleons and Z the number of protons in the nucleus. Nuclear matter¹² occurs not only within large atomic nuclei, but also in neutron stars, where, however, the gravitational interactions must also be taken into account.

¹¹ D. Vollhardt and P. Wölfle, The Superfluid Phases of Helium 3, Taylor & Francis, London, 1990

¹² A. L. Fetter and J. D. Walecka, Quantum Theory of Many-Particle Systems, McGraw-Hill, New York 1971

d) White Dwarfs

The properties of the (nearly) free electron gas are indeed of fundamental importance for the stability of the white dwarfs which can occur at the final stages of stellar evolution.¹³ The first such white dwarf to be identified, Sirius B, was predicted by Bessel as a companion of Sirius.

Mass ≈ $M_\odot$ = 1.99 × 10³³ g
Radius 0.01 $R_\odot$, $R_\odot$ = 7 × 10¹⁰ cm
Density ≈ 10⁷ $\rho_\odot$ = 10⁷ g/cm³, $\rho_\odot$ = 1 g/cm³; $\rho_{\rm Sirius\,B}$ ≈ 0.69 × 10⁵ g/cm³
Central temperature ≈ 10⁷ K ≈ $T_\odot$

White dwarfs consist of ionized nuclei and free electrons. Helium can still be burned in white dwarfs. The Fermi temperature is $T_F$ ≈ 3 · 10⁹ K, so that the electron gas is highly degenerate. The high zero-point pressure of the electron gas opposes the gravitational attraction of the nuclei which compresses the star. The electrons can in fact be regarded as free; their Coulomb repulsion is negligible at these high pressures.

∗e) The Landau Theory of Fermi Liquids

The characteristic temperature dependences found for ideal Fermi gases at low temperatures remain in effect in the presence of interactions. This is the result of Landau's Fermi liquid theory, which is based on physical arguments that can also be justified in terms of microscopic quantum-mechanical many-body theory. We give only a sketch of this theory, including its essential results, and refer the reader to more detailed literature.¹⁴ One first considers the ground state of the ideal Fermi gas, and the ground state with an additional particle (of momentum $\mathbf p$); then the interaction is 'switched on'. The ideal ground state becomes a modified ground state, and the state with the additional particle becomes the modified ground state plus an excited quantum (a quasiparticle of momentum $\mathbf p$). The energy of the quantum, $\varepsilon(\mathbf p)$, is shifted relative to $\varepsilon_0(\mathbf p) \equiv p^2/2m$. Since every non-interacting single-particle state is only singly occupied, there are also no multiply-occupied quasiparticle states; i.e. the quasiparticles also obey Fermi–Dirac statistics. When several quasiparticles are excited, their energy also depends upon the number $\delta n(\mathbf p')$ of the other excitations:

$$\varepsilon(\mathbf p) = \varepsilon_0(\mathbf p) + \sum_{\mathbf p'} F(\mathbf p, \mathbf p')\,\delta n(\mathbf p') . \tag{4.3.26}$$

¹³ An often-used classification of the stars in astronomy is based on their positions in the Hertzsprung–Russell diagram, in which their magnitudes are plotted against their colors (equivalent to their surface temperatures). Most stars lie on the so-called main sequence. These stars have masses ranging from about one tenth of the Sun's mass up to sixty solar masses and are in the evolutionary stages in which hydrogen is converted to helium by nuclear fusion ('burning'). During about 90% of their evolution, the stars stay on the main sequence, as long as nuclear fusion and gravitational attraction are in balance. When the fusion processes come to an end as their 'fuel' is exhausted, gravitational forces become predominant. In their further evolution, the stars become red giants and finally contract to one of the following end stages: in stars with less than 1.4 solar masses, the compression process is brought to a halt by the increase of the Fermi energy of the electrons, and a white dwarf is formed, consisting mainly of helium and electrons. Stars of two to three solar masses end their contraction, after passing through intermediate phases, as neutron stars. Above three or four solar masses, the Fermi energy of the neutrons is no longer able to stop the compression process, and a black hole results.

¹⁴ A detailed description of Landau's Fermi liquid theory can be found in D. Pines and P. Nozières, The Theory of Quantum Liquids, W. A. Benjamin, New York 1966, as well as in J. Wilks, The Properties of Liquid and Solid Helium, Clarendon Press, Oxford, 1967. See also J. Wilks and D. S. Betts, An Introduction to Liquid Helium, 2nd ed., Oxford University Press, Oxford, 1987.

The average occupation number takes a form similar to that of ideal fermions, owing to the fermionic character of the quasiparticles:

$$n_{\mathbf p} = \frac{1}{e^{(\varepsilon(\mathbf p)-\mu)/kT} + 1} , \tag{4.3.27}$$

where, according to (4.3.26), $\varepsilon(\mathbf p)$ itself depends on the occupation number. This relation is usually derived in the present context by maximizing the entropy expression found in problem 4.2, which can be obtained from purely combinatorial considerations. At low temperatures, the quasiparticles are excited only near the Fermi energy, and due to the occupied states and energy conservation, the phase space for scattering processes is severely limited. Although the interactions are by no means necessarily weak, the scattering rate vanishes with temperature as $1/\tau \sim T^2$, i.e. the quasiparticles are practically stable particles. The interaction between the quasiparticles can be written in the form

$$F(\mathbf p, \boldsymbol\sigma; \mathbf p', \boldsymbol\sigma') = f^s(\mathbf p, \mathbf p') + \boldsymbol\sigma \cdot \boldsymbol\sigma'\, f^a(\mathbf p, \mathbf p') \tag{4.3.28a}$$

with the Pauli spin matrices $\boldsymbol\sigma$. Since only momenta in the neighborhood of the Fermi momentum contribute, we introduce

$$f^{s,a}(\mathbf p, \mathbf p') = f^{s,a}(\chi) \tag{4.3.28b}$$

and

$$F^{s,a}(\chi) = \nu(\varepsilon_F)\, f^{s,a}(\chi) = \frac{V m^* p_F}{\pi^2 \hbar^3}\, f^{s,a}(\chi) , \tag{4.3.28c}$$

where $\chi$ is the angle between $\mathbf p$ and $\mathbf p'$ and $\nu(\varepsilon_F)$ is the density of states. A series expansion in terms of Legendre polynomials leads to

$$F^{s,a}(\chi) = \sum_l F_l^{s,a} P_l(\cos\chi) = F_0^{s,a} + F_1^{s,a} \cos\chi + \ldots . \tag{4.3.28d}$$

The $F_l^s$ and $F_l^a$ are the spin-symmetric and spin-antisymmetric Landau parameters; the $F_l^a$ result from the exchange interaction.


4. Ideal Quantum Gases

Due to the Fermi character of the quasiparticles, which at low temperatures can be excited only near the Fermi energy, it is clear from the qualitative estimate (4.3.16′) that the specific heat of the Fermi liquid will also have a linear temperature dependence. In detail, one obtains for the specific heat, the compressibility, and the magnetic susceptibility:

$$C_V = \frac{1}{3}\pi^2\, \nu(\varepsilon_F)\, k^2 T , \tag{4.3.29a}$$

$$\kappa_T = \frac{V}{N^2}\, \frac{\nu(\varepsilon_F)}{1 + F_0^s} , \tag{4.3.29b}$$

$$\chi = \mu_B^2\, \frac{\nu(\varepsilon_F)\, N}{1 + F_0^a} , \tag{4.3.29c}$$

with the density of states $\nu(\varepsilon_F) = \dfrac{V m^* p_F}{\pi^2 \hbar^3}$ and the effective mass ratio

$$\frac{m^*}{m} = 1 + \frac{1}{3} F_1^s . \tag{4.3.29d}$$

The structure of the results is the same as for ideal fermions.

4.4 The Bose–Einstein Condensation

In this section, we investigate the low-temperature behavior of a nonrelativistic ideal Bose gas of spin $s = 0$, i.e. $g = 1$ and

$$\varepsilon_{\mathbf p} = \frac{p^2}{2m} . \tag{4.4.1}$$

In their ground state, non-interacting bosons all occupy the energetically lowest single-particle state; their low-temperature behavior is therefore quite different from that of fermions. Between the high-temperature phase, where the bosons are distributed over the whole spectrum of momentum values according to the Bose distribution function, and the phase in which the ($\mathbf p = 0$) state is macroscopically occupied (at $T = 0$, all the particles are in this state), a phase transition takes place. This so-called Bose–Einstein condensation of an ideal Bose gas was predicted by Einstein¹⁵ on the basis of the statistical considerations of Bose, nearly seventy years before it was observed experimentally. We first refer to the results of Sect. 4.1, where we found for the particle density, i.e. for the reciprocal of the specific volume, in Eq. (4.1.17):

$$\frac{\lambda^3}{v} = g_{3/2}(z) \tag{4.4.2a}$$

¹⁵ A. Einstein, Sitzber. Kgl. Preuss. Akad. Wiss. 1924, 261 (1924); ibid. 1925, 3 (1925); S. Bose, Z. Phys. 26, 178 (1924).


with $\lambda = \hbar\sqrt{2\pi/(mkT)}$ and, using (4.2.1),

$$g_{3/2}(z) = \frac{2}{\sqrt{\pi}} \int_0^\infty dx\, \frac{x^{1/2}}{e^x z^{-1} - 1} = \sum_{k=1}^\infty \frac{z^k}{k^{3/2}} . \tag{4.4.2b}$$

According to Remark (i) in Sect. 4.1, the fugacity of bosons $z = e^{\mu/kT}$ is limited to $z \le 1$. The maximum value of the function $g_{3/2}(z)$, which is shown in Fig. 4.11, is then given by $g_{3/2}(1) = \zeta(3/2) = 2.612$.
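The series in (4.4.2b) is easy to evaluate numerically. The following is a minimal sketch, not a library routine: the function names `zeta` and `bose_g` are chosen here for illustration, and the value at $z = 1$ is obtained from a truncated sum plus the leading Euler–Maclaurin tail terms, which is an assumption of this sketch rather than the method of the text.

```python
import math


def zeta(s, cutoff=1000):
    """Riemann zeta function for s > 1.

    Direct sum over k < cutoff plus the leading Euler-Maclaurin
    terms that approximate the remaining tail k >= cutoff.
    """
    head = sum(k ** -s for k in range(1, cutoff))
    big_k = float(cutoff)
    tail = (big_k ** (1.0 - s) / (s - 1.0)
            + 0.5 * big_k ** -s
            + s * big_k ** (-s - 1.0) / 12.0)
    return head + tail


def bose_g(nu, z, rel_tol=1e-15, max_terms=10_000_000):
    """Bose function g_nu(z) = sum_{k>=1} z**k / k**nu for 0 <= z <= 1."""
    if not 0.0 <= z <= 1.0:
        raise ValueError("the fugacity of bosons is limited to 0 <= z <= 1")
    if z == 1.0:
        return zeta(nu)  # only meaningful for nu > 1
    total, z_power = 0.0, 1.0
    for k in range(1, max_terms + 1):
        z_power *= z
        term = z_power / k ** nu
        total += term
        if term <= rel_tol * total:  # remaining terms are negligible
            break
    return total


# Maximum values reached at z = 1, and the small-z behaviour g_nu(z) ~ z.
print(bose_g(1.5, 1.0), bose_g(2.5, 1.0), bose_g(1.5, 1e-3))
```

For $z$ very close to (but below) 1 the plain series converges slowly; the sketch then simply runs more terms.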

Fig. 4.11. The function g3/2 (z).

Fig. 4.12. The fugacity z as a function of v/λ3

In the following, we take the particle number and the volume, and thus the specific volume $v$, to be fixed at given values. Then from Eq. (4.4.2a), we can calculate $z$ as a function of $T$, or, more expediently, of $v\lambda^{-3}$. On lowering the temperature, $v/\lambda^3$ decreases and $z$ therefore increases, until finally at $v/\lambda^3 = 1/2.612$ it reaches its maximum value $z = 1$ (Fig. 4.12). This defines a characteristic temperature

$$kT_c(v) = \frac{2\pi\hbar^2/m}{(2.612\, v)^{2/3}} . \tag{4.4.3}$$
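As a numerical illustration of (4.4.3), the sketch below evaluates $T_c(v)$ for an ideal gas with the mass and density of liquid $^4$He. The helium mass and the liquid density of about 145 kg/m³ are assumed input values that are not given in the text; the function name `bec_temperature` is likewise hypothetical.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J s
K_B = 1.380649e-23      # Boltzmann constant in J/K
G32_MAX = 2.612         # g_{3/2}(1) = zeta(3/2)


def bec_temperature(mass, specific_volume):
    """Characteristic temperature T_c(v) of Eq. (4.4.3), in kelvin.

    mass in kg, specific_volume (volume per particle) in m^3.
    """
    return (2.0 * math.pi * HBAR ** 2
            / (mass * K_B * (G32_MAX * specific_volume) ** (2.0 / 3.0)))


# Assumed illustrative inputs: mass of a 4He atom and liquid-helium density.
M_HE4 = 6.6465e-27       # kg
RHO_LIQUID_HE = 145.0    # kg/m^3 (assumption, not from the text)
v_helium = M_HE4 / RHO_LIQUID_HE

print(bec_temperature(M_HE4, v_helium))  # a few kelvin
```

With these inputs the result lies close to 3 K; halving the linear dimension per particle (reducing $v$ by a factor of 8) raises $T_c$ by a factor of 4, in line with the $v^{-2/3}$ dependence.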

When $z$ approaches 1, we must be more careful in taking the limit $\sum_{\mathbf p} \to \frac{V}{(2\pi\hbar)^3}\int d^3p$ used in (4.1.14a) and (4.1.15). This is also indicated by the fact that (4.4.2a) would imply for $z = 1$ that at temperatures below $T_c(v)$, the density $1/v$ must decrease with decreasing temperature. From (4.4.2a), there would appear to no longer be enough space for all the particles. Clearly, we have to treat the ($\mathbf p = 0$) term in the sum in (4.1.8), which diverges for $z \to 1$, separately:

$$N = \frac{1}{z^{-1} - 1} + \sum_{\mathbf p \neq 0} n(\varepsilon_{\mathbf p}) = \frac{1}{z^{-1} - 1} + \frac{V}{(2\pi\hbar)^3} \int d^3p\; n(\varepsilon_{\mathbf p}) .$$

The p = 0 state for fermions did not require any special treatment, since the average occupation numbers can have at most the value 1. Even for bosons, this modiﬁcation is important only at T < Tc (v) and leads at T = 0 to the complete occupation of the p = 0 state, in agreement with the ground state which we described above.


We thus obtain for bosons, instead of (4.4.2a):

$$N = \frac{1}{z^{-1} - 1} + N\, \frac{v}{\lambda^3}\, g_{3/2}(z) , \tag{4.4.4}$$

or, using Eq. (4.4.3),

$$N = \frac{1}{z^{-1} - 1} + N \left(\frac{T}{T_c(v)}\right)^{3/2} \frac{g_{3/2}(z)}{g_{3/2}(1)} . \tag{4.4.4′}$$

The overall particle number $N$ is thus the sum of the number of particles in the ground state

$$N_0 = \frac{1}{z^{-1} - 1} \tag{4.4.5a}$$

and the number in the excited states

$$N' = N \left(\frac{T}{T_c(v)}\right)^{3/2} \frac{g_{3/2}(z)}{g_{3/2}(1)} . \tag{4.4.5b}$$

For $T > T_c(v)$, Eq. (4.4.4′) yields a value $z < 1$. The first term on the right-hand side of (4.4.4′) is therefore finite and can be neglected relative to $N$. Our initial considerations thus hold here; in particular, $z$ follows from

$$g_{3/2}(z) = 2.612 \left(\frac{T_c(v)}{T}\right)^{3/2} \quad \text{for } T > T_c(v) . \tag{4.4.5c}$$

For $T < T_c(v)$, from Eq. (4.4.4′), $z = 1 - O(1/N)$, so that all of the particles which are no longer in excited states can find sufficient 'space' to enter the ground state. When $z$ is so close to 1, we can set $z = 1$ in the second term and obtain

$$N_0 = N \left[ 1 - \left(\frac{T}{T_c(v)}\right)^{3/2} \right] .$$

Defining the condensate fraction in the thermodynamic limit by

$$\nu_0 = \lim_{\substack{N \to \infty \\ v\ \text{fixed}}} \frac{N_0}{N} , \tag{4.4.6}$$

we find in summary

$$\nu_0 = \begin{cases} 0 & T > T_c(v) \\ 1 - \left(\dfrac{T}{T_c(v)}\right)^{3/2} & T < T_c(v) . \end{cases} \tag{4.4.7}$$
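The two regimes can be made concrete with a short numerical sketch: above $T_c$ the fugacity is found by solving (4.4.5c), here by bisection (a choice made for this illustration), and below $T_c$ the condensate fraction follows from (4.4.7). The function names and the reduced temperature $t = T/T_c(v)$ are notation introduced for the example only.

```python
ZETA_3_2 = 2.6123753486854883  # g_{3/2}(1) = zeta(3/2)


def g32(z, rel_tol=1e-15, max_terms=200_000):
    """Series g_{3/2}(z) = sum_k z**k / k**1.5 for 0 <= z < 1."""
    total, z_power = 0.0, 1.0
    for k in range(1, max_terms + 1):
        z_power *= z
        term = z_power / k ** 1.5
        total += term
        if term <= rel_tol * total:
            break
    return total


def fugacity(t):
    """Fugacity z at reduced temperature t = T / T_c(v)."""
    if t <= 1.0:
        return 1.0  # z = 1 - O(1/N) in the thermodynamic limit
    target = ZETA_3_2 / t ** 1.5  # right-hand side of Eq. (4.4.5c)
    lo, hi = 0.0, 1.0
    for _ in range(60):  # g32 is monotonically increasing in z
        mid = 0.5 * (lo + hi)
        if g32(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def condensate_fraction(t):
    """Condensate fraction nu_0 of Eq. (4.4.7)."""
    if t < 0.0:
        raise ValueError("temperature must be non-negative")
    return 1.0 - t ** 1.5 if t < 1.0 else 0.0


for t in (0.5, 2.0, 100.0):
    print(t, fugacity(t), condensate_fraction(t))
```

Far above $T_c$ the fugacity approaches the classical value $2.612\, t^{-3/2}$. Very close to $T_c$ from above, the plain series converges slowly, so this sketch is only intended for $t$ not too near 1.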

This phenomenon is called the Bose–Einstein condensation. Below $T_c(v)$, the ground state $\mathbf p = 0$ is macroscopically occupied. The temperature dependence of $\nu_0$ and $\sqrt{\nu_0}$ is shown in Fig. 4.13. The quantities $\nu_0$ and $\sqrt{\nu_0}$ are characteristic of the condensation or the ordering of the system. For reasons which will become clear later, one refers to $\sqrt{\nu_0}$ as the order parameter. In the neighborhood of $T_c$, $\sqrt{\nu_0}$ goes to zero as

$$\sqrt{\nu_0} \propto \sqrt{T_c - T} . \tag{4.4.7′}$$

In Fig. 4.14, we show the transition temperature as a function of the specific volume. The higher the density (i.e. the smaller the specific volume), the higher the transition temperature $T_c(v)$ at which the Bose–Einstein condensation takes place.

Fig. 4.13. The relative number of particles in the condensate and its square root as functions of the temperature

Fig. 4.14. The transition temperature as a function of the specific volume

Remark: One might ask whether the next higher terms in the sum $\sum_{\mathbf p} n(\varepsilon_{\mathbf p})$ could not also be macroscopically occupied. The following estimate however shows that $n(\varepsilon_{\mathbf p}) \ll n(0)$ for $\mathbf p \neq 0$. Consider e.g. the momentum $\mathbf p = \left(\frac{2\pi\hbar}{L}, 0, 0\right)$, for which

$$\frac{1}{V}\, \frac{1}{e^{\beta p_1^2/2m} z^{-1} - 1} < \frac{1}{V}\, \frac{1}{e^{\beta p_1^2/2m} - 1} < \frac{1}{V}\, \frac{2m}{\beta p_1^2} \sim O(V^{-1/3})$$

holds, while

$$\frac{1}{V}\, \frac{1}{z^{-1} - 1} \sim O(1) .$$

There is no change in the grand potential compared to the integral representation (4.1.19′), since for the term with $\mathbf p = 0$ in the thermodynamic limit, it follows that

$$\lim_{V \to \infty} \frac{1}{V} \log(1 - z(V)) = \lim_{V \to \infty} \frac{1}{V} \log \frac{1}{V} = 0 .$$

Therefore, the pressure is given by (4.1.19′) as before, where $z$ for $T > T_c(v)$ follows from (4.4.5c), and for $T < T_c(v)$ it is given by $z = 1$. Thus finally the pressure of the ideal Bose gas is

$$P = \begin{cases} \dfrac{kT}{\lambda^3}\, g_{5/2}(z) & T > T_c \\[2mm] \dfrac{kT}{\lambda^3}\, 1.342 & T < T_c \end{cases} , \tag{4.4.8}$$


Fig. 4.15. The functions g3/2 (z) and g5/2 (z). In the limit z → 0, the functions become asymptotically identical, g3/2 (z) ≈ g5/2 (z) ≈ z.

Fig. 4.16. The equation of state of the ideal Bose gas. The isochores are shown for decreasing values of $v$. For $T < T_c(v)$, the pressure is $P = 1.342\, kT/\lambda^3$.

with $g_{5/2}(1) = \zeta(5/2) = 1.342$. If we insert $z$ from (4.4.4) here, we obtain the equation of state. For $T > T_c$, using (4.4.5c), we can write (4.4.8) in the form

$$P = \frac{kT}{v}\, \frac{g_{5/2}(z)}{g_{3/2}(z)} . \tag{4.4.9}$$

The functions $g_{5/2}(z)$ and $g_{3/2}(z)$ are drawn in Fig. 4.15. The shape of the equation of state can be qualitatively seen from them. For small values of $z$, $g_{5/2}(z) \approx g_{3/2}(z)$, so that for large $v$ and high $T$, we obtain again from (4.4.9) the classical equation of state (see Fig. 4.16). On approaching $T_c(v)$, it becomes increasingly noticeable that $g_{5/2}(z) < g_{3/2}(z)$. At $T_c(v)$, the isochores converge into the curve $P = 1.342\, kT/\lambda^3$, which represents the pressure for $T < T_c(v)$. Altogether, this leads to the equation of state corresponding to the isochores in Fig. 4.16. For the entropy, we find¹⁶

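$$S = \left(\frac{\partial PV}{\partial T}\right)_{V,\mu} = \begin{cases} Nk \left[ \dfrac{5}{2}\, \dfrac{v}{\lambda^3}\, g_{5/2}(z) - \log z \right] & T > T_c \\[2mm] Nk\, \dfrac{5}{2}\, \dfrac{g_{5/2}(1)}{g_{3/2}(1)} \left(\dfrac{T}{T_c}\right)^{3/2} & T < T_c \end{cases} , \tag{4.4.10}$$

¹⁶ Note that $\dfrac{d}{dz}\, g_\nu(z) = \dfrac{1}{z}\, g_{\nu-1}(z)$.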


Fig. 4.17. The heat capacity = N × the speciﬁc heat of an ideal Bose gas

Fig. 4.18. The chemical potential of the ideal Bose gas at a fixed density as a function of the temperature (vertical axis: $\mu/kT_c(v)$; horizontal axis: $T/T_c$)

and, after some calculation, we obtain for the heat capacity at constant volume

$$C_V = T \left(\frac{\partial S}{\partial T}\right)_{N,V} = Nk \begin{cases} \dfrac{15}{4}\, \dfrac{v}{\lambda^3}\, g_{5/2}(z) - \dfrac{9}{4}\, \dfrac{g_{3/2}(z)}{g_{1/2}(z)} & T > T_c \\[2mm] \dfrac{15}{4}\, \dfrac{g_{5/2}(1)}{g_{3/2}(1)} \left(\dfrac{T}{T_c}\right)^{3/2} & T < T_c . \end{cases} \tag{4.4.11}$$

The entropy and the specific heat vary as $T^{3/2}$ at low $T$. Only the excited states contribute to the entropy and the internal energy; the entropy of the condensate is zero. At $T = T_c$, the specific heat of the ideal Bose gas has a cusp (Fig. 4.17). From Eq. (4.4.4) or from Fig. 4.12, one can obtain the chemical potential, shown in Fig. 4.18 as a function of the temperature. At $T_\lambda = 2.18$ K, the so-called lambda point, $^4$He exhibits a phase transition into the superfluid state (see Fig. 4.19). If we could neglect the interactions of the helium atoms, the temperature of a Bose–Einstein condensation would be $T_c(v) = 3.14$ K, using the specific volume of helium in (4.4.3). The interactions are however very important, and it would be incorrect to identify the phase transition into the superfluid state with the Bose–Einstein


Fig. 4.19. The phase diagram of $^4$He (schematic). Below 2.18 K, a phase transition from the normal liquid He I phase into the superfluid He II phase takes place

Fig. 4.20. The experimental specific heat of $^4$He, showing the characteristic lambda anomaly

condensation treated above. The superfluid state in three-dimensional helium is indeed also created by a condensation (macroscopic occupation) of the $\mathbf p = 0$ state, but at $T = 0$, the fraction of condensate is only 8%. The specific heat (Fig. 4.20) exhibits a λ anomaly (which gives the transition its name), i.e. an approximately logarithmic singularity. The typical excitation spectrum and the hydrodynamic behavior as described by the two-fluid model are compatible only with an interacting Bose system (Sect. 4.7.1). Another Bose gas, which is more ideal than helium and in which one can likewise expect a Bose–Einstein condensation (one which has been extensively searched for experimentally), is atomic hydrogen in a strong magnetic field (the spin polarization of the hydrogen electrons prevents recombination to molecular H$_2$). Because of the difficulty of suppressing recombination of H to H$_2$, it proved impossible for many years to prepare atomic hydrogen at a sufficient density. The development of atom traps has recently permitted remarkable progress in this area. The Bose–Einstein condensation was first observed, 70 years after its original prediction, in a gas consisting of around 2000 spin-polarized $^{87}$Rb atoms, which were enclosed in a quadrupole trap.¹⁷,¹⁸ The transition temperature is $170 \times 10^{-9}$ K. One might at first raise the objection that at low temperatures the alkali atoms should form a solid; however, a metastable gaseous state can be maintained within the trap even at temperatures in the nanokelvin range. In the initial experiments, the condensed state could be kept for about ten seconds. Similar results were obtained with a gas consisting of $2 \times 10^5$

¹⁷ M. H. Anderson, J. R. Ensher, M. R. Matthews, C. E. Wieman, and E. A. Cornell, Science 269, 198 (1995).

¹⁸ See also G. P. Collins, Physics Today, August 1995, 17.


spin-polarized $^7$Li atoms.¹⁹ In this case, the condensation temperature is $T_c \approx 400 \times 10^{-9}$ K. In $^{87}$Rb, the s-wave scattering length is positive, while in $^7$Li, it is negative. However, even in $^7$Li, the gas phase does not collapse into a condensed phase, in any case not within the spatially inhomogeneous atom trap.¹⁹ Finally, it also proved possible to produce and maintain a condensate containing more than $10^8$ atoms of atomic hydrogen, with a transition temperature of about 50 µK, for up to 5 seconds.²⁰

4.5 The Photon Gas

4.5.1 Properties of Photons

We next want to determine the thermal properties of the radiation field. To start with, we list some of the characteristic properties of photons.

(i) Photons obey the dispersion relation $\varepsilon_{\mathbf p} = c|\mathbf p| = \hbar c k$ and are bosons with a spin $s = 1$. Since they are completely relativistic particles ($m = 0$, $v = c$), their spins have only two possible orientations, i.e. parallel or antiparallel to $\mathbf p$, corresponding to right-hand or left-hand circularly polarized light (0 and π are the only angles which are Lorentz invariant). The degeneracy factor for photons is therefore $g = 2$.

(ii) The mutual interactions of photons are practically zero, as one can see from the following argument: to lowest order, the interaction consists of the scattering of two photons $\gamma_1$ and $\gamma_2$ into the final states $\gamma_3$ and $\gamma_4$; see Fig. 4.21a. In this process, for example, photon $\gamma_1$ decays into a virtual electron-positron pair, photon $\gamma_2$ is absorbed by the positron, the electron emits photon $\gamma_3$ and recombines with the positron to give photon $\gamma_4$. The scattering cross-section for this process is extremely small, of order $\sigma \approx 10^{-50}$ cm$^2$. The mean collision time can be calculated from the scattering cross-section as follows: in the time $\Delta t$, a photon traverses the distance $c\Delta t$. We thus consider the cylinder shown in Fig. 4.21b, whose basal area is equal to the scattering cross-section and whose length is the velocity of light × $\Delta t$. A photon interacts within the time $\Delta t$ with all other photons which are in the volume $c\,\sigma\,\Delta t$, roughly speaking. Let $N$ be the total number of photons within the volume $V$ (which depends on the temperature and which we still have to determine; see the end of Sect. 4.5.4). Then a photon interacts with $c\,\sigma N/V$ particles per unit time. Thus the mean collision time (time between two collisions on average) $\tau$ is determined by

$$\tau = \frac{V/N}{c\,\sigma} = 10^{40}\, \frac{\text{sec}}{\text{cm}^3}\, \frac{V}{N} .$$

¹⁹ C. C. Bradley, C. A. Sackett, J. J. Tollett, and R. G. Hulet, Phys. Rev. Lett. 75, 1687 (1995).

²⁰ D. Kleppner, Th. Greytak et al., Phys. Rev. Lett. 81, 3811 (1998).


Fig. 4.21. (a) Photon-photon scattering (dashed lines: photons; solid lines: electron and positron). (b) Scattering cross-section and mean collision time

The value of the mean collision time is approximately $\tau \approx 10^{31}$ sec at room temperature and $\tau \approx 10^{18}$ sec at the temperature of the Sun's interior ($10^7$ K). Even at the temperature in the center of the Sun, the interaction of the photons is negligible. In comparison, the age of the Universe is $\sim 10^{17}$ sec. Photons do indeed constitute an ideal quantum gas. The interaction with the surrounding matter is crucial in order to establish equilibrium within the radiation field. The establishment of equilibrium in the photon gas is brought about by absorption and emission of photons by matter. In the following, we will investigate the radiation field within a cavity of volume $V$ and temperature $T$, and without loss of generality of our considerations, we take the quantization volume to be cubical in shape (the shape is irrelevant for short wavelengths, and the long waves have a low statistical weight).

(iii) The number of photons is not conserved. Photons are emitted and absorbed by the material of the cavity walls. From the quantum-field description of photons it follows that each wavenumber and polarization direction corresponds to a harmonic oscillator. The Hamiltonian thus has the form

$$H = \sum_{\mathbf p,\lambda} \varepsilon_{\mathbf p}\, \hat n_{\mathbf p,\lambda} \equiv \sum_{\mathbf p,\lambda} \varepsilon_{\mathbf p}\, a^\dagger_{\mathbf p,\lambda} a_{\mathbf p,\lambda} , \quad \mathbf p \neq 0 , \tag{4.5.1}$$

where $\hat n_{\mathbf p,\lambda} = a^\dagger_{\mathbf p,\lambda} a_{\mathbf p,\lambda}$ is the occupation number operator for the momentum $\mathbf p$ and the direction of polarization $\lambda$; also, $a^\dagger_{\mathbf p,\lambda}$, $a_{\mathbf p,\lambda}$ are the creation and annihilation operators for a photon in the state $\mathbf p, \lambda$. We note that in the Hamiltonian of the radiation field, there is no zero-point energy, which is automatically accomplished in quantum field theory by defining the Hamiltonian in terms of normal-ordered products.²¹

²¹ C. Itzykson, J.-B. Zuber, Quantum Field Theory, McGraw-Hill; see also QM II.


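4.5.2 The Canonical Partition Function

The canonical partition function is given by ($n_{\mathbf p,\lambda} = 0, 1, 2, \ldots$):

$$Z = \mathrm{Tr}\, e^{-\beta H} = \sum_{\{n_{\mathbf p,\lambda}\}} e^{-\beta \sum_{\mathbf p,\lambda} \varepsilon_{\mathbf p} n_{\mathbf p,\lambda}} = \left[ \prod_{\mathbf p \neq 0} \frac{1}{1 - e^{-\beta \varepsilon_{\mathbf p}}} \right]^2 . \tag{4.5.2}$$

Here, there is no condition on the number of photons, since it is not fixed. In (4.5.2), the power 2 enters due to the two possible polarizations $\lambda$. With this expression, we find for the free energy

$$F(T, V) = -kT \log Z = 2kT \sum_{\mathbf p \neq 0} \log\left(1 - e^{-\varepsilon_{\mathbf p}/kT}\right) = \frac{2V}{\beta} \int \frac{d^3p}{(2\pi\hbar)^3} \log\left(1 - e^{-\beta\varepsilon_{\mathbf p}}\right) = \frac{V (kT)^4}{\pi^2 (\hbar c)^3} \int_0^\infty dx\, x^2 \log(1 - e^{-x}) . \tag{4.5.3}$$

The sum has been converted to an integral according to (4.1.14a). For the integral in (4.5.3), we find after integration by parts

$$\int_0^\infty dx\, x^2 \log(1 - e^{-x}) = -\frac{1}{3} \int_0^\infty \frac{dx\, x^3}{e^x - 1} = -2 \sum_{n=1}^\infty \frac{1}{n^4} \equiv -2\zeta(4) = -\frac{\pi^4}{45} ,$$

where $\zeta(n)$ is Riemann's ζ-function (Eqs. (D.2) and (D.3)), so that for $F$, we have finally

$$F(T, V) = -\frac{V (kT)^4}{(\hbar c)^3}\, \frac{\pi^2}{45} = -\frac{4\sigma}{3c}\, V T^4 \tag{4.5.4}$$

with the Stefan–Boltzmann constant

$$\sigma \equiv \frac{\pi^2 k^4}{60 \hbar^3 c^2} = 5.67 \times 10^{-8}\ \text{J sec}^{-1}\, \text{m}^{-2}\, \text{K}^{-4} . \tag{4.5.5}$$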

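From (4.5.4), we obtain the entropy:

$$S = -\left(\frac{\partial F}{\partial T}\right)_V = \frac{16\sigma}{3c}\, V T^3 , \tag{4.5.6a}$$

the internal energy (caloric equation of state)

$$E = F + TS = \frac{4\sigma}{c}\, V T^4 , \tag{4.5.6b}$$

and the pressure (thermal equation of state)

$$P = -\left(\frac{\partial F}{\partial V}\right)_T = \frac{4\sigma}{3c}\, T^4 , \tag{4.5.6c}$$

and finally the heat capacity

$$C_V = T \left(\frac{\partial S}{\partial T}\right)_V = \frac{16\sigma}{c}\, V T^3 . \tag{4.5.7}$$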
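The relations (4.5.5) to (4.5.7) can be checked numerically. The sketch below computes the Stefan–Boltzmann constant from the fundamental constants and evaluates the densities per unit volume; the function names and the two sample temperatures ($10^5$ K and $10^7$ K) are choices made for this illustration.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J s
K_B = 1.380649e-23      # Boltzmann constant in J/K
C = 2.99792458e8        # speed of light in m/s

# Stefan-Boltzmann constant, Eq. (4.5.5), in J s^-1 m^-2 K^-4.
SIGMA = math.pi ** 2 * K_B ** 4 / (60.0 * HBAR ** 3 * C ** 2)


def entropy_density(temp):
    """S/V from Eq. (4.5.6a), in J K^-1 m^-3."""
    return 16.0 * SIGMA * temp ** 3 / (3.0 * C)


def energy_density(temp):
    """E/V from Eq. (4.5.6b), in J m^-3."""
    return 4.0 * SIGMA * temp ** 4 / C


def pressure(temp):
    """Radiation pressure from Eq. (4.5.6c), in Pa."""
    return 4.0 * SIGMA * temp ** 4 / (3.0 * C)


def heat_capacity_density(temp):
    """C_V/V from Eq. (4.5.7), in J K^-1 m^-3."""
    return 16.0 * SIGMA * temp ** 3 / C


print(SIGMA)
for temp in (1e5, 1e7):
    # 1 bar = 1e5 Pa
    print(temp, pressure(temp) / 1e5, "bar")
```

The ratio `energy_density(T) / pressure(T)` equals 3 at every temperature, and `energy_density(T) - T * entropy_density(T)` reproduces the free energy density $F/V = -P$.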

Because of the relativistic dispersion, for photons $E = 3PV$ holds instead of $E = \frac{3}{2} PV$. Eq. (4.5.6b) is called the Stefan–Boltzmann law: the internal energy of the radiation field increases as the fourth power of the temperature. The radiation pressure (4.5.6c) is very low, except at extremely high temperatures. At $10^5$ K, the temperature produced by a nuclear explosion, it is $P = 0.25$ bar, and at $10^7$ K, the Sun's central temperature, it is $P = 25 \times 10^6$ bar.

4.5.3 Planck's Radiation Law

We now wish to discuss some of the characteristics of the radiation field. The average occupation number of the state ($\mathbf p, \lambda$) is given by

$$\langle n_{\mathbf p,\lambda} \rangle = \frac{1}{e^{\varepsilon_{\mathbf p}/kT} - 1} \tag{4.5.8a}$$

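with $\varepsilon_{\mathbf p} = \hbar\omega_{\mathbf p} = cp$, since

$$\langle n_{\mathbf p,\lambda} \rangle \equiv \frac{\mathrm{Tr}\, e^{-\beta H} \hat n_{\mathbf p,\lambda}}{\mathrm{Tr}\, e^{-\beta H}} = \frac{\sum_{n_{\mathbf p,\lambda}=0}^\infty n_{\mathbf p,\lambda}\, e^{-n_{\mathbf p,\lambda} \varepsilon_{\mathbf p}/kT}}{\sum_{n_{\mathbf p,\lambda}=0}^\infty e^{-n_{\mathbf p,\lambda} \varepsilon_{\mathbf p}/kT}}$$

can be evaluated analogously to Eq. (4.1.9). The average occupation number (4.5.8a) corresponds to that of atomic or molecular free bosons, Eq. (4.1.9), with $\mu = 0$. The number of occupied states in a differential element $d^3p$ within a fixed volume is therefore (see (4.1.14a)):

$$\langle n_{\mathbf p,\lambda} \rangle\, \frac{2V}{(2\pi\hbar)^3}\, d^3p , \tag{4.5.8b}$$

and in the interval $[p, p + dp]$, it is

$$\langle n_{\mathbf p,\lambda} \rangle\, \frac{V}{\pi^2 \hbar^3}\, p^2\, dp . \tag{4.5.8c}$$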


It follows from this that the number of occupied states in the interval $[\omega, \omega + d\omega]$ is equal to

$$\frac{V}{\pi^2 c^3}\, \frac{\omega^2\, d\omega}{e^{\hbar\omega/kT} - 1} . \tag{4.5.8d}$$

The spectral energy density $u(\omega)$ is defined as the energy per unit volume and frequency, i.e. as the product of (4.5.8d) with $\hbar\omega/V$:

$$u(\omega) = \frac{\hbar}{\pi^2 c^3}\, \frac{\omega^3}{e^{\hbar\omega/kT} - 1} . \tag{4.5.9}$$

This is the famous Planck radiation law (1900), which initiated the development of quantum mechanics. We now want to discuss these results in detail. The occupation number (4.5.8a) for photons diverges for $p \to 0$ as $1/p$ (see Fig. 4.22), since the energy of the photons goes to zero when $p \to 0$. Because the density of states in three dimensions is proportional to $\omega^2$, this divergence is irrelevant to the energy content of the radiation field. The spectral energy density is shown in Fig. 4.22.

Fig. 4.22. The photon number as a function of $\hbar\omega/kT$ (dot-dashed curve). The spectral energy density as a function of $\hbar\omega/kT$ (solid curve).

As a function of $\omega$, it shows a maximum at

$$\hbar\omega_{\max} = 2.82\, kT , \tag{4.5.10}$$

i.e. around three times the thermal energy. The maximum shifts proportionally to the temperature. Equation (4.5.10), Wien's displacement law (1893), played an important role in the historical development of the theory of the radiation field, leading to the discovery of Planck's quantum of action. In Fig. 4.23, we show $u(\omega, T)$ for different temperatures $T$.
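The numerical factor 2.82 follows from setting the derivative of $x^3/(e^x - 1)$, with $x = \hbar\omega/kT$, equal to zero, which gives the transcendental equation $3(1 - e^{-x}) = x$. A minimal sketch that solves it by fixed-point iteration (the iteration scheme and the function names are choices made for this example):

```python
import math


def planck_shape(x):
    """Dimensionless spectral shape x**3 / (exp(x) - 1), x = hbar*omega/kT."""
    return x ** 3 / math.expm1(x)


def wien_root(start=3.0, iterations=100):
    """Nonzero solution of 3 * (1 - exp(-x)) = x by fixed-point iteration."""
    x = start
    for _ in range(iterations):
        x = 3.0 * (1.0 - math.exp(-x))
    return x


x_max = wien_root()
print(x_max)
# The spectral shape is lower on both sides of the root.
print(planck_shape(x_max - 0.01), planck_shape(x_max), planck_shape(x_max + 0.01))
```

The iteration converges quickly because the slope of the right-hand map, $3e^{-x}$, is well below 1 near the root.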


Fig. 4.23. Planck’s law for three temperatures, T1 < T2 < T3

We now consider the limiting cases of Planck's radiation law:

(i) $\hbar\omega \ll kT$: for low frequencies, we find using (4.5.9) that

$$u(\omega) = \frac{kT \omega^2}{\pi^2 c^3} ; \tag{4.5.11}$$

the Rayleigh–Jeans radiation law. This is the classical low-energy limit. This result of classical physics represented one of the principal problems in the theory of the radiation field. Aside from the fact that it agreed with experiment only for very low frequencies, it was also fundamentally unacceptable: according to (4.5.11), in the high-frequency limit $\omega \to \infty$, it leads to a divergence in $u(\omega)$, the so-called ultraviolet catastrophe. This would in turn imply an infinite energy content of the cavity radiation, $\int_0^\infty d\omega\, u(\omega) = \infty$.

(ii) $\hbar\omega \gg kT$: in the high-frequency limit, we find from (4.5.9) that

$$u(\omega) = \frac{\hbar\omega^3}{\pi^2 c^3}\, e^{-\hbar\omega/kT} . \tag{4.5.12}$$

The energy density decreases exponentially with increasing frequency. This empirically derived relation is known as Wien's law. In his first derivation, Planck farsightedly obtained (4.5.9) by interpolating the corresponding entropies between equations (4.5.11) and (4.5.12).

Often, the energy density is expressed in terms of the wavelength $\lambda$: starting from $\omega = ck = \frac{2\pi c}{\lambda}$, we obtain $d\omega = -\frac{2\pi c}{\lambda^2}\, d\lambda$. Therefore, the energy per unit volume in the interval $[\lambda, \lambda + d\lambda]$ is given by

$$\frac{dE_\lambda}{V} = u\left(\omega = \frac{2\pi c}{\lambda}\right) \left|\frac{d\omega}{d\lambda}\right| d\lambda = \frac{16\pi^2 \hbar c\; d\lambda}{\lambda^5 \left( e^{2\pi\hbar c/(kT\lambda)} - 1 \right)} , \tag{4.5.13}$$

where we have inserted (4.5.9). The energy density as a function of the wavelength, $\frac{dE_\lambda}{d\lambda}$, has its maximum at the value $\lambda_{\max}$, determined by

$$\frac{2\pi\hbar c}{kT \lambda_{\max}} = 4.965 . \tag{4.5.14}$$


We will now calculate the radiation which emerges from an opening in the cavity at the temperature $T$. To do this, we first note that the radiation within the cavity is completely isotropic. The emitted thermal radiation at a frequency $\omega$ into a solid angle $d\Omega$ is therefore $u(\omega)\, \frac{d\Omega}{4\pi}$. The radiation energy which emerges per unit time onto a unit surface is

$$I(\omega, T) = \frac{1}{4\pi} \int d\Omega\; c\, u(\omega) \cos\vartheta = \frac{1}{4\pi} \int_0^{2\pi} d\varphi \int_0^1 d\eta\, \eta\; c\, u(\omega) = \frac{c\, u(\omega)}{4} . \tag{4.5.15}$$

The integration over the solid angle $d\Omega$ extends over only one hemisphere (see Fig. 4.24). The total radiated power per unit surface (the energy flux) is then

$$I_E(T) = \int d\omega\; I(\omega, T) = \sigma T^4 , \tag{4.5.16}$$

where again the Stefan–Boltzmann constant $\sigma$ from Eq. (4.5.5) enters the expression.
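The step from (4.5.15) to (4.5.16) rests on $\int_0^\infty dx\, x^3/(e^x - 1) = \pi^4/15$. The following sketch checks this numerically with a composite Simpson rule; the integration routine, the upper cutoff $x = 60$, and the function names are assumptions of this illustration.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J s
K_B = 1.380649e-23      # Boltzmann constant in J/K
C = 2.99792458e8        # speed of light in m/s


def planck_integrand(x):
    """x**3 / (exp(x) - 1), continued by its limit 0 at x = 0."""
    return 0.0 if x == 0.0 else x ** 3 / math.expm1(x)


def simpson(func, a, b, n):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0


def radiated_flux(temp):
    """Energy flux I_E(T): frequency integral of c*u(omega)/4, in W/m^2."""
    # The integrand is negligible beyond x = 60, so the cutoff is harmless.
    integral = simpson(planck_integrand, 0.0, 60.0, 6000)
    prefactor = (K_B * temp) ** 4 / (4.0 * math.pi ** 2 * C ** 2 * HBAR ** 3)
    return prefactor * integral


sigma = math.pi ** 2 * K_B ** 4 / (60.0 * HBAR ** 3 * C ** 2)
print(radiated_flux(300.0), sigma * 300.0 ** 4)
```

Both printed numbers agree, which confirms that the numerically integrated flux reproduces $\sigma T^4$.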

Fig. 4.24. The radiation emission per unit surface area from a cavity radiator (black body)

A body which completely absorbs all the radiation falling upon it is called a black body. A small opening in the wall of a cavity whose walls are good absorbers is the ideal realization of a black body. The emission from such an opening calculated above is thus the radiation emitted by a black body. As an approximation, Eqs. (4.5.15) and (4.5.16) are also used to describe the radiation from celestial bodies. Remark: The Universe is pervaded by the so-called cosmic background radiation discovered by Penzias and Wilson, which corresponds according to Planck's law to a temperature of 2.73 K. It is a remainder from the earliest times of the Universe, around 300,000 years after the Big Bang, when the temperature of the cosmos had


already cooled to about 3000 K. Previous to this time, the radiation was in thermal equilibrium with the matter. At temperatures of 3000 K and below, the electrons bind to atomic nuclei to form atoms, so that the cosmos became transparent to this radiation and it was practically decoupled from the matter in the Universe. The expansion of the Universe by a factor of about one thousand then led to a corresponding increase of all wavelengths due to the red shift, and thus to a Planck distribution at an effective temperature of 2.73 K.

*4.5.4 Supplemental Remarks

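Let us now interpret the properties of the photon gas in a physical sense and compare it with other gases. The mean photon number is given by

$$N = 2 \sum_{\mathbf p} \frac{1}{e^{cp/kT} - 1} = \frac{V}{\pi^2 c^3} \int_0^\infty \frac{d\omega\; \omega^2}{e^{\hbar\omega/kT} - 1} = \frac{V (kT)^3}{\pi^2 \hbar^3 c^3} \int_0^\infty \frac{dx\; x^2}{e^x - 1} = \frac{2\zeta(3)}{\pi^2}\, V \left(\frac{kT}{\hbar c}\right)^3 ,$$

where the value $\mathbf p = 0$ is excluded in $\sum_{\mathbf p}$. Inserting $\zeta(3)$, we obtain

$$N = 0.244\, V \left(\frac{kT}{\hbar c}\right)^3 . \tag{4.5.17}$$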

Combining this with (4.5.6c) and (4.5.6a) and inserting approximate numerical values shows a formal similarity to the classical ideal gas:

$$PV = 0.9\, N kT \tag{4.5.18}$$

$$S = 3.6\, N k , \tag{4.5.19}$$

where $N$ is however always given by (4.5.17) and does not have a fixed value. The pressure per particle is of about the same order of magnitude as in the classical ideal gas. The thermal wavelength of the photon gas is found to be

$$\lambda_T = \frac{2\pi}{k_{\max}} = \frac{2\pi\hbar c}{2.82\, kT} = \frac{0.510}{T[\text{K}]}\ [\text{cm}] . \tag{4.5.20}$$

With the numerical factor 0.510, $\lambda_T$ is obtained in units of cm. Inserting into (4.5.17), we find

$$N = 0.244 \left(\frac{2\pi}{2.82}\right)^3 \frac{V}{\lambda_T^3} = 2.70\, \frac{V}{\lambda_T^3} . \tag{4.5.21}$$

For the classical ideal gas, $\frac{V}{N\lambda_T^3} \gg 1$; in contrast, the average spacing of the photons $(V/N)^{1/3}$ is, from (4.5.21), of the order of magnitude of $\lambda_T$, and therefore, they must be treated quantum mechanically.


At room temperature, i.e. $T = 300$ K, $\lambda_T = 1.7 \times 10^{-3}$ cm and the density is $\frac{N}{V} = 5.5 \times 10^8$ cm$^{-3}$. At the temperature of the interior of the Sun, i.e. $T \approx 10^7$ K, $\lambda_T = 5.1 \times 10^{-8}$ cm and the density is $\frac{N}{V} = 2.0 \times 10^{22}$ cm$^{-3}$. In comparison, the wavelength of visible light is in the range $\lambda = 10^{-4}$ cm.

Note: If the photon had a finite rest mass $m$, then we would have $g = 3$. In that case, a factor of $\frac{3}{2}$ would enter the Stefan–Boltzmann law. The experimentally demonstrated validity of the Stefan–Boltzmann law implies that either $m = 0$, or that the longitudinal photons do not couple to matter.
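The numerical values quoted for room temperature and for the solar interior follow directly from (4.5.17) and (4.5.20). A minimal sketch for reproducing them (the function names and the SI-to-cgs conversions are choices made for this example):

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant in J s
K_B = 1.380649e-23          # Boltzmann constant in J/K
C = 2.99792458e8            # speed of light in m/s
ZETA_3 = 1.2020569031595942  # Riemann zeta(3)


def photon_density(temp):
    """Mean photon density N/V from Eq. (4.5.17), in m^-3."""
    return 2.0 * ZETA_3 / math.pi ** 2 * (K_B * temp / (HBAR * C)) ** 3


def thermal_wavelength(temp):
    """Thermal wavelength lambda_T from Eq. (4.5.20), in m."""
    return 2.0 * math.pi * HBAR * C / (2.82 * K_B * temp)


for temp in (300.0, 1e7):
    density_cm3 = photon_density(temp) * 1e-6    # m^-3 -> cm^-3
    wavelength_cm = thermal_wavelength(temp) * 1e2  # m -> cm
    # Photons per thermal volume, the factor 2.70 in Eq. (4.5.21).
    per_thermal_volume = photon_density(temp) * thermal_wavelength(temp) ** 3
    print(temp, density_cm3, wavelength_cm, per_thermal_volume)
```

The last printed column is temperature independent, as (4.5.21) requires.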

The chemical potential: The chemical potential of the photon gas can be computed from the Gibbs–Duhem relation $E = TS - PV + \mu N$, since we are dealing with a homogeneous system:

$$\mu = \frac{1}{N} (E - TS + PV) = \frac{1}{N} \left( 4 - \frac{16}{3} + \frac{4}{3} \right) \frac{\sigma V T^4}{c} \equiv 0 . \tag{4.5.22}$$

The chemical potential of the photon gas is identical to 0 for all temperatures, because the number of photons is not fixed, but rather adjusts itself to the temperature and the volume. Photons are absorbed and emitted by the surrounding matter, the walls of the cavity. In general, the chemical potential of particles and quasiparticles such as phonons, whose particle numbers are not subject to a conservation law, is zero. For example, we consider the free energy of a fictitious constant number of photons (phonons etc.), $F(T, V, N_{\text{Ph}})$. Since the number of photons (phonons) is not fixed, it will adjust itself in such a way that the free energy is minimized, $\left(\frac{\partial F}{\partial N_{\text{Ph}}}\right)_{T,V} = 0$. This is however just the expression for the chemical potential, which therefore vanishes: $\mu = 0$. We could have just as well started from the maximization of the entropy, $\left(\frac{\partial S}{\partial N_{\text{Ph}}}\right)_{E,V} = -\frac{\mu}{T} = 0$.

*4.5.5 Fluctuations in the Particle Number of Fermions and Bosons

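Now that we have become acquainted with the statistical properties of various quantum gases, that is of fermions and bosons (including photons, whose particle-number distribution is characterized by $\mu = 0$), we want to investigate the fluctuations of their particle numbers. For this purpose, we begin with the grand potential

$$\Phi = -\beta^{-1} \log \sum_{\{n_{\mathbf p}\}} e^{-\beta \sum_{\mathbf p} n_{\mathbf p} (\varepsilon_{\mathbf p} - \mu)} . \tag{4.5.23}$$

Taking the derivative of $\Phi$ with respect to $\varepsilon_{\mathbf q}$ yields the mean value of $n_{\mathbf q}$:

$$\frac{\partial \Phi}{\partial \varepsilon_{\mathbf q}} = \frac{\sum_{\{n_{\mathbf p}\}} n_{\mathbf q}\, e^{-\beta \sum_{\mathbf p} n_{\mathbf p} (\varepsilon_{\mathbf p} - \mu)}}{\sum_{\{n_{\mathbf p}\}} e^{-\beta \sum_{\mathbf p} n_{\mathbf p} (\varepsilon_{\mathbf p} - \mu)}} = \langle n_{\mathbf q} \rangle . \tag{4.5.24}$$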


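The second derivative of $\Phi$ yields the mean square deviation

$$\frac{\partial^2 \Phi}{\partial \varepsilon_{\mathbf q}^2} = -\beta \left( \langle n_{\mathbf q}^2 \rangle - \langle n_{\mathbf q} \rangle^2 \right) \equiv -\beta (\Delta n_{\mathbf q})^2 . \tag{4.5.25}$$

Thus, using $\frac{e^x}{e^x \mp 1} = 1 \pm \frac{1}{e^x \mp 1}$, we obtain

$$(\Delta n_{\mathbf q})^2 = -\beta^{-1} \frac{\partial \langle n_{\mathbf q} \rangle}{\partial \varepsilon_{\mathbf q}} = \frac{e^{\beta(\varepsilon_{\mathbf q} - \mu)}}{\left( e^{\beta(\varepsilon_{\mathbf q} - \mu)} \mp 1 \right)^2} = \langle n_{\mathbf q} \rangle \left( 1 \pm \langle n_{\mathbf q} \rangle \right) . \tag{4.5.26}$$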

For fermions, the mean square deviation is always small. In the range of occupied states, where $\langle n_{\mathbf q} \rangle = 1$, $\Delta n_{\mathbf q}$ is zero; and in the region of small $\langle n_{\mathbf q} \rangle$, $\Delta n_{\mathbf q} \approx \langle n_{\mathbf q} \rangle^{1/2}$.

Remark: For bosons, the fluctuations can become very large. In the case of large occupation numbers, we have $\Delta n_{\mathbf q} \sim \langle n_{\mathbf q} \rangle$ and the relative deviation approaches one. This is a consequence of the tendency of bosons to cluster in the same state. These strong fluctuations are also found in a spatial sense. If $N$ bosons are enclosed in a volume of $L^3$, then the mean number of bosons in a subvolume $a^3$ is given by $\bar n = N a^3/L^3$. In the case that $a \ll \lambda$, where $\lambda$ is the extent of the wavefunctions of the bosons, one finds the mean square deviation of the particle number $(\Delta N_{a^3})^2$ within the subvolume to be²²

$$(\Delta N_{a^3})^2 = \bar n (\bar n + 1) .$$

For comparison, we recall the quite different behavior of classical particles, which obey a Poisson distribution (see Sect. 1.5.1). The probability of finding $n$ particles in the subvolume $a^3$ for $a/L \ll 1$ and $N \to \infty$ is then

$$P_n = e^{-\bar n}\, \frac{\bar n^n}{n!}$$

with $\bar n = N a^3/L^3$, from which it follows that

$$(\Delta n)^2 = \langle n^2 \rangle - \bar n^2 = \sum_n P_n n^2 - \bar n^2 = \bar n .$$

The deviations of the counting rates of bosons from the Poisson law have been experimentally veriﬁed using intense photon beams.23
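The relation (4.5.26) between the fluctuations and the energy derivative of the mean occupation number can be verified numerically. The sketch below compares a central finite difference of $\langle n_{\mathbf q} \rangle$ with the closed form $\langle n_{\mathbf q} \rangle (1 \pm \langle n_{\mathbf q} \rangle)$; the parameter `sign` (−1 for bosons, +1 for fermions), the step size, and the function names are conventions introduced for this example only.

```python
import math


def occupation(eps, mu, kt, sign):
    """Mean occupation number; sign = -1 for bosons, +1 for fermions."""
    return 1.0 / (math.exp((eps - mu) / kt) + sign)


def variance_from_derivative(eps, mu, kt, sign, step=1e-6):
    """(Delta n)^2 = -kT * d<n>/d(eps), by a central finite difference."""
    slope = (occupation(eps + step, mu, kt, sign)
             - occupation(eps - step, mu, kt, sign)) / (2.0 * step)
    return -kt * slope


def variance_closed_form(eps, mu, kt, sign):
    """<n>(1 + <n>) for bosons and <n>(1 - <n>) for fermions."""
    n = occupation(eps, mu, kt, sign)
    return n * (1.0 - sign * n)


def poisson_variance(mean):
    """Classical particles: the variance equals the mean."""
    return mean


for label, sign, mu in (("bosons", -1, 0.0), ("fermions", +1, 2.0)):
    eps, kt = 1.0, 1.0
    print(label,
          variance_from_derivative(eps, mu, kt, sign),
          variance_closed_form(eps, mu, kt, sign),
          poisson_variance(occupation(eps, mu, kt, sign)))
```

For the same mean occupation, the boson variance lies above the Poisson value and the fermion variance below it.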

4.6 Phonons in Solids

4.6.1 The Harmonic Hamiltonian

We recall the mechanics of a linear chain consisting of $N$ particles of mass $m$ which are coupled to their nearest neighbors by springs of force constant $f$. In the harmonic approximation, its Hamilton function takes on the form

$$H = W_0 + \sum_n \left[ \frac{m}{2}\, \dot u_n^2 + \frac{f}{2}\, (u_n - u_{n-1})^2 \right] . \tag{4.6.1}$$

²² A detailed discussion of the tendency of bosons to cluster in regions where their wavefunctions overlap may be found in E. M. Henley and W. Thirring, Elementary Quantum Field Theory, McGraw Hill, New York 1962, p. 52ff.

²³ R. Hanbury Brown and R. Q. Twiss, Nature 177, 27 (1956).

One obtains expression (4.6.1) by starting from the Hamilton function of $N$ particles whose positions are denoted by $x_n$. Their equilibrium positions are $x_n^0$, where for an infinite chain or a finite chain with periodic boundary conditions, the equilibrium positions have exact translational invariance and the distance between neighboring equilibrium positions is given by the lattice constant $a = x_{n+1}^0 - x_n^0$. One then introduces the displacements from the equilibrium positions, $u_n = x_n - x_n^0$, and expands in terms of the $u_n$. The quantity $W_0$ is given by the value of the overall potential energy $W(\{x_n\})$ of the chain in the equilibrium positions. Applying the canonical transformation

$$u_n = \frac{1}{\sqrt{Nm}} \sum_k e^{ikan}\, Q_k , \qquad m\dot u_n = \sqrt{\frac{m}{N}} \sum_k e^{-ikan}\, P_k , \qquad (4.6.2)$$

we can transform $H$ into a sum of uncoupled harmonic oscillators

$$H = W_0 + \sum_k \frac{1}{2}\left( P_k P_{-k} + \omega_k^2\, Q_k Q_{-k} \right) , \qquad (4.6.1')$$

where the frequencies are related to the wavenumber via

$$\omega_k = 2\sqrt{\frac{f}{m}}\;\left|\sin\frac{ka}{2}\right| . \qquad (4.6.3)$$
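The dispersion relation (4.6.3) can be explored with a few lines of code. This is a minimal sketch in dimensionless units; the default values $f = m = a = 1$ and the function names are assumptions made for illustration. It evaluates $\omega_k$ and compares the small-$k$ slope with the sound velocity $a\sqrt{f/m}$ that follows from expanding the sine.

```python
import math

def omega(k, f=1.0, m=1.0, a=1.0):
    """Phonon frequency of the linear chain, Eq. (4.6.3)."""
    return 2.0 * math.sqrt(f / m) * abs(math.sin(0.5 * k * a))

def sound_velocity(f=1.0, m=1.0, a=1.0):
    """Slope of omega(k) at small k, obtained from sin(x) ~ x."""
    return a * math.sqrt(f / m)

if __name__ == "__main__":
    for k in (1e-3, 0.5, 1.0, math.pi):
        print(k, omega(k), sound_velocity() * k)
```

The frequency is linear in $k$ near the zone center and reaches its maximum $2\sqrt{f/m}$ at the zone boundary $k = \pi/a$.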

The $Q_k$ are called normal coordinates and the $P_k$ normal momenta. The $Q_k$ and $P_k$ are conjugate variables, which we will take to be quantum-mechanical operators in what follows. In the quantum representation, commutation rules hold:

$$[u_n, m\dot u_{n'}] = i\hbar\,\delta_{nn'} , \qquad [u_n, u_{n'}] = [m\dot u_n, m\dot u_{n'}] = 0 ,$$

which in turn imply that

$$[Q_k, P_{k'}] = i\hbar\,\delta_{kk'} , \qquad [Q_k, Q_{k'}] = [P_k, P_{k'}] = 0 ;$$

furthermore, we have $Q_k^\dagger = Q_{-k}$ and $P_k^\dagger = P_{-k}$. Finally, by introducing the creation and annihilation operators

$$Q_k = \sqrt{\frac{\hbar}{2\omega_k}}\left( a_k + a_{-k}^\dagger \right) , \qquad P_k = -i\sqrt{\frac{\hbar\omega_k}{2}}\left( a_{-k} - a_k^\dagger \right) , \qquad (4.6.4)$$

we obtain

$$H = W_0 + \sum_k \hbar\omega_k \left( \hat n_k + \frac{1}{2} \right) \qquad (4.6.1'')$$

with the occupation (number) operator

$$\hat n_k = a_k^\dagger a_k \qquad (4.6.5)$$

and $[a_k, a_{k'}^\dagger] = \delta_{kk'}$, $[a_k, a_{k'}] = [a_k^\dagger, a_{k'}^\dagger] = 0$.

In this form, we can readily generalize the Hamiltonian to three dimensions. In a three-dimensional crystal with one atom per unit cell, there are three lattice vibrations for each wavenumber, one longitudinal ($l$) and two transverse ($t_1$, $t_2$) (see Fig. 4.25). If the unit cell contains $s$ atoms, there are $3s$ lattice vibrational modes. These are composed of the three acoustic modes, whose frequencies vanish at $k = 0$, and the $3(s-1)$ optical phonon modes, whose frequencies are finite at $k = 0$.$^{24}$

Fig. 4.25. The phonon frequencies in a crystal with one atom per unit cell

We shall limit ourselves to the simple case of a single atom per unit cell, i.e. to Bravais-lattice crystals. Then, according to our above considerations, the Hamiltonian is given by:

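$$H = W_0(V) + \sum_{\mathbf{k},\lambda} \hbar\omega_{\mathbf{k},\lambda} \left( \hat n_{\mathbf{k},\lambda} + \frac{1}{2} \right) . \qquad (4.6.6)$$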

Here, we have characterized the lattice vibrations in terms of their wavevector k and their polarization λ. The associated frequency is ωk,λ and the operator for the occupation number is n ˆ k,λ . The potential energy W0 (V ) in the equilibrium lattice locations of the crystal depends on its lattice constant, or, equivalently when the number of particles is ﬁxed, on the volume. For brevity, we combine the wavevector and the polarization into the form k ≡ (k, λ). In a lattice with a total of N atoms, there are 3N vibrational degrees of freedom.

24

See e.g. J. M. Ziman, Principles of the Theory of Solids, 2nd edition, Cambridge University Press, 1972.


4.6.2 Thermodynamic Properties

In analogy to the calculation for photons, we find for the free energy

$$F = -kT \log Z = W_0(V) + \sum_k \left[ \frac{\hbar\omega_k}{2} + kT \log\left( 1 - e^{-\hbar\omega_k/kT} \right) \right] . \qquad (4.6.7)$$

The internal energy is found from

$$E = -T^2 \left( \frac{\partial}{\partial T}\, \frac{F}{T} \right)_V , \qquad (4.6.8)$$

thus

$$E = W_0(V) + \sum_k \frac{\hbar\omega_k}{2} + \sum_k \hbar\omega_k\, \frac{1}{e^{\hbar\omega_k/kT} - 1} . \qquad (4.6.8')$$

It is again expedient for the case of phonons to introduce the normalized density of states

$$g(\omega) = \frac{1}{3N} \sum_k \delta(\omega - \omega_k) , \qquad (4.6.9)$$

where the prefactor has been chosen so that

$$\int_0^\infty d\omega\, g(\omega) = 1 . \qquad (4.6.10)$$

Using the density of states, the internal energy can be written in the form:

$$E = W_0(V) + E_0 + 3N \int_0^\infty d\omega\, g(\omega)\, \frac{\hbar\omega}{e^{\hbar\omega/kT} - 1} , \qquad (4.6.11)$$

where we have used $E_0 = \sum_k \hbar\omega_k/2$ to denote the zero-point energy of the phonons. For the thermodynamic quantities, the precise dependence of the phonon frequencies on wavenumber is not important, but instead only their distribution, i.e. the density of states. Now, in order to determine the thermodynamic quantities such as the internal energy, we first have to calculate the density of states, $g(\omega)$. For small $k$, the frequency of the longitudinal phonons is $\omega_{k,l} = c_l k$, and that of the transverse phonons is $\omega_{k,t} = c_t k$, the latter doubly degenerate; here, $c_l$ and $c_t$ are the longitudinal and transverse velocities of sound. Inserting these expressions into (4.6.9), we find

$$g(\omega) = \frac{1}{3N}\, \frac{V}{2\pi^2} \int dk\, k^2 \left[ \delta(\omega - c_l k) + 2\,\delta(\omega - c_t k) \right] = \frac{V}{N}\, \frac{\omega^2}{6\pi^2} \left( \frac{1}{c_l^3} + \frac{2}{c_t^3} \right) . \qquad (4.6.12)$$

Equation (4.6.12) applies only to low frequencies, i.e. in the range where the phonon dispersion relation is in fact linear. In this frequency range, the density of states is proportional to $\omega^2$, as was also the case for photons. Using (4.6.12), we can now compute the thermodynamic quantities for low temperatures, since in this temperature range only low-frequency phonons are thermally excited. In the high-temperature limit, as we shall see, the detailed shape of the phonon spectrum is unimportant; instead, only the total number of vibrational modes is relevant. We can therefore treat this case immediately, also (Eq. 4.6.14).

At low temperatures only low frequencies contribute, since frequencies $\omega \gg kT/\hbar$ are suppressed by the exponential function in the integral (4.6.11). Thus the low-frequency result (4.6.12) for $g(\omega)$ can be used. Corresponding to the calculation for photons, we find

$$E = W_0(V) + E_0 + \frac{V \pi^2 k^4}{30\,\hbar^3} \left( \frac{1}{c_l^3} + \frac{2}{c_t^3} \right) T^4 . \qquad (4.6.13)$$

At high temperatures, i.e. temperatures which are much higher than $\hbar\omega_{\max}/k$, where $\omega_{\max}$ is the maximum frequency of the phonons, we find for all frequencies at which $g(\omega)$ is nonvanishing that $\left( e^{\hbar\omega/kT} - 1 \right)^{-1} \approx kT/\hbar\omega$, and therefore, it follows from (4.6.11) and (4.6.10) that

$$E = W_0(V) + E_0 + 3NkT . \qquad (4.6.14)$$

Taking the derivative with respect to temperature, we obtain from (4.6.13) and (4.6.14) in the low-temperature limit

$$C_V \sim T^3 ; \qquad (4.6.15)$$

this is Debye's law. In the high-temperature limit, we have

$$C_V \approx 3Nk , \qquad (4.6.16)$$

the law of Dulong–Petit. At low temperatures, the specific heat is proportional to $T^3$, while at high temperatures, it is equal to the number of degrees of freedom times the Boltzmann constant.

In order to determine the specific heat over the whole range of temperatures, we require the normalized density of states $g(\omega)$ for the whole frequency range. The typical shape of $g(\omega)$ for a Bravais crystal$^{24}$ is shown in Fig. 4.26. At small values of $\omega$, the $\omega^2$ behavior is clearly visible. Above the maximum frequency, $g(\omega)$ becomes zero. In intermediate regions, the density of states exhibits characteristic structures, so-called van Hove singularities,$^{24}$ which result from the maxima, minima, and saddle points of the phonon dispersion relation; their typical form is shown in Fig. 4.27. An interpolation formula which is adequate for many purposes can be obtained by approximating the density of states using the Debye approximation:

$$g_D(\omega) = \frac{3\omega^2}{\omega_D^3}\, \Theta(\omega_D - \omega) , \qquad (4.6.17a)$$

with

$$\frac{1}{\omega_D^3} = \frac{1}{18\pi^2}\, \frac{V}{N} \left( \frac{1}{c_l^3} + \frac{2}{c_t^3} \right) . \qquad (4.6.17b)$$

Fig. 4.26. The phonon density of states $g(\omega)$. Solid curve: a realistic density of states; dashed curve: the Debye approximation

Fig. 4.27. A phonon dispersion relation with maxima, minima, and saddle points, which express themselves in the density of states as van Hove singularities

With the aid of (4.6.17a), the low-frequency expression (4.6.12) is extended to cover the whole range of frequencies and is cut off at the so-called Debye frequency $\omega_D$, which is chosen in such a way that (4.6.10) is obeyed. The Debye approximation is also shown in Fig. 4.26. Inserting (4.6.17a) into (4.6.11), we obtain

$$E = W_0(V) + E_0 + 3NkT\, D\!\left( \frac{\hbar\omega_D}{kT} \right) \qquad (4.6.18)$$

with

$$D(x) = \frac{3}{x^3} \int_0^x \frac{dy\, y^3}{e^y - 1} . \qquad (4.6.19)$$
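The Debye function (4.6.19) has no closed form, but it is easy to evaluate numerically. The sketch below is an illustration under stated assumptions: Simpson's rule with an assumed number of subintervals `n`, and hypothetical function names. The heat capacity uses the identity $C_V/3Nk = 4D(x) - 3x/(e^x - 1)$ with $x = \hbar\omega_D/kT$, which follows from differentiating $T\,D(\hbar\omega_D/kT)$ in (4.6.18) with respect to $T$.

```python
import math

def debye_function(x, n=2000):
    """D(x) = (3/x^3) * integral_0^x y^3/(e^y - 1) dy, Eq. (4.6.19).

    Evaluated with Simpson's rule; n (assumed, must be even) is the
    number of subintervals.
    """
    def integrand(y):
        # y^3/(e^y - 1) -> 0 as y -> 0; avoid dividing by expm1(0) = 0
        return 0.0 if y == 0.0 else y ** 3 / math.expm1(y)

    h = x / n
    total = integrand(0.0) + integrand(x)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return 3.0 / x ** 3 * total * h / 3.0

def debye_heat_capacity(x):
    """C_V / (3 N k) as a function of x = hbar*omega_D / (k T)."""
    return 4.0 * debye_function(x) - 3.0 * x / math.expm1(x)

if __name__ == "__main__":
    for x in (0.01, 1.0, 5.0, 50.0):
        print(x, debye_function(x), debye_heat_capacity(x))
```

For $x \to 0$ (high temperatures) the result approaches the Dulong–Petit value 1, and for large $x$ (low temperatures) it approaches $4\pi^4/5x^3$, i.e. the $T^3$ law.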

Taking the temperature derivative of (4.6.18), we obtain an expression for the specific heat, which interpolates between the two limiting cases of the Debye and the Dulong–Petit values (see Fig. 4.28).

Fig. 4.28. The heat capacity of a monatomic insulator. At low temperatures, $C_V \sim T^3$; at high temperatures, it is constant

∗ 4.6.3 Anharmonic Effects and the Mie–Grüneisen Equation of State

So far, we have treated only the harmonic approximation. In fact, the Hamiltonian for phonons in a crystal also contains anharmonic terms, e.g.

$$H_{\rm int} = \sum_{k_1, k_2} c(k_1, k_2)\, Q_{k_1} Q_{k_2} Q_{-k_1 - k_2}$$

with coefficients $c(k_1, k_2)$. Terms of this type and higher powers arise from the expansion of the interaction potential in terms of the displacements of the lattice components. These nonlinear terms are responsible for (i) the thermal expansion of crystals, (ii) the occurrence of a linear term in the specific heat at high $T$, (iii) phonon damping, and (iv) a finite thermal conductivity. These terms are also decisive for structural phase transitions. A systematic treatment of these phenomena requires perturbation-theory methods.

The anharmonic terms have the effect that the frequencies $\omega_k$ depend on the lattice constants, i.e. on the volume $V$ of the crystal. This effect of the anharmonicity can be taken into account approximately by introducing a minor extension to the harmonic theory of the preceding subsection for the derivation of the equation of state. We take the volume derivative of the free energy $F$. In addition to the potential energy $W_0$ of the equilibrium configuration, $\omega_k$ also depends on the volume $V$, owing to the anharmonicities; therefore, we find for the pressure

$$P = -\left( \frac{\partial F}{\partial V} \right)_T = -\frac{\partial W_0}{\partial V} - \sum_k \hbar\omega_k \left[ \frac{1}{2} + \frac{1}{e^{\hbar\omega_k/kT} - 1} \right] \frac{\partial \log \omega_k}{\partial V} . \qquad (4.6.20)$$

For simplicity, we assume that the logarithmic derivative of $\omega_k$ with respect to the volume is the same for all wavenumbers (the Grüneisen assumption):

$$\frac{\partial \log \omega_k}{\partial V} = \frac{1}{V}\, \frac{\partial \log \omega_k}{\partial \log V} = -\gamma\, \frac{1}{V} . \qquad (4.6.21)$$

The material constant $\gamma$ which occurs here is called the Grüneisen constant. The negative sign indicates that the frequencies become smaller on expansion of the lattice. We now insert (4.6.21) into (4.6.20) and compare with (4.6.8'), thus obtaining, with $E_{\rm Ph} = E - W_0$, the Mie–Grüneisen equation of state:

$$P = -\frac{\partial W_0}{\partial V} + \gamma\, \frac{E_{\rm Ph}}{V} . \qquad (4.6.22)$$

This formula applies to insulating crystals in which there are no electronic excitations and the thermal behavior is determined solely by the phonons.

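From the Mie–Grüneisen equation of state, the various thermodynamic derivatives can be obtained, such as the thermal pressure coefficient (3.2.5)

$$\beta = \left( \frac{\partial P}{\partial T} \right)_V = \gamma\, C_V(T)/V \qquad (4.6.23)$$

and the linear expansion coefficient (cf. Appendix I., Table I.3)

$$\alpha_l = \frac{1}{3V} \left( \frac{\partial V}{\partial T} \right)_P , \qquad (4.6.24)$$

which, owing to

$$\left( \frac{\partial P}{\partial T} \right)_V = -\frac{\left( \frac{\partial V}{\partial T} \right)_P}{\left( \frac{\partial V}{\partial P} \right)_T} \equiv \frac{1}{V}\, \frac{\left( \frac{\partial V}{\partial T} \right)_P}{\kappa_T} ,$$

can also be given in the form

$$\alpha_l = \frac{1}{3}\, \beta\, \kappa_T . \qquad (4.6.25)$$

In this last relation, at low temperatures the compressibility can be replaced by

$$\kappa_T(0) = -\frac{1}{V} \left( \frac{\partial V}{\partial P} \right)_{T=0} = \left( V\, \frac{\partial^2 W_0}{\partial V^2} \right)^{-1} . \qquad (4.6.26)$$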

At low temperatures, from Eqns. (4.6.23) and (4.6.25), the coefficient of thermal expansion and the thermal pressure coefficient of an insulator, as well as the specific heat, are proportional to the third power of the temperature: $\alpha \propto \beta \propto T^3$. As a result of the thermodynamic relationship of the specific heats (3.2.24), we find $C_P - C_V \propto T^7$. Therefore, at temperatures below the Debye temperature, the isobaric and the isochoric specific heats are practically equal.

In analogy to the phonons, one can determine the thermodynamic properties of other quasiparticles. Magnons in antiferromagnetic materials likewise have a linear dispersion relation at small values of $k$ and therefore, their contribution to the specific heat is also proportional to $T^3$. Magnons in ferromagnets have a quadratic dispersion relation $\sim k^2$, leading to a specific heat $\sim T^{3/2}$.

4.7 Phonons and Rotons in He II

4.7.1 The Excitations (Quasiparticles) of He II

At the conclusion of our treatment of the Bose–Einstein condensation in 4.4, we discussed the phase diagram of $^4$He. In the He II phase below $T_\lambda = 2.18$ K, $^4$He undergoes a condensation. States with the wavenumber 0 are occupied macroscopically. In the language of second quantization, this means that the expectation value of the field operator $\psi(\mathbf{x})$ is finite. The order parameter here is $\langle \psi(\mathbf{x}) \rangle$.$^{25}$ The excitation spectrum is then quite different from that of a system of free bosons. We shall not enter into the quantum-mechanical theory here, but instead use the experimental results as starting point. At low temperatures, only the lowest excitations are relevant. In Fig. 4.29, we show the excitations as determined by neutron scattering.

Fig. 4.29. The quasiparticle excitations in superﬂuid 4 He: phonons and rotons after Henshaw and Woods.26

The excitation spectrum exhibits the following characteristics: for small values of $p$, the excitation energy depends linearly on the momentum

$$\varepsilon_p = c\,p . \qquad (4.7.1a)$$

In this region, the excitations are called phonons, whose velocity of sound is $c = 238$ m/sec. A second characteristic of the excitation spectrum is a minimum at $p_0/\hbar = 1.91$ Å$^{-1}$. In this range, the excitations are called rotons, and they can be represented by

$$\varepsilon_p = \Delta + \frac{(|\mathbf{p}| - p_0)^2}{2\mu} , \qquad (4.7.1b)$$

with an effective mass $\mu = 0.16\, m_{\rm He}$ and an energy gap $\Delta/k = 8.6$ K. These properties of the dispersion relations will make themselves apparent in the thermodynamic properties.

25 We have
$$a_0\, |\phi_0(N)\rangle = \sqrt{N}\, |\phi_0(N-1)\rangle \approx \sqrt{N}\, |\phi_0(N)\rangle ,$$
$$a_0^\dagger\, |\phi_0(N)\rangle = \sqrt{N+1}\, |\phi_0(N+1)\rangle \approx \sqrt{N}\, |\phi_0(N)\rangle ,$$
since due to the macroscopic occupation of the ground state, $N \gg 1$. See for example QM II, Sect. 3.2.2.

26 D. G. Henshaw and A. D. Woods, Phys. Rev. 121, 1266 (1961)

4.7.2 Thermal Properties

At low temperatures, the number of excitations is small, and their interactions can be neglected. Since the $^4$He atoms are bosons, the quasiparticles in this system are also bosons.$^{27}$ We emphasize that the quasiparticles in Eqns. (4.7.1a) and (4.7.1b) are collective density excitations, which have nothing to do with the motions of individual helium atoms. As a result of the Bose character and due to the fact that the number of quasiparticles is not conserved, i.e. the chemical potential is zero, we find for the mean occupation number

$$n(\varepsilon_p) = \left( e^{\beta\varepsilon_p} - 1 \right)^{-1} . \qquad (4.7.2)$$

From this, the free energy follows:

$$F(T, V) = \frac{kT\,V}{(2\pi\hbar)^3} \int d^3p\, \log\left( 1 - e^{-\beta\varepsilon_p} \right) , \qquad (4.7.3a)$$

and for the average number of quasiparticles

$$N_{QP}(T, V) = \frac{V}{(2\pi\hbar)^3} \int d^3p\; n(\varepsilon_p) \qquad (4.7.3b)$$

and the internal energy

$$E(T, V) = \frac{V}{(2\pi\hbar)^3} \int d^3p\; \varepsilon_p\, n(\varepsilon_p) . \qquad (4.7.3c)$$

At low temperatures, only the phonons and rotons contribute in (4.7.3a) through (4.7.3c), since only they are thermally excited. The contribution of the phonons in this limit is given by

$$F_{\rm ph} = -\frac{\pi^2 V (kT)^4}{90\,(\hbar c)^3} , \qquad \text{or} \qquad E_{\rm ph} = \frac{\pi^2 V (kT)^4}{30\,(\hbar c)^3} . \qquad (4.7.4a,b)$$

From this, we find for the heat capacity at constant volume:

$$C_V = \frac{2\pi^2 V k^4 T^3}{15\,(\hbar c)^3} . \qquad (4.7.4c)$$

27 In contrast, in interacting fermion systems there can be both Fermi and Bose quasiparticles. The particle number of bosonic quasiparticles is in general not fixed. Additional quasiparticles can be created; since the changes in the angular momentum of every quantum-mechanical system must be integral, these excitations must have integral spins.

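Due to the gap in the roton energy (4.7.1b), the roton occupation number at low temperatures $T \leq 2$ K can be approximated by $n(\varepsilon_p) \approx e^{-\beta\varepsilon_p}$, and we find for the average number of rotons

$$N_r \approx \frac{V}{(2\pi\hbar)^3} \int d^3p\; e^{-\beta\varepsilon_p} = \frac{V}{2\pi^2\hbar^3} \int_0^\infty dp\; p^2\, e^{-\beta\varepsilon_p} = \frac{V}{2\pi^2\hbar^3}\, e^{-\beta\Delta} \int_0^\infty dp\; p^2\, e^{-\beta(p - p_0)^2/2\mu}$$

$$\approx \frac{V}{2\pi^2\hbar^3}\, e^{-\beta\Delta}\, p_0^2 \int_{-\infty}^{\infty} dp\; e^{-\beta(p - p_0)^2/2\mu} = \frac{V p_0^2}{2\pi^2\hbar^3} \left( 2\pi\mu kT \right)^{1/2} e^{-\beta\Delta} . \qquad (4.7.5a)$$

The contribution of the rotons to the internal energy is

$$E_r \approx \frac{V}{(2\pi\hbar)^3} \int d^3p\; \varepsilon_p\, e^{-\beta\varepsilon_p} = -\frac{\partial}{\partial\beta}\, N_r = \left( \Delta + \frac{kT}{2} \right) N_r , \qquad (4.7.5b)$$

from which we obtain the specific heat

$$C_r = k \left[ \frac{3}{4} + \frac{\Delta}{kT} + \left( \frac{\Delta}{kT} \right)^2 \right] N_r , \qquad (4.7.5c)$$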

where from (4.7.5a), Nr goes exponentially to zero for T → 0. In Fig. 4.30, the speciﬁc heat is drawn in a log-log plot as a function of the temperature. The straight line follows the T 3 law from Eq. (4.7.4c). Above 0.6 K, the roton contribution (4.7.5c) becomes apparent.
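The crossover between the phonon and roton contributions can be estimated directly from (4.7.4c) and (4.7.5a–c). The following is a rough sketch, not a fit to data: it uses the parameters quoted above ($c$, $\Delta/k$, $p_0/\hbar$, $\mu$) together with an assumed value for the mass of a $^4$He atom, and all function names are illustrative. Quantities are given per unit volume in SI units.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
HBAR = 1.054571817e-34    # reduced Planck constant, J s
M_HE = 6.6464731e-27      # mass of a 4He atom, kg (assumed, not from the text)

C_SOUND = 238.0           # velocity of sound, m/s
DELTA = 8.6 * K_B         # roton gap, J (Delta/k = 8.6 K)
P0 = 1.91e10 * HBAR       # roton momentum, kg m/s (p0/hbar = 1.91 per angstrom)
MU = 0.16 * M_HE          # roton effective mass, kg

def phonon_cv(T):
    """Phonon heat capacity per volume, Eq. (4.7.4c), in J/(K m^3)."""
    return 2.0 * math.pi ** 2 * K_B ** 4 * T ** 3 / (15.0 * (HBAR * C_SOUND) ** 3)

def roton_density(T):
    """Roton number per volume N_r/V, Eq. (4.7.5a), in 1/m^3."""
    return (P0 ** 2 * math.sqrt(2.0 * math.pi * MU * K_B * T)
            * math.exp(-DELTA / (K_B * T)) / (2.0 * math.pi ** 2 * HBAR ** 3))

def roton_energy(T):
    """Roton energy per volume, Eq. (4.7.5b)."""
    return (DELTA + 0.5 * K_B * T) * roton_density(T)

def roton_cv(T):
    """Roton heat capacity per volume, Eq. (4.7.5c)."""
    x = DELTA / (K_B * T)
    return K_B * (0.75 + x + x * x) * roton_density(T)

if __name__ == "__main__":
    for T in (0.3, 0.6, 1.0, 1.5):
        print(T, phonon_cv(T), roton_cv(T), roton_cv(T) / phonon_cv(T))
```

With these inputs the roton term is a small correction at 0.6 K and exceeds the phonon term at 1 K, in line with the behavior described for Fig. 4.30.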

Fig. 4.30. The speciﬁc heat of helium II under the saturated vapor pressure (Wiebes, NielsHakkenberg and Kramers).

∗ 4.7.3 Superfluidity and the Two-Fluid Model

The condensation of helium and the resulting quasiparticle dispersion relation (Eq. 4.7.1a,b, Fig. 4.29) have important consequences for the dynamic behavior of 4 He in its He II phase. Superﬂuidity and its description in terms of the two-ﬂuid model are among them. To see this, we consider the ﬂow of helium through a tube in two diﬀerent inertial frames. In frame K, the tube is at rest and the liquid is ﬂowing at the velocity −v. In frame K0 , we suppose the helium to be at rest, while the tube moves with the velocity v (see Fig. 4.31).

Fig. 4.31. Superﬂuid helium in the rest frame of the tube, K, and in the rest frame of the liquid, K0

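The total energies ($E$, $E_0$) and the total momenta ($\mathbf{P}$, $\mathbf{P}_0$) of the liquid in the two frames ($K$, $K_0$) are related by a Galilei transformation:

$$\mathbf{P} = \mathbf{P}_0 - M\mathbf{v} \qquad (4.7.6a)$$

$$E = E_0 - \mathbf{P}_0 \cdot \mathbf{v} + \frac{M v^2}{2} . \qquad (4.7.6b)$$

Here, we have used the notation

$$\sum_i \mathbf{p}_i = \mathbf{P} , \qquad \sum_i \mathbf{p}_{i0} = \mathbf{P}_0 , \qquad \sum_i m_i = M . \qquad (4.7.6c)$$

One can derive (4.7.6a,b) by applying the Galilei transformation for the individual particles

$$\mathbf{x}_i = \mathbf{x}_{i0} - \mathbf{v}t , \qquad \mathbf{p}_i = \mathbf{p}_{i0} - m\mathbf{v} .$$

This gives for the total momentum

$$\mathbf{P} = \sum_i \mathbf{p}_i = \sum_i (\mathbf{p}_{i0} - m\mathbf{v}) = \mathbf{P}_0 - M\mathbf{v}$$

and for the total energy

$$E = \sum_i \frac{1}{2m}\, \mathbf{p}_i^2 + \sum_{\langle i,j \rangle} V(\mathbf{x}_i - \mathbf{x}_j) = \sum_i \frac{m}{2} \left( \frac{\mathbf{p}_{i0}}{m} - \mathbf{v} \right)^2 + \sum_{\langle i,j \rangle} V(\mathbf{x}_{i0} - \mathbf{x}_{j0})$$

$$= \sum_i \frac{\mathbf{p}_{i0}^2}{2m} - \mathbf{P}_0 \cdot \mathbf{v} + \frac{M}{2}\, v^2 + \sum_{\langle i,j \rangle} V(\mathbf{x}_{i0} - \mathbf{x}_{j0}) = E_0 - \mathbf{P}_0 \cdot \mathbf{v} + \frac{M}{2}\, v^2 .$$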
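The transformation rules (4.7.6a,b) can be verified numerically for a set of particles with arbitrary momenta. The sketch below is illustrative only: the particle number, mass, velocity and random seed are assumptions, and the function name is hypothetical. The interaction energy depends only on coordinate differences and is therefore the same in both frames, so only the kinetic part is compared.

```python
import random

def galilei_check(n=50, m=1.5, v=(0.3, -0.2, 0.7), seed=0):
    """Compare directly transformed energy and momentum with Eqs. (4.7.6a,b).

    Returns (E_direct, E_formula, P_direct, P_formula) for n particles of
    equal mass m with random momenta in the frame K0.
    """
    rng = random.Random(seed)
    p0 = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(n)]
    total_mass = n * m

    # Total momentum and kinetic energy in K0
    P0 = [sum(p[c] for p in p0) for c in range(3)]
    E0 = sum(sum(pc * pc for pc in p) for p in p0) / (2.0 * m)

    # Transform every particle: p_i = p_i0 - m v
    p = [[pi[c] - m * v[c] for c in range(3)] for pi in p0]
    E_direct = sum(sum(pc * pc for pc in pi) for pi in p) / (2.0 * m)
    P_direct = [sum(pi[c] for pi in p) for c in range(3)]

    # Right-hand sides of (4.7.6a) and (4.7.6b)
    v2 = sum(vc * vc for vc in v)
    E_formula = E0 - sum(P0[c] * v[c] for c in range(3)) + 0.5 * total_mass * v2
    P_formula = [P0[c] - total_mass * v[c] for c in range(3)]
    return E_direct, E_formula, P_direct, P_formula

if __name__ == "__main__":
    print(galilei_check())
```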

In an ordinary liquid, any flow which might initially be present is damped by friction. Seen from the frame $K_0$, this means that in the liquid, excitations are created which move along with the walls of the tube, so that in the course of time more and more of the liquid is pulled along with the moving tube. Seen from the tube frame $K$, this process implies that the flowing liquid is slowed down. The energy of the liquid must simultaneously decrease in order for such excitations to occur at all. We now need to investigate whether for the particular excitation spectrum of He II, Fig. 4.29, the flowing liquid can reduce its energy by the creation of excitations. Is it energetically favorable to excite quasiparticles? We first consider helium at the temperature $T = 0$, i.e. in its ground state. In the ground state, energy and momentum in the frame $K_0$ are given by

$$E_0^g \quad \text{and} \quad \mathbf{P}_0 = 0 . \qquad (4.7.7a)$$

It follows for these quantities in the frame $K$:

$$E^g = E_0^g + \frac{M v^2}{2} \quad \text{and} \quad \mathbf{P} = -M\mathbf{v} . \qquad (4.7.7b)$$

If a quasiparticle with momentum $\mathbf{p}$ and energy $\varepsilon_p$ is created, the energy and the momentum in the frame $K_0$ have the values

$$E_0 = E_0^g + \varepsilon_p \quad \text{and} \quad \mathbf{P}_0 = \mathbf{p} , \qquad (4.7.7c)$$

and from (4.7.6a,b) we find for the energy in the frame $K$:

$$E = E_0^g + \varepsilon_p - \mathbf{p} \cdot \mathbf{v} + \frac{M v^2}{2} \quad \text{and} \quad \mathbf{P} = \mathbf{p} - M\mathbf{v} . \qquad (4.7.7d)$$

The excitation energy in $K$ (the tube frame) is thus

$$\Delta E = \varepsilon_p - \mathbf{p} \cdot \mathbf{v} . \qquad (4.7.8)$$

$\Delta E$ is the energy change of the liquid due to the appearance of an excitation in frame $K$. Only when $\Delta E < 0$ does the flowing liquid reduce its energy. Since $\varepsilon - \mathbf{p}\cdot\mathbf{v}$ is a minimum when $\mathbf{p}$ is parallel to $\mathbf{v}$, the inequality

$$v > \frac{\varepsilon}{p} \qquad (4.7.9a)$$

must be obeyed for an excitation to occur. From (4.7.9a) we find the critical velocity (Fig. 4.32)

$$v_c = \left( \frac{\varepsilon}{p} \right)_{\min} \approx 60 \ \text{m/sec} . \qquad (4.7.9b)$$

If the flow velocity is smaller than $v_c$, no quasiparticles will be excited and the liquid flows unimpeded and loss-free through the tube. This phenomenon is called superfluidity. The occurrence of a finite critical velocity is closely connected to the shape of the excitation spectrum, which has a finite group velocity at $p = 0$ and is everywhere greater than zero (Fig. 4.32).

Fig. 4.32. Quasiparticles and the critical velocity

The value (4.7.9b) of the critical velocity is observed for the motion of ions in He II. The critical velocity for flow in capillaries is much smaller than $v_c$, since vortices occur already at lower velocities; we have not considered these excitations here.

A corresponding argument holds also for the formation of additional excitations at nonzero temperatures. At finite temperatures, thermal excitations of quasiparticles are present. What effect do they have? The quasiparticles will be in equilibrium with the moving tube and will, in the frame $K_0$, have the average velocity $\mathbf{v}$ of the tube. The condensate, i.e. the superfluid component, is at rest in $K_0$. The quasiparticles have momentum $\mathbf{p}$ and an excitation energy of $\varepsilon_p$ in $K_0$. The mean number of these quasiparticles is $n(\varepsilon_p - \mathbf{p}\cdot\mathbf{v})$. (One has to apply the equilibrium distribution functions in the frame in which the quasiparticle gas is at rest, and there, the excitation energy is $\varepsilon_p - \mathbf{p}\cdot\mathbf{v}$.) The momentum of the quasiparticle gas in $K_0$ is given by

$$\mathbf{P}_0 = \frac{V}{(2\pi\hbar)^3} \int d^3p\; \mathbf{p}\; n(\varepsilon_p - \mathbf{p}\cdot\mathbf{v}) . \qquad (4.7.10)$$

For low velocities, we can expand (4.7.10) in terms of $\mathbf{v}$. Using $\int d^3p\; \mathbf{p}\, n(\varepsilon_p) = 0$ and terminating the expansion at first order in $\mathbf{v}$, we find

$$\mathbf{P}_0 \approx \frac{-V}{(2\pi\hbar)^3} \int d^3p\; \mathbf{p}\,(\mathbf{p}\cdot\mathbf{v})\, \frac{\partial n}{\partial \varepsilon_p} = \frac{-V}{(2\pi\hbar)^3}\; \mathbf{v}\; \frac{1}{3} \int d^3p\; p^2\, \frac{\partial n}{\partial \varepsilon_p} ,$$

where $\int d^3p\; p_i p_j\, f(|\mathbf{p}|) = \frac{1}{3}\,\delta_{ij} \int d^3p\; p^2 f(|\mathbf{p}|)$ was used. At low $T$, it suffices to take the phonon contribution in this equation into account, i.e.

$$\mathbf{P}_{0,\rm ph} = -\frac{4\pi V}{(2\pi\hbar)^3}\, \frac{1}{3c^5}\; \mathbf{v} \int_0^\infty d\varepsilon\; \varepsilon^4\, \frac{\partial n}{\partial \varepsilon} . \qquad (4.7.11)$$

After integration by parts and replacement of $4\pi \int d\varepsilon\, \varepsilon^2/c^3$ by $\int d^3p$, one obtains

$$\mathbf{P}_{0,\rm ph} = \frac{V}{(2\pi\hbar)^3}\, \frac{4}{3c^2}\; \mathbf{v} \int d^3p\; \varepsilon_p\, n(\varepsilon_p) .$$

We write this result in the form

$$\mathbf{P}_{0,\rm ph} = V \rho_{n,\rm ph}\, \mathbf{v} , \qquad (4.7.12)$$

where we have defined the normal fluid density by

$$\rho_{n,\rm ph} = \frac{4}{3}\, \frac{E_{\rm ph}}{V c^2} = \frac{2\pi^2}{45}\, \frac{(kT)^4}{\hbar^3 c^5} ; \qquad (4.7.13)$$

compare (4.7.4b). In (4.7.13), the phonon contribution to $\rho_n$ is evaluated. The contribution of the rotons is given by

$$\rho_{n,r} = \frac{p_0^2}{3kT}\, \frac{N_r}{V} . \qquad (4.7.14)$$

Eq. (4.7.14) follows from (4.7.10) using similar approximations as in the determination of $N_r$ in Eq. (4.7.5a). One calls $\rho_n = \rho_{n,\rm ph} + \rho_{n,r}$ the mass density of the normal component. Only this portion of the density reaches equilibrium with the walls. Using (4.7.10) and (4.7.12), the total momentum per unit volume, $\mathbf{P}_0/V$, is found to be given by

$$\mathbf{P}_0/V = \rho_n \mathbf{v} . \qquad (4.7.15)$$

We now carry out a Galilei transformation from the frame $K_0$, in which the condensate is at rest, to a frame in which the condensate is moving at the velocity $\mathbf{v}_s$. The quasiparticle gas, i.e. the normal component, has the velocity $\mathbf{v}_n = \mathbf{v} + \mathbf{v}_s$ in this reference frame. The momentum is found from (4.7.15) by adding $\rho\mathbf{v}_s$ due to the Galilei transformation:

$$\mathbf{P}/V = \rho\mathbf{v}_s + \rho_n\mathbf{v} .$$

If we substitute $\mathbf{v} = \mathbf{v}_n - \mathbf{v}_s$, we can write the momentum in the form

$$\mathbf{P}/V = \rho_s\mathbf{v}_s + \rho_n\mathbf{v}_n , \qquad (4.7.16)$$

where the superfluid density is defined by

$$\rho_s = \rho - \rho_n . \qquad (4.7.17)$$

Similarly, the free energy in the frame $K_0$ can be calculated, and from it, by means of a Galilei transformation, the free energy per unit volume of the flowing liquid in the frame in which the superfluid component is moving at $\mathbf{v}_s$ (problem 4.23):

$$F(T, V, v_s, v_n)/V = F(T, V)/V + \frac{1}{2}\,\rho_s v_s^2 + \frac{1}{2}\,\rho_n v_n^2 , \qquad (4.7.18)$$

where the free energy of the liquid at rest, $F(T, V)$, is given by (4.7.3a) and the relations which follow it.
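The two contributions (4.7.13) and (4.7.14) to the normal-fluid density can be evaluated as functions of temperature. This is a minimal sketch under stated assumptions: the mass of a $^4$He atom is an assumed input not given in the text, the roton number is taken from (4.7.5a), and the function names are illustrative. The formulas are low-temperature approximations and are not expected to hold near $T_\lambda$.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
HBAR = 1.054571817e-34    # reduced Planck constant, J s
M_HE = 6.6464731e-27      # mass of a 4He atom, kg (assumed, not from the text)

C_SOUND = 238.0           # velocity of sound, m/s
DELTA = 8.6 * K_B         # roton gap, J
P0 = 1.91e10 * HBAR       # roton momentum, kg m/s
MU = 0.16 * M_HE          # roton effective mass, kg

def rho_n_phonon(T):
    """Phonon part of the normal-fluid density, Eq. (4.7.13), in kg/m^3."""
    return 2.0 * math.pi ** 2 * (K_B * T) ** 4 / (45.0 * HBAR ** 3 * C_SOUND ** 5)

def roton_density(T):
    """Roton number per volume N_r/V, Eq. (4.7.5a), in 1/m^3."""
    return (P0 ** 2 * math.sqrt(2.0 * math.pi * MU * K_B * T)
            * math.exp(-DELTA / (K_B * T)) / (2.0 * math.pi ** 2 * HBAR ** 3))

def rho_n_roton(T):
    """Roton part of the normal-fluid density, Eq. (4.7.14), in kg/m^3."""
    return P0 ** 2 * roton_density(T) / (3.0 * K_B * T)

def rho_n(T):
    """Total normal-fluid density rho_n = rho_n,ph + rho_n,r."""
    return rho_n_phonon(T) + rho_n_roton(T)

if __name__ == "__main__":
    for T in (0.3, 0.6, 1.0, 1.5):
        print(T, rho_n_phonon(T), rho_n_roton(T), rho_n(T))
```

In this approximation the phonons dominate $\rho_n$ at the lowest temperatures, while the rotons take over well below 1 K because of the large factor $p_0^2$.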

Fig. 4.33. The superﬂuid and the normal density ρs and ρn in He II as functions of the temperature, measured using the motion of a torsional oscillator by Andronikaschvili.

The hydrodynamic behavior of the helium in the He II phase is as would be expected if the helium consisted of two fluids, a normal fluid with the density $\rho_n$, which reaches equilibrium with obstacles such as the inner wall of a tube in which it is flowing, and a superfluid with the density $\rho_s$, which flows without resistance. When $T \to 0$, $\rho_s \to \rho$ and $\rho_n \to 0$; for $T \to T_\lambda$, $\rho_s \to 0$ and $\rho_n \to \rho$. This theoretical picture, the two-fluid model of Tisza and Landau, was experimentally confirmed by Andronikaschvili, among others (Fig. 4.33). It provides the theoretical basis for the fascinating macroscopic properties of superfluid helium.
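The critical velocity (4.7.9b) can be estimated from the roton branch (4.7.1b) alone, since on the phonon branch $\varepsilon/p = c = 238$ m/sec is much larger. The sketch below is an illustration under assumptions: the mass of a $^4$He atom is an assumed input, and the function names are hypothetical. Minimizing $\varepsilon_p/p$ for the roton parabola gives $p_{\min}^2 = p_0^2 + 2\mu\Delta$; the code compares this closed form with a brute-force scan.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
HBAR = 1.054571817e-34    # reduced Planck constant, J s
M_HE = 6.6464731e-27      # mass of a 4He atom, kg (assumed, not from the text)

DELTA = 8.6 * K_B         # roton gap, J
P0 = 1.91e10 * HBAR       # roton momentum, kg m/s
MU = 0.16 * M_HE          # roton effective mass, kg

def roton_energy(p):
    """Roton dispersion relation, Eq. (4.7.1b)."""
    return DELTA + (p - P0) ** 2 / (2.0 * MU)

def critical_velocity():
    """Minimum of eps_p / p on the roton branch, from d(eps/p)/dp = 0."""
    p_min = math.sqrt(P0 ** 2 + 2.0 * MU * DELTA)
    return roton_energy(p_min) / p_min

def critical_velocity_scan(n=20000):
    """Same minimum found by scanning p over [0.5 p0, 1.5 p0] on n steps."""
    best = float("inf")
    for i in range(n + 1):
        p = P0 * (0.5 + i / n)
        best = min(best, roton_energy(p) / p)
    return best

if __name__ == "__main__":
    print(critical_velocity(), critical_velocity_scan())
```

With these inputs the result lies slightly below $\Delta/p_0$ and is consistent with the value of about 60 m/sec quoted in (4.7.9b).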

Problems for Chapter 4

4.1 Demonstrate the validity of equations (4.3.24a) and (4.3.24b).

4.2 Show that the entropy of an ideal Bose (Fermi) gas can be formulated as follows:

$$S = k \sum_p \Big( -\langle n_p \rangle \log \langle n_p \rangle \pm \big( 1 \pm \langle n_p \rangle \big) \log \big( 1 \pm \langle n_p \rangle \big) \Big) .$$

Consider this expression in the classical limit, also, as well as in the limit $T \to 0$.

4.3 Calculate $C_V$, $C_P$, $\kappa_T$, and $\alpha$ for ideal Bose and Fermi gases in the limit of extreme dilution up to the order $\hbar^3$.

4.4 Estimate the Fermi energies (in eV) and the Fermi temperatures (in K) for the following systems (in the free-particle approximation: $\varepsilon_F = \frac{\hbar^2}{2m} \left( \frac{N}{V} \right)^{2/3} \left( \frac{6\pi^2}{g} \right)^{2/3}$):
(a) Electrons in metal
(b) Neutrons in a heavy nucleus
(c) $^3$He in liquid $^3$He ($V/N = 46.2$ Å$^3$).

4.5 Consider a one-dimensional electron gas ($S = 1/2$), consisting of $N$ particles confined to the interval $(0, L)$.
(a) What are the values of the Fermi momentum $p_F$ and the Fermi energy $\varepsilon_F$?
(b) Calculate, in analogy to Sect. 4.3, $\mu = \mu(T, N/L)$.
Result: $p_F = \frac{\pi\hbar N}{L}$, $\mu = \varepsilon_F \left[ 1 + \frac{\pi^2}{12} \left( \frac{kT}{\varepsilon_F} \right)^2 + O(T^4) \right]$.
Give a qualitative explanation of the different sign of the temperature dependence when compared to the three-dimensional case.

4.6 Calculate the chemical potential $\mu(T, N/V)$ for a two-dimensional Fermi gas.

4.7 Determine the mean square deviation $(\Delta N)^2 = \langle N^2 \rangle - \langle N \rangle^2$ of the number of electrons for an electron gas in the limit of zero temperature.

4.8 Calculate the isothermal compressibility (Eq. (4.3.18)) of the electron gas at low temperatures, starting from the formula (4.3.14') for the pressure, $P = \frac{2}{5}\, \frac{\varepsilon_F N}{V} + \frac{\pi^2}{6}\, \frac{(kT)^2}{\varepsilon_F}\, \frac{N}{V}$. Compare with the mean square deviation of the particle number found in problem 4.7.

4.9 Compute the free energy of the nearly degenerate Fermi gas, as well as α and CP .

4.10 Calculate for a completely relativistic Fermi gas (εp = pc) (a) the grand potential Φ (b) the thermal equation of state (c) the speciﬁc heat CV . Consider also the limiting case of very low temperatures.

4.11 (a) Calculate the ground state energy of a relativistic electron gas, $E_p = \sqrt{(m_e c^2)^2 + (pc)^2}$, in a white dwarf star, which contains $N$ electrons and $N/2$ helium nuclei (at rest), and give the zero-point pressure for the two limiting cases

$$x_F \ll 1: \quad P_0 = \frac{1}{5}\, \frac{m_e c^2}{v}\, x_F^2 ; \qquad x_F \gg 1: \quad P_0 = \frac{1}{4}\, \frac{m_e c^2}{v}\, x_F \left( 1 - \frac{1}{x_F^2} \right) ; \qquad x_F = \frac{p_F}{m_e c} .$$

How does the pressure depend on the radius $R$ of the star?

(b) Derive the relation between the mass $M$ of the star and its radius $R$ for the two cases $x_F \ll 1$ and $x_F \gg 1$, and show that a white dwarf can have no greater mass than

$$M_0 = \frac{9 m_p}{64} \sqrt{\frac{3\pi}{\alpha^3}} \left( \frac{\hbar c}{G m_p^2} \right)^{3/2} .$$

$\alpha \sim 1$
$G = 6.7 \times 10^{-8}$ dyn cm$^2$ g$^{-2}$ (gravitational constant)
$m_p = 1.7 \times 10^{-24}$ g (proton mass)

(c) If a star of a given mass $M = 2 m_p N$ is compressed to a (finite) radius $R$, then its energy is reduced by the self-energy $E_g$ of gravitation, which for a homogeneous mass distribution has the form $E_g = -\alpha G M^2/R$, where $\alpha$ is a number of the order of 1. From

$$\frac{dE_0}{dV} + \frac{dE_g}{dR}\, \frac{dR}{dV} = 0$$

you can determine the equilibrium radius, with $dE_0 = -P_0(R)\, 4\pi R^2\, dR$ as the differential of the ground-state energy.

4.12 Show that in a two-dimensional ideal Bose gas, there can be no Bose–Einstein condensation.

4.13 Prove the formulas (4.4.10) and (4.4.11) for the entropy and the specific heat of an ideal Bose gas.

4.14 Compute the internal energy of the ideal Bose gas for $T < T_c(v)$. From the result, determine the specific heat (heat capacity) and compare it with Eq. (4.4.11).

4.15 Show for bosons with $\varepsilon_p = a p^s$ and $\mu = 0$ that the specific heat at low temperatures varies as $T^{3/s}$ in three dimensions. In the special case of $s = 2$, this yields the specific heat of a ferromagnet where these bosons are spin waves.

4.16 Show that the maximum in Planck's formula for the energy distribution $u(\omega)$ is at $\hbar\omega_{\max} = 2.82\, kT$; see (4.5.10).

4.17 Confirm that the energy flux $I_E(T)$ which is emitted by a black body of temperature $T$ into one hemisphere is given by (Eq. (4.5.16))

$$I_E(T) \equiv \frac{\text{energy emitted}}{\text{cm}^2\ \text{sec}} = \frac{cE}{4V} = \sigma T^4 ,$$

starting from the energy current density

$$\mathbf{j}_E = \frac{1}{V} \sum_{\mathbf{p},\lambda} c\, \frac{\mathbf{p}}{p}\, \varepsilon_p \langle n_{\mathbf{p},\lambda} \rangle .$$

The energy flux $I_E$ per unit area through a surface element $d\mathbf{f}$ is $\mathbf{j}_E \cdot \frac{d\mathbf{f}}{|d\mathbf{f}|}$.

4.18 The energy flux which reaches the Earth from the Sun is equal to $b = 0.136$ Joule sec$^{-1}$ cm$^{-2}$ (without absorption losses, for perpendicular incidence). $b$ is called the solar constant.
(a) Show that the total emission from the Sun is equal to $4 \times 10^{26}$ Joule sec$^{-1}$.
(b) Calculate the surface temperature of the Sun under the assumption that it radiates as a black body ($T \sim 6000$ K).
$R_S = 7 \times 10^{10}$ cm, $R_{SE} = 1$ AU $= 1.5 \times 10^{13}$ cm

4.19 Phonons in a solid: calculate the contribution of the so-called optical phonons to the specific heat of a solid, taking the dispersion relation of the vibrations to be $\varepsilon(k) = \hbar\omega_E$ (Einstein model).

4.20 Calculate the frequency distribution corresponding to Equation (4.6.17a) for a one- or a two-dimensional lattice. How does the specific heat behave at low temperatures in these cases? (Examples of low-dimensional systems are selenium (one-dimensional chains) and graphite (layered structure).)

4.21 The pressure of a solid is given by $P = -\frac{\partial W_0}{\partial V} + \gamma\, \frac{E_{\rm ph}}{V}$ (see (4.6.22)). Show, under the assumption that $W_0(V) = (V - V_0)^2/2\chi_0 V_0$ for $V \sim V_0$ and $\chi_0 C_V T \ll V_0$, that the thermal expansion (at constant $P \sim 0$) can be expressed as

$$\alpha \equiv \frac{1}{V}\, \frac{\partial V}{\partial T} = \frac{\gamma \chi_0 C_V}{V_0} \qquad \text{and} \qquad C_P - C_V = \frac{\gamma^2 \chi_0 C_V^2\, T}{V_0} .$$

4.22 Specific heat of metals: compare the contributions of phonons and electrons. Show that the linear contribution to the specific heat becomes predominant only at $T < T^* = 0.14\, \theta_D \sqrt{\theta_D/T_F}$. Estimate $T^*$ for typical values of $\theta_D$ and $T_F$.

4.23 Superfluid helium: show that in a coordinate frame in which the superfluid component is at rest, the free energy $F = E - TS$ is given by

$$\Phi_v + \rho_n v^2 , \qquad \text{where} \qquad \Phi_v = \frac{1}{\beta} \sum_p \log\left[ 1 - e^{-\beta(\varepsilon_p - \mathbf{p}\cdot\mathbf{v})} \right] .$$

Expand $\Phi_v$ and show also that in the system in which the superfluid component is moving at a velocity $\mathbf{v}_s$,

$$F = \Phi_0 + \frac{\rho_s v_s^2}{2} + \frac{\rho_n v_n^2}{2} ; \qquad \mathbf{v}_n = \mathbf{v} + \mathbf{v}_s .$$

Hint: In determining the free energy $F$, note that the distribution function $n$ for the quasiparticles with energy $\varepsilon_p$ is equal to $n(\varepsilon_p - \mathbf{p}\cdot\mathbf{v})$.

4.24 Ideal Bose and Fermi gases in the canonical ensemble:
(a) Calculate the canonical partition function for ideal Bose and Fermi gases.
(b) Calculate the average occupation number in the canonical ensemble.
Suggestion: instead of $Z_N$, compute the quantity

$$Z(x) = \sum_{N=0}^{\infty} x^N Z_N$$

and determine $Z_N$ using $Z_N = \frac{1}{2\pi i} \oint \frac{Z(x)}{x^{N+1}}\, dx$, where the path in the complex $x$ plane encircles the origin, but includes no singularities of $Z(x)$. Use the saddle-point method for evaluating the integral.

4.25 Calculate the chemical potential $\mu$ for the atomic limit of the Hubbard model,

$$H = U \sum_{i=1}^{N} n_{i\uparrow} n_{i\downarrow} ,$$

where $n_{i\uparrow} = c_{i\uparrow}^\dagger c_{i\uparrow}$ is the number operator for electrons in the state $i$ (at lattice site $i$) and $\sigma = +\frac{1}{2}$. (In the general case, which is not under consideration here, the Hubbard model is given by:

$$H = \sum_{ij\sigma} t_{ij}\, c_{i\sigma}^\dagger c_{j\sigma} + U \sum_i n_{i\uparrow} n_{i\downarrow} . )$$

5. Real Gases, Liquids, and Solutions

In this chapter, we consider real gases, that is we take the interactions of the atoms or molecules and their structures into account. In the ﬁrst section, the extension from the classical ideal gas will involve only including the internal degrees of freedom. In the second section, we consider mixtures of such ideal gases. The following sections take the interactions of the molecules into account, leading to the virial expansion and the van der Waals theory of the liquid and the gaseous phases. We will pay special attention to the transition between these two phases. In the ﬁnal section, we investigate mixtures. This chapter also contains references to every-day physics. It touches on bordering areas with applications in physical chemistry, biology, and technology.

5.1 The Ideal Molecular Gas 5.1.1 The Hamiltonian and the Partition Function We consider a gas consisting of N molecules, enumerated by the index n. In addition to their translational degrees of freedom, which we take to be classical as before, we now must consider the internal degrees of freedom (rotation, vibration, electronic excitation). The mutual interactions of the molecules will be neglected. The overall Hamiltonian contains the translational energy (kinetic energy of the molecular motion) and the Hamiltonian for the internal degrees of freedom Hi,n , summed over all the molecules: H=

N 2 p

n

n=1

2m

+ Hi,n

.

(5.1.1)

The eigenvalues of Hi,n are the internal energy levels εi,n . The partition function is given by P VN 3 3 − n p2n /2mkT Z(T, V, N ) = d p . . . d p e e−εi,n /kT . 1 N (2π)3N N ! n ε i,n

The classical treatment of the translational degrees of freedom, represented by the partition integral over momenta, is justified when the specific volume is much larger than the cube of the thermal wavelength $\lambda = 2\pi\hbar/\sqrt{2\pi mkT}$ (Chap. 4). Since the internal energy levels $\varepsilon_{i,n} \equiv \varepsilon_i$ are identical for all of the molecules, it follows that
$$Z(T,V,N) = \frac{1}{N!}\left[Z_{\rm tr}(1)\, Z_i\right]^N = \frac{1}{N!}\left[\frac{V}{\lambda^3}\, Z_i\right]^N , \tag{5.1.2}$$
where $Z_i = \sum_{\varepsilon_i} e^{-\varepsilon_i/kT}$ is the partition function over the internal degrees of freedom and $Z_{\rm tr}(1)$ is the translational partition integral for a single molecule. From (5.1.2), we find the free energy, using the Stirling approximation for large N:
$$F = -kT \log Z \approx -NkT \left[ 1 + \log\frac{V}{N\lambda^3} + \log Z_i \right]. \tag{5.1.3}$$
From (5.1.3), we obtain the equation of state

$$P = -\left(\frac{\partial F}{\partial V}\right)_{T,N} = \frac{NkT}{V}\,, \tag{5.1.4}$$

which is the same as that of a monatomic gas, since the internal degrees of freedom do not depend on V . For the entropy, we have

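$$S = -\left(\frac{\partial F}{\partial T}\right)_{V,N} = Nk\left[\frac{5}{2} + \log\frac{V}{N\lambda^3} + \log Z_i + T\,\frac{\partial \log Z_i}{\partial T}\right], \tag{5.1.5a}$$
and from it, we obtain the internal energy,
$$E = F + TS = NkT\left[\frac{3}{2} + T\,\frac{\partial \log Z_i}{\partial T}\right]. \tag{5.1.5b}$$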

The caloric equation of state (5.1.5b) is altered by the internal degrees of freedom compared to that of a monatomic gas. Likewise, the internal degrees of freedom express themselves in the heat capacity at constant volume,

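$$C_V = \left(\frac{\partial E}{\partial T}\right)_{V,N} = Nk\left[\frac{3}{2} + \frac{\partial}{\partial T}\left(T^2\,\frac{\partial \log Z_i}{\partial T}\right)\right]. \tag{5.1.6}$$
Finally, we give also the chemical potential for later applications:
$$\mu = \left(\frac{\partial F}{\partial N}\right)_{T,V} = -kT \log\frac{V Z_i}{N\lambda^3}\,; \tag{5.1.5c}$$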

it agrees with $\mu = \frac{1}{N}(F + PV)$, since we are dealing with a homogeneous system. To continue the evaluation, we need to investigate the contributions due to the internal degrees of freedom. The energy levels of the internal degrees of freedom are composed of three contributions:
$$\varepsilon_i = \varepsilon_{\rm el} + \varepsilon_{\rm rot} + \varepsilon_{\rm vib}\,. \tag{5.1.7}$$

Here, $\varepsilon_{\rm el}$ refers to the electronic energy including the Coulomb repulsion of the nuclei relative to the energy of widely separated atoms. $\varepsilon_{\rm rot}$ is the rotational energy and $\varepsilon_{\rm vib}$ is the vibrational energy of the molecules. We consider diatomic molecules containing two different atoms (e.g. HCl; for identical atoms, cf. Sect. 5.1.4). Then the rotational energy has the form¹
$$\varepsilon_{\rm rot} = \frac{\hbar^2\, l(l+1)}{2I}\,, \tag{5.1.8a}$$

where $l$ is the angular momentum quantum number and $I = m_{\rm red} R_0^2$ the moment of inertia, depending on the reduced mass $m_{\rm red}$ and the distance between the atomic nuclei, $R_0$.² The vibrational energy $\varepsilon_{\rm vib}$ takes the form¹
$$\varepsilon_{\rm vib} = \hbar\omega\left(n + \frac{1}{2}\right), \tag{5.1.8b}$$
where $\omega$ is the frequency of the molecular vibration and $n = 0, 1, 2, \ldots$. The electronic energy levels $\varepsilon_{\rm el}$ can be compared to the dissociation energy $\varepsilon_{\rm Diss}$. Since we want to consider non-dissociated molecules, i.e. we require that $kT \ll \varepsilon_{\rm Diss}$, and on the other hand the excitation energies of the lowest electronic levels are of the same order of magnitude as $\varepsilon_{\rm Diss}$, it follows from the condition $kT \ll \varepsilon_{\rm Diss}$ that the electrons must be in their ground state, whose energy we denote by $\varepsilon^0_{\rm el}$. Then we have

$$Z_i = \exp\left(-\frac{\varepsilon^0_{\rm el}}{kT}\right) Z_{\rm rot}\, Z_{\rm vib}\,. \tag{5.1.9}$$
We now consider in that order the rotational part $Z_{\rm rot}$ and the vibrational part $Z_{\rm vib}$ of the partition function.

5.1.2 The Rotational Contribution

Since the rotational energy $\varepsilon_{\rm rot}$ (5.1.8a) does not depend on the quantum number $m$ (the z component of the angular momentum), the sum over $m$ just yields a factor $(2l+1)$, and only the sum over $l$ remains, which runs over all the natural numbers:
$$Z_{\rm rot} = \sum_{l=0}^{\infty} (2l+1) \exp\left(-\frac{l(l+1)\,\Theta_r}{2T}\right). \tag{5.1.10}$$

¹ In general, the moment of inertia $I$ and the vibration frequency $\omega$ depend on $l$. The latter dependence leads to a coupling of the rotational and the vibrational degrees of freedom. For the following evaluation we have assumed that these dependences are weak and can be neglected.
² See e.g. QM I

Here, we have introduced the characteristic temperature
$$\Theta_r = \frac{\hbar^2}{Ik}\,. \tag{5.1.11}$$

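We next consider two limiting cases:

$T \ll \Theta_r$: At low temperatures, only the smallest values of $l$ contribute in (5.1.10):
$$Z_{\rm rot} = 1 + 3\, e^{-\Theta_r/T} + 5\, e^{-3\Theta_r/T} + O\!\left(e^{-6\Theta_r/T}\right). \tag{5.1.12}$$

$T \gg \Theta_r$: At high temperatures, the sum must be carried out over all $l$ values, leading to
$$Z_{\rm rot} = 2\,\frac{T}{\Theta_r} + \frac{1}{3} + \frac{1}{30}\,\frac{\Theta_r}{T} + O\!\left(\left(\frac{\Theta_r}{T}\right)^2\right). \tag{5.1.13}$$

To prove (5.1.13), one uses the Euler–MacLaurin summation formula³
$$\sum_{l=0}^{\infty} f(l) = \int_0^{\infty} dl\, f(l) + \frac{1}{2} f(0) + \sum_{k=1}^{n-1} \frac{(-1)^k B_k}{(2k)!}\, f^{(2k-1)}(0) + R_n\,, \tag{5.1.14}$$
for the special case that $f(\infty) = f'(\infty) = \ldots = 0$; $R_n$ denotes the remainder. The first Bernoulli numbers $B_n$ are given by $B_1 = \frac{1}{6}$, $B_2 = \frac{1}{30}$. The first term in (5.1.14) yields just the classical result
$$\int_0^{\infty} dl\, f(l) = \int_0^{\infty} dl\, (2l+1) \exp\left(-\frac{l(l+1)}{2}\,\frac{\Theta_r}{T}\right) = 2\int_0^{\infty} dx\, e^{-x\,\Theta_r/T} = 2\,\frac{T}{\Theta_r}\,, \tag{5.1.15}$$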

which one would also obtain by treating the rotational energy classically instead of quantum-mechanically.⁴ The further terms are found via
$$f(0) = 1\,, \qquad f'(0) = 2 - \frac{\Theta_r}{2T}\,, \qquad f'''(0) = -6\,\frac{\Theta_r}{T} + 3\left(\frac{\Theta_r}{T}\right)^2 - \frac{1}{8}\left(\frac{\Theta_r}{T}\right)^3 ,$$
from which, using (5.1.14), we obtain the expansion (5.1.13).
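The two limiting forms (5.1.12) and (5.1.13) can be compared with a direct numerical evaluation of the sum (5.1.10). The following is a minimal sketch, not part of the original derivation; the function names and the cutoff `lmax` are illustrative choices, and the temperature is measured in units of $\Theta_r$.

```python
import math

def z_rot_exact(t, lmax=200):
    """Direct sum (5.1.10); t = T / Theta_r. lmax is an assumed cutoff."""
    return sum((2 * l + 1) * math.exp(-l * (l + 1) / (2.0 * t))
               for l in range(lmax + 1))

def z_rot_high(t):
    """High-temperature expansion (5.1.13), truncated after the Theta_r/T term."""
    return 2.0 * t + 1.0 / 3.0 + 1.0 / (30.0 * t)

def z_rot_low(t):
    """Low-temperature expansion (5.1.12), keeping l = 0, 1, 2."""
    return 1.0 + 3.0 * math.exp(-1.0 / t) + 5.0 * math.exp(-3.0 / t)

if __name__ == "__main__":
    for t in (0.2, 0.5, 2.0, 5.0):
        print(t, z_rot_exact(t), z_rot_low(t), z_rot_high(t))
```

Printing the three columns side by side shows where each approximation is usable; neither expansion is expected to be accurate near $T \approx \Theta_r$.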

From (5.1.12) and (5.1.13), we find for the logarithm of the partition function after expanding:
$$\log Z_{\rm rot} = \begin{cases} 3\, e^{-\Theta_r/T} - \dfrac{9}{2}\, e^{-2\Theta_r/T} + O\!\left(e^{-3\Theta_r/T}\right) & T \ll \Theta_r \\[2ex] \log\dfrac{2T}{\Theta_r} + \dfrac{\Theta_r}{6T} + \dfrac{1}{360}\left(\dfrac{\Theta_r}{T}\right)^2 + O\!\left(\left(\dfrac{\Theta_r}{T}\right)^3\right) & T \gg \Theta_r \,. \end{cases} \tag{5.1.16a}$$

³ Whittaker, Watson, A Modern Course of Analysis, Cambridge at the Clarendon Press; V. I. Smirnow, A Course of Higher Mathematics, Pergamon Press, Oxford 1964: Vol. III, Part 2, p. 290.
⁴ See e.g. A. Sommerfeld, Thermodynamics and Statistical Physics, Academic Press, NY 1950: $Z_{\rm rot} = \dfrac{4\pi I^2}{(2\pi\hbar)^2} \displaystyle\int d\omega_1 \int d\omega_2\; e^{-\frac{\beta I}{2}(\omega_1^2 + \omega_2^2)} = \dfrac{2IkT}{\hbar^2}$.

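From this result, the contribution of the rotational degrees of freedom to the internal energy can be calculated:
$$E_{\rm rot} = NkT^2\,\frac{\partial}{\partial T} \log Z_{\rm rot} = \begin{cases} 3Nk\,\Theta_r \left( e^{-\Theta_r/T} - 3\, e^{-2\Theta_r/T} + \ldots \right) & T \ll \Theta_r \\[2ex] NkT \left( 1 - \dfrac{\Theta_r}{6T} - \dfrac{1}{180}\left(\dfrac{\Theta_r}{T}\right)^2 + \ldots \right) & T \gg \Theta_r \,. \end{cases} \tag{5.1.16b}$$
The contribution to the heat capacity at constant volume is then
$$C_V^{\rm rot} = Nk \begin{cases} 3\left(\dfrac{\Theta_r}{T}\right)^2 e^{-\Theta_r/T} \left( 1 - 6\, e^{-\Theta_r/T} + \ldots \right) & T \ll \Theta_r \\[2ex] 1 + \dfrac{1}{180}\left(\dfrac{\Theta_r}{T}\right)^2 + \ldots & T \gg \Theta_r \,. \end{cases} \tag{5.1.16c}$$
In Fig. 5.1, we show the rotational contribution to the specific heat.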

Fig. 5.1. The rotational contribution to the speciﬁc heat

At low temperatures, the rotational degrees of freedom are not thermally excited. Only at T ≈ Θr /2 do the rotational levels contribute. At high temperatures, i.e. in the classical region, the two rotational degrees of freedom make a contribution of 2kT /2 to the internal energy. Only with the aid of quantum mechanics did it become possible to understand why, in contradiction to the equipartition theorem of classical physics, the speciﬁc heat per molecule can diﬀer from the number of degrees of freedom multiplied by k/2. The rotational contribution to the speciﬁc heat has a maximum of 1.1 at the temperature 0.81 Θr /2 . For HCl, Θr /2 is found to be 15.02 K.
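The position and height of the maximum quoted above can be checked numerically. The sketch below evaluates the rotational specific heat per molecule from the energy fluctuations of the levels (5.1.8a), $C_V^{\rm rot}/Nk = (\langle\varepsilon^2\rangle - \langle\varepsilon\rangle^2)/(kT)^2$, and scans a temperature grid for the maximum. The grid, the cutoff `lmax` and all names are assumptions made for this illustration.

```python
import math

def c_rot(t, lmax=100):
    """Rotational heat capacity per molecule in units of k.

    t = T / (Theta_r / 2); level energies are l(l+1) in units of k*Theta_r/2.
    """
    z = e1 = e2 = 0.0
    for l in range(lmax + 1):
        eps = l * (l + 1)
        w = (2 * l + 1) * math.exp(-eps / t)   # degeneracy times Boltzmann factor
        z += w
        e1 += w * eps
        e2 += w * eps * eps
    mean = e1 / z
    return (e2 / z - mean * mean) / (t * t)

# Coarse scan for the maximum (assumed grid: 0.2 <= t < 2.2 in steps of 0.001).
grid = [0.2 + 0.001 * i for i in range(2000)]
t_max = max(grid, key=c_rot)
c_max = c_rot(t_max)

if __name__ == "__main__":
    print("maximum of C_rot/Nk:", c_max, "at T/(Theta_r/2) =", t_max)
```

The same function approaches the classical value 1 (in units of $k$) at high temperatures, in line with (5.1.16c).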


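5.1.3 The Vibrational Contribution

We now come to the vibrational contribution, for which we introduce a characteristic temperature defined by
$$\hbar\omega = k\,\Theta_v\,. \tag{5.1.17}$$
We obtain the well-known partition function of a harmonic oscillator
$$Z_{\rm vib} = \sum_{n=0}^{\infty} e^{-\left(n + \frac{1}{2}\right)\Theta_v/T} = \frac{e^{-\Theta_v/2T}}{1 - e^{-\Theta_v/T}}\,, \tag{5.1.18}$$
whose logarithm is given by $\log Z_{\rm vib} = -\frac{\Theta_v}{2T} - \log\left(1 - e^{-\Theta_v/T}\right)$. From it, we find for the internal energy:
$$E_{\rm vib} = Nk\,T^2\,\frac{\partial}{\partial T} \log Z_{\rm vib} = Nk\,\Theta_v \left[ \frac{1}{2} + \frac{1}{e^{\Theta_v/T} - 1} \right], \tag{5.1.19a}$$
and for the vibrational contribution to the heat capacity at constant volume
$$C_V^{\rm vib} = Nk\,\frac{\Theta_v^2}{T^2}\,\frac{e^{\Theta_v/T}}{\left(e^{\Theta_v/T} - 1\right)^2} = Nk\,\frac{\Theta_v^2}{T^2}\,\frac{1}{\left[2\sinh(\Theta_v/2T)\right]^2}\,. \tag{5.1.19b}$$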

At low and high temperatures, from (5.1.19b) we obtain the limiting cases
$$\frac{C_V^{\rm vib}}{Nk} = \begin{cases} \left(\dfrac{\Theta_v}{T}\right)^2 e^{-\Theta_v/T} + \ldots & T \ll \Theta_v \\[2ex] 1 - \dfrac{1}{12}\left(\dfrac{\Theta_v}{T}\right)^2 + \ldots & T \gg \Theta_v \,. \end{cases} \tag{5.1.19c}$$
The excited vibrational energy levels are noticeably populated only at temperatures above $\Theta_v$. The specific heat (5.1.19b) is shown in Fig. 5.2.
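As a small illustration of (5.1.19b) and its limits (5.1.19c), the vibrational specific heat can be evaluated directly. This is only a sketch; the function names are chosen here, and the temperature is measured in units of $\Theta_v$.

```python
import math

def c_vib(t):
    """C_V^vib / (N k) from (5.1.19b); t = T / Theta_v.

    Uses the sinh form; for extremely small t (x = 1/t above roughly 1400)
    math.sinh would overflow, which this sketch does not guard against.
    """
    x = 1.0 / t
    return (x / (2.0 * math.sinh(x / 2.0))) ** 2

def c_vib_low(t):
    """Leading low-temperature term of (5.1.19c)."""
    x = 1.0 / t
    return x * x * math.exp(-x)

def c_vib_high(t):
    """High-temperature form of (5.1.19c)."""
    return 1.0 - 1.0 / (12.0 * t * t)

if __name__ == "__main__":
    for t in (0.1, 0.5, 1.0, 10.0):
        print(t, c_vib(t), c_vib_low(t), c_vib_high(t))
```

At $T = \Theta_v$ the exact expression is already within about ten percent of the classical value, while the low-temperature form is only useful well below $\Theta_v$.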

Fig. 5.2. The vibrational part of the speciﬁc heat (Eq. (5.1.19b))

The contribution of the electronic energy ε0el to the partition function, free energy, internal energy, entropy, and to the chemical potential is, from (5.1.9):

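$$Z_{\rm el} = e^{-\varepsilon^0_{\rm el}/kT}\,, \quad F_{\rm el} = N\varepsilon^0_{\rm el}\,, \quad E_{\rm el} = N\varepsilon^0_{\rm el}\,, \quad S_{\rm el} = 0\,, \quad \mu_{\rm el} = \varepsilon^0_{\rm el}\,. \tag{5.1.20}$$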

These contributions play a role in chemical reactions, where the (outer) electronic shells of the atoms undergo complete restructuring. In a diatomic molecular gas, there are three degrees of freedom due to translation, two degrees of freedom of rotation, and one vibrational degree of freedom, which counts double ($E = \frac{p^2}{2m} + \frac{m}{2}\omega^2 x^2$; kinetic and potential energy each contribute $\frac{1}{2}kT$). The classical specific heat is therefore $7k/2$, as is observed experimentally at high temperatures. All together, this gives the temperature dependence of the specific heat as shown in Fig. 5.3. The curve is not continued down to a temperature of T = 0, since there the approximation of a classical ideal gas is certainly no longer valid.

Fig. 5.3. The speciﬁc heat of a molecular gas at constant volume (schematic)

The rotational levels correspond to a wavelength of λ = 0.1 to 1 cm and lie in the far infrared and microwave regions, while the vibrational levels at wavelengths of λ = 2 × 10⁻³ to 3 × 10⁻³ cm are in the infrared. The corresponding energies are 10⁻³ to 10⁻⁴ eV and 0.06 to 0.04 eV, resp. (Fig. 5.4). One electron volt corresponds to about 11000 K (1 K ≙ 0.86171 × 10⁻⁴ eV). Some values of Θr and Θv are collected in Table 5.1. In more complicated molecules, there are three rotational degrees of freedom and more vibrational degrees of freedom (for n atoms, in general 3n − 6 vibrational degrees of freedom, and for linear molecules, 3n − 5). In precise experiments, the coupling between the vibrational and rotational degrees of freedom and the anharmonicities in the vibrational degrees of freedom are also detected.
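The conversions between wavelength, energy and temperature used in the preceding paragraph follow from $E = hc/\lambda$ and $T = E/k$. A short sketch of this arithmetic is given below; the value of $hc$ in eV cm is a standard constant supplied here, the Boltzmann constant is the one quoted in the text, and the function names are illustrative.

```python
HC_EV_CM = 1.23984e-4        # h*c in eV*cm (standard value, not from the text)
K_B_EV_PER_K = 0.86171e-4    # Boltzmann constant in eV/K, as quoted in the text

def photon_energy_ev(wavelength_cm):
    """Energy in eV of a quantum with the given wavelength in cm."""
    return HC_EV_CM / wavelength_cm

def temperature_k(energy_ev):
    """Temperature in K corresponding to an energy in eV."""
    return energy_ev / K_B_EV_PER_K

if __name__ == "__main__":
    for lam in (1.0, 0.1, 3e-3, 2e-3):   # wavelengths in cm from the text
        e = photon_energy_ev(lam)
        print(f"lambda = {lam:g} cm  ->  {e:.3g} eV  ->  {temperature_k(e):.3g} K")
```

The rotational wavelengths reproduce the 10⁻⁴ to 10⁻³ eV range and the vibrational ones the 0.04 to 0.06 eV range stated above.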

              H2      HD      D2      HCl     O2
Θr/2 [K]      85      64      43      15      2
Θv [K]        6100    5300    4300    4100    2200

Table 5.1. The values of Θr/2 and Θv for several molecules

Fig. 5.4. The structure of the rotational and vibrational levels (schematic)

∗5.1.4 The Influence of the Nuclear Spin

We emphasize from the outset that here, we make the assumption that the electronic ground state has zero orbital and spin angular momenta. For nuclei A and B, which have different nuclear spins $S_A$ and $S_B$, one obtains an additional factor in the partition function, $(2S_A+1)(2S_B+1)$, i.e. $Z_i \to (2S_A+1)(2S_B+1)\,Z_i$. This leads to an additional term in the free energy per molecule of $-kT\log[(2S_A+1)(2S_B+1)]$, and to a contribution of $k\log[(2S_A+1)(2S_B+1)]$ to the entropy, i.e. a change of the chemical constants by $\log[(2S_A+1)(2S_B+1)]$ (see Eq. (3.9.29) and (5.2.5′)). As a result, the internal energy and the specific heats remain unchanged. For molecules such as H2, D2, O2 which contain identical atoms, one must observe the Pauli principle. We consider the case of H2, where the spin of the individual nuclei is $S_N = 1/2$.

Ortho hydrogen molecule: Nuclear spin triplet ($S_{\rm tot} = 1$); the spatial wavefunction of the nuclei is antisymmetric ($l$ = odd (u)),
$$Z_u = \sum_{l\ {\rm odd\,(u)}} (2l+1) \exp\left(-\frac{l(l+1)}{2}\,\frac{\Theta_r}{T}\right). \tag{5.1.21a}$$

Para hydrogen molecule: Nuclear spin singlet ($S_{\rm tot} = 0$); the spatial wavefunction of the nuclei is symmetric ($l$ = even (g)),
$$Z_g = \sum_{l\ {\rm even\,(g)}} (2l+1) \exp\left(-\frac{l(l+1)}{2}\,\frac{\Theta_r}{T}\right). \tag{5.1.21b}$$


In complete equilibrium, we have $Z = 3Z_u + Z_g$. At T = 0, the equilibrium state is the ground state $l = 0$, i.e. a para state. In fact, owing to the slowness of the transition between the two spin states at T = 0, a mixture of ortho and para hydrogen will be obtained. At high temperatures, $Z_u \approx Z_g \approx \frac{1}{2} Z_{\rm rot} = \frac{T}{\Theta_r}$ holds and the mixing ratio of ortho to para hydrogen is 3:1. If we start from this state and cool the sample, then, leaving ortho-para conversion out of consideration, H2 consists of a mixture of two types of molecules: $\frac{3}{4}N$ ortho and $\frac{1}{4}N$ para hydrogen, and the partition function of this (metastable) non-equilibrium state is
$$Z = (Z_u)^{3/4}\,(Z_g)^{1/4}\,. \tag{5.1.22}$$

Then for the specific heat, we obtain
$$C_V^{\rm rot} = \frac{3}{4}\, C_{V\,{\rm o}}^{\rm rot} + \frac{1}{4}\, C_{V\,{\rm p}}^{\rm rot}\,. \tag{5.1.23}$$

In Fig. 5.5, the rotational parts of the specific heat in the metastable state (3/4 ortho and 1/4 para), as well as for the case of complete equilibrium, are shown. The establishment of equilibrium can be accelerated by using catalysts.

Fig. 5.5. The rotational part of the speciﬁc heat of diatomic molecules such as H2 : equilibrium (solid curve), metastable mixture (dashed)
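The equilibrium ortho-to-para ratio $3Z_u/Z_g$ implied by (5.1.21a,b) can be evaluated numerically to see how it approaches 3:1 at high temperatures and vanishes at low temperatures. The following sketch assumes complete equilibrium; the function names and the cutoff `lmax` are illustrative, and the temperature is measured in units of $\Theta_r$.

```python
import math

def z_parity(t, parity, lmax=200):
    """Sum (5.1.21a) for parity=1 (odd l) or (5.1.21b) for parity=0 (even l).

    t = T / Theta_r; lmax is an assumed cutoff of the sum.
    """
    return sum((2 * l + 1) * math.exp(-l * (l + 1) / (2.0 * t))
               for l in range(parity, lmax + 1, 2))

def ortho_para_ratio(t):
    """Equilibrium ratio of ortho to para hydrogen, 3 Z_u / Z_g."""
    return 3.0 * z_parity(t, 1) / z_parity(t, 0)

if __name__ == "__main__":
    for t in (0.1, 0.5, 1.0, 2.0, 20.0):
        print(t, ortho_para_ratio(t))
```

With $\Theta_r/2 = 85$ K for H2 (Table 5.1), room temperature corresponds to $t \approx 1.7$, where the ratio is already close to its high-temperature limit.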

In deuterium molecules, D2, the nuclear spin per atom is S = 1,⁵ which can couple in the molecule to ortho deuterium with a total spin of 2 or 0 and para deuterium with a total spin of 1. The degeneracy of these states is 6 and 3. The associated orbital angular momenta are even (g) and odd (u). The partition function, corresponding to Eq. (5.1.21a-b), is given by $Z = 6Z_g + 3Z_u$.

⁵ QM I, page 187

∗5.2 Mixtures of Ideal Molecular Gases

In this section, we investigate the thermodynamic properties of mixtures of molecular gases. The different types of particles (elements), of which there are supposed to be $n$, are enumerated by the index $j$. Then $N_j$ refers to the particle number, $\lambda_j = h/(2\pi m_j kT)^{1/2}$ is the thermal wavelength, $c_j = N_j/N$ the concentration, $\varepsilon^0_{{\rm el},j}$ the electronic ground state energy, $Z_j$ the overall partition function (see (5.1.2)), and $Z_{i,j}$ the partition function for the internal degrees of freedom of the particles of type $j$. Here, in contrast to (5.1.9), the electronic part is separated out. The total number of particles is $N = \sum_j N_j$. The overall partition function of this non-interacting system is
$$Z = \prod_{j=1}^{n} Z_j\,, \tag{5.2.1}$$

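and from it we find the free energy
$$F = -kT \sum_j N_j \left[ 1 + \log\frac{V Z_{i,j}}{N_j \lambda_j^3} \right] + \sum_j \varepsilon^0_{{\rm el},j}\, N_j\,. \tag{5.2.2}$$
From (5.2.2), we obtain the pressure, $P = -\left(\frac{\partial F}{\partial V}\right)_{T,\{N_j\}}$,
$$P = \frac{kT}{V} \sum_j N_j = \frac{kTN}{V}\,. \tag{5.2.3}$$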

The equation of state (5.2.3) is identical to that of the monatomic ideal gas, since the pressure is due to the translational degrees of freedom. For the chemical potential µj of the component j (Sect. 3.9.1), we ﬁnd

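$$\mu_j = \left(\frac{\partial F}{\partial N_j}\right)_{T,V} = -kT \log\frac{V Z_{i,j}}{N_j \lambda_j^3} + \varepsilon^0_{{\rm el},j}\,; \tag{5.2.4}$$
or, if we use the pressure from (5.2.3) instead of the volume,
$$\mu_j = -kT \log\frac{kT\, Z_{i,j}}{c_j P \lambda_j^3} + \varepsilon^0_{{\rm el},j}\,. \tag{5.2.4′}$$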

We now assume that the rotational degrees of freedom are completely unfrozen, but not the vibrational degrees of freedom ($\Theta_r \ll T \ll \Theta_v$). Then inserting $Z_{i,j} = Z_{{\rm rot},j} = \frac{2T}{\Theta_{r,j}}$ (see Eq. (5.1.13)) into (5.2.4′) yields
$$\mu_j = \varepsilon^0_{{\rm el},j} - \frac{7}{2}\, kT \log kT - kT \log\frac{m_j^{3/2}}{2^{1/2}\pi^{3/2}\hbar^3\, k\Theta_{r,j}} + kT \log c_j P\,. \tag{5.2.5}$$
We have taken the fact that the masses and the characteristic temperatures depend on the type of particle $j$ into consideration here. The pressure enters the chemical potential of the component $j$ in the combination $c_j P = P_j$ (partial pressure). The chemical potential (5.2.5) is a special case of the general form
$$\mu_j = \varepsilon^0_{{\rm el},j} - c_{P,j}\, T \log kT - kT\,\zeta_j + kT \log c_j P\,. \tag{5.2.5′}$$

For diatomic molecules in the temperature range mentioned above, $c_{P,j} = 7k/2$. The $\zeta_j$ are called chemical constants; they enter into the law of mass action (see Chap. 3.9.3). For the entropy, we find
$$S = -\sum_j \left(\frac{\partial \mu_j}{\partial T}\right)_{P,\{N_i\}} N_j = \sum_j \left( c_{P,j} \log kT + c_{P,j} + k\zeta_j - k \log c_j P \right) N_j\,, \tag{5.2.6}$$

from which one can see that the coefficient $c_{P,j}$ is the specific heat at constant pressure of the component $j$.

Remarks to Sections 5.1 and 5.2: In the preceding sections, we have described the essential effects of the internal degrees of freedom of molecular gases. We now add some supplementary remarks about additional effects which depend upon the particular atomic structure.

(i) We first consider monatomic gases. The only internal degrees of freedom are electronic. In the noble gases, the electronic ground state has L = S = 0 and is thus not degenerate. The excited levels lie about 20 eV above the ground state, corresponding to a temperature of about 200,000 K; in practice, they are therefore not thermally populated, and all the atoms remain in their ground state. One can also say that the electronic degrees of freedom are “frozen out”. The nuclear spin $S_N$ leads to a degeneracy factor $(2S_N + 1)$. Relative to pointlike classical particles, the partition function contains an additional factor $(2S_N + 1)\,e^{-\varepsilon_0/kT}$, which gives rise to a contribution to the free energy of $\varepsilon_0 - kT\log(2S_N + 1)$. This leads to an additional term of $k\log(2S_N + 1)$ in the entropy, but not to a change in the specific heat.

(ii) The excitation energies of other atoms are not as high as in the case of the noble gases, e.g. 2.1 eV for Na, corresponding to 24,000 K, but still, the excited states are not thermally populated. When the electronic shell of the atom has a nonzero S, but still L = 0, this leads together with the nuclear spin to a degeneracy factor of $(2S+1)(2S_N+1)$. The free energy then contains the additional term $\varepsilon_0 - kT\log((2S_N+1)(2S+1))$ with the consequences discussed above. Here, to be sure, one must consider the magnetic interaction between the nuclear and the electronic moments, which leads to the hyperfine interaction. In hydrogen, for example, this is of the order of 6 × 10⁻⁶ eV, leading to the well-known 21 cm line. The corresponding characteristic temperature is 0.07 K.
The hyperfine splitting can therefore be completely neglected in the gas phase.

(iii) In the case that both the spin S and the orbital angular momentum L are nonzero, the ground state is $(2S+1)(2L+1)$-fold degenerate; this degeneracy is partially lifted by the spin-orbit coupling. The energy eigenvalues depend on the total angular momentum J, which takes on values between $S + L$ and $|S - L|$. For example, monatomic halogens in their ground state have $S = \frac{1}{2}$ and $L = 1$, according to Hund’s first two rules. Because of the spin-orbit coupling, in the ground state $J = \frac{3}{2}$, and the levels with $J = \frac{1}{2}$ have a higher energy. For chlorine, for example, the doubly-degenerate $^2P_{1/2}$ level lies δε = 0.11 eV above the 4-fold degenerate $^2P_{3/2}$ ground state level. This corresponds to a temperature of δε/k = 1270 K. The partition function now contains a factor $Z_{\rm el} = 4\,e^{-\varepsilon_0/kT} + 2\,e^{-(\varepsilon_0+\delta\varepsilon)/kT}$ due to the internal fine-structure degrees of freedom, which leads to an additional term in the free energy of $-kT\log Z_{\rm el} = \varepsilon_0 - kT\log\left(4 + 2\,e^{-\delta\varepsilon/kT}\right)$. This yields the following electronic contribution to the specific heat:
$$C_V^{\rm el} = Nk\,\frac{2\left(\frac{\delta\varepsilon}{kT}\right)^2 e^{\delta\varepsilon/kT}}{\left(2\,e^{\delta\varepsilon/kT} + 1\right)^2}\,.$$
For $T \ll \delta\varepsilon/k$, $Z_{\rm el} = 4\,e^{-\varepsilon_0/kT}$, only the four lowest levels are populated, and $C_V^{\rm el} = 0$. For $T \gg \delta\varepsilon/k$, $Z_{\rm el} = 6\,e^{-\varepsilon_0/kT}$, and all six levels are equally occupied, so that $C_V^{\rm el} = 0$. For temperatures between these extremes, $C_V^{\rm el}$ passes through a maximum at about δε/k. Both at low and at high temperatures, the fine structure levels express themselves only in the degeneracy factors, but do not contribute to the specific heat. One should note however that monatomic Cl is present only at very high temperatures, and otherwise bonds to give Cl2.

(iv) In diatomic molecules, in many cases the lowest electronic state is not degenerate and the excited electronic levels are far from $\varepsilon_0$. The internal partition function contains only the factor $e^{-\varepsilon_0/kT}$ due to the electrons. There are, however, molecules which have a finite orbital angular momentum Λ or spin. This is the case in NO, for example. Since the orbital angular momentum has two possible orientations relative to the molecular axis, a factor of 2 in the partition function results. A finite electronic spin leads to a factor $(2S+1)$. For $S \neq 0$ and $\Lambda \neq 0$, there are again fine-structure effects which can be of the right order of magnitude to influence the thermodynamic properties. The resulting expressions take the same form as those in Remark (iii). A special case is that of the oxygen molecule, O2.
Its ground state $^3\Sigma$ has zero orbital angular momentum and spin S = 1; it is thus a triplet without fine structure. The first excited level $^1\Delta$ is doubly degenerate and lies relatively near at δε = 0.97 eV ≙ 11300 K, so that it can be populated at high temperatures. These electronic configurations lead to a factor of $e^{-\varepsilon_0/kT}\left(3 + 2\,e^{-\delta\varepsilon/kT}\right)$ in the partition function, with the consequences discussed in Remark (iii).
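The electronic specific heat of Remark (iii), for a 4-fold degenerate ground level and a 2-fold degenerate level at δε above it, can be evaluated numerically to illustrate that it vanishes in both limits and passes through a single maximum in between. The sketch below rewrites the formula in terms of $e^{-\delta\varepsilon/kT}$ to avoid overflow; the function name and the scan grid are assumptions made for this illustration.

```python
import math

def c_el(t):
    """C_V^el / (N k) for degeneracies 4 (ground) and 2 (excited).

    t = k T / delta_eps. Algebraically identical to
    2 x^2 e^x / (2 e^x + 1)^2 with x = 1/t, written with e^{-x}.
    """
    x = 1.0 / t
    ex = math.exp(-x)
    return 2.0 * x * x * ex / (2.0 + ex) ** 2

# Coarse scan for the maximum (assumed grid: 0.05 <= t < 3.05, step 0.001).
grid = [0.05 + 0.001 * i for i in range(3000)]
t_peak = max(grid, key=c_el)
c_peak = c_el(t_peak)

if __name__ == "__main__":
    print("peak of C_el/Nk:", c_peak, "at kT/delta_eps =", t_peak)
```

In this two-level model the maximum lies at a temperature of the order of δε/k, somewhat below δε/k itself, and its height is a fraction of $k$ per atom.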

5.3 The Virial Expansion

5.3.1 Derivation

We now investigate a real gas, in which the particles interact with each other. In this case, the partition function can no longer be exactly calculated. For its evaluation, as a first step we will describe the virial expansion, an expansion in terms of the density. The grand partition function $Z_G$ can be decomposed into the contributions for 0, 1, 2, etc. particles
$$Z_G = {\rm Tr}\; e^{-(H - \mu N)/kT} = 1 + Z(T,V,1)\, e^{\mu/kT} + Z(T,V,2)\, e^{2\mu/kT} + \ldots\,, \tag{5.3.1}$$
where $Z_N \equiv Z(T,V,N)$ represents the partition function for N particles.


From it, we obtain the grand potential, making use of the Taylor series expansion of the logarithm
$$\Phi = -kT \log Z_G = -kT\left[ Z_1\, e^{\mu/kT} + \left( Z_2 - \frac{1}{2} Z_1^2 \right) e^{2\mu/kT} + \ldots \right], \tag{5.3.2}$$

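where the logarithm has been expanded in powers of the fugacity $z = e^{\mu/kT}$. Taking the derivatives of (5.3.2) with respect to the chemical potential, we obtain the mean particle number
$$\bar{N} = -\left(\frac{\partial \Phi}{\partial \mu}\right)_{T,V} = Z_1\, e^{\mu/kT} + 2\left( Z_2 - \frac{1}{2} Z_1^2 \right) e^{2\mu/kT} + \ldots\,. \tag{5.3.3}$$
Eq. (5.3.3) can be solved iteratively for $e^{\mu/kT}$, with the result
$$e^{\mu/kT} = \frac{\bar{N}}{Z_1} - \frac{2\left( Z_2 - \frac{1}{2} Z_1^2 \right)}{Z_1}\left(\frac{\bar{N}}{Z_1}\right)^2 + \ldots\,. \tag{5.3.4}$$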

Eq. (5.3.4) represents a series expansion of $e^{\mu/kT}$ in terms of the density, since $Z_1 \sim V$. Inserting (5.3.4) into Φ has the effect that Φ is given in terms of $T, V, \bar{N}$ instead of its natural variables $T, V, \mu$, which is favorable for constructing the equation of state:
$$\Phi = -kT\left[ \bar{N} - \left( Z_2 - \frac{1}{2} Z_1^2 \right)\frac{\bar{N}^2}{Z_1^2} + \ldots \right]. \tag{5.3.5}$$

These are the first terms of the so called virial expansion. By application of the Gibbs–Duhem relation $\Phi = -PV$, one can go from it directly to the expansion of the equation of state in terms of the particle number density $\rho = \bar{N}/V$:
$$P = kT\rho\left[ 1 + B(T)\,\rho + C(T)\,\rho^2 + \ldots \right]. \tag{5.3.6}$$
The coefficient of $\rho^n$ in square brackets is called the (n+1)th virial coefficient. The leading correction to the equation of state of an ideal gas is determined by the second virial coefficient
$$B = -\left( Z_2 - \frac{1}{2} Z_1^2 \right) V / Z_1^2\,. \tag{5.3.7}$$

This expression holds both in classical and in quantum mechanics. Note: in the classical limit the integrations over momentum can be carried out, and (5.3.1) is simplified as follows:
$$Z_G(T,V,\mu) = \sum_{N=0}^{\infty} \frac{e^{\beta\mu N}}{N!\,\lambda^{3N}}\; Q(T,V,N)\,. \tag{5.3.8}$$

Here, $Q(T,V,N)$ is the configurational part of the partition function $Z(T,V,N)$:
$$Q(T,V,N) = \int_V d^{3N}x\; e^{-\beta\sum_{i<j} v_{ij}} = \int_V d^{3N}x \prod_{i<j} (1 + f_{ij}) = \int_V d^{3N}x \left[ 1 + (f_{12} + f_{13} + \ldots) + (f_{12} f_{13} + \ldots) + \ldots \right] \tag{5.3.9}$$
with $f_{ij} = e^{-\beta v_{ij}} - 1$. In this expression, $\sum_{i<j} \equiv \frac{1}{2}\sum_i \sum_{j\neq i}$ refers to the sum over all pairs of particles. One can see from this that the virial expansion represents an expansion in terms of $r_0^3/v$, where $r_0$ is the range of the potential. The classical expansion is valid for $\lambda \ll r_0 \ll v^{1/3}$; see Eqs. (B.39a) and (B.39b) in Appendix B. Equation (5.3.9) can be used as the basis of a systematic graph-theoretical expansion (Ursell and Mayer 1939).

5.3.2 The Classical Approximation for the Second Virial Coefficient

In the case of a classical gas, one finds for the partition function for N particles
$$Z_N = \frac{1}{N!\,h^{3N}} \int d^3p_1 \ldots d^3p_N \int d^3x_1 \ldots d^3x_N\; e^{-\left( \sum_i \mathbf{p}_i^2/2m + v(\mathbf{x}_1,\ldots,\mathbf{x}_N) \right)/kT}\,; \tag{5.3.10a}$$
after integrating over the 3N momenta, this becomes
$$Z_N = \frac{1}{\lambda^{3N} N!} \int d^3x_1 \ldots d^3x_N\; e^{-v(\mathbf{x}_1,\ldots,\mathbf{x}_N)/kT}\,, \tag{5.3.10b}$$

where $v(\mathbf{x}_1,\ldots,\mathbf{x}_N)$ is the total potential of the N particles. The integrals over $\mathbf{x}_i$ are restricted to the volume V. If no external potential is present, and the system is translationally invariant, so that the two-particle interaction depends only upon $\mathbf{x}_1 - \mathbf{x}_2$, we find from (5.3.10b)
$$Z_1 = \frac{1}{\lambda^3} \int d^3x_1\; e^0 = \frac{V}{\lambda^3} \tag{5.3.11a}$$
and
$$Z_2 = \frac{1}{2\lambda^6} \int d^3x_1\, d^3x_2\; e^{-v(\mathbf{x}_1 - \mathbf{x}_2)/kT} = \frac{V}{2\lambda^6} \int d^3y\; e^{-v(\mathbf{y})/kT}\,. \tag{5.3.11b}$$

This gives for the second virial coefficient (5.3.7):
$$B = -\frac{1}{2} \int d^3y\; f(\mathbf{y}) = -\frac{1}{2} \int d^3y \left( e^{-v(\mathbf{y})/kT} - 1 \right) \tag{5.3.12}$$
with $f(\mathbf{y}) = e^{-v(\mathbf{y})/kT} - 1$. To proceed, we now require the two-particle potential $v(\mathbf{y})$, also known as the pair potential. In Fig. 5.6, as an example, the Lennard–Jones potential is shown; it finds applications in theoretical models for the description of gases and liquids and it is defined in Eq. (5.3.16).


Fig. 5.6. The Lennard–Jones potential as an example of a pair potential v(y), Eq. (5.3.16)

5.3.2.1 A Qualitative Estimate of B(T)

A typical characteristic of realistic potentials is the strong increase for overlapping atomic shells and the attractive interaction at larger distances. A typical shape is shown in Fig. 5.7. Up to the so called ‘hard-core’ radius σ, the potential is infinite, and outside this radius it is weakly negative. Thus the shape of $f(r)$ as shown in Fig. 5.7 results. If we can now assume that in the region of the negative potential, $|v(\mathbf{x})|/kT \ll 1$, then we find for the function in (5.3.12)
$$f(\mathbf{x}) = \begin{cases} -1 & |\mathbf{x}| < \sigma \\[1ex] -\dfrac{v(\mathbf{x})}{kT} & |\mathbf{x}| \geq \sigma\,. \end{cases} \tag{5.3.13}$$
From this, we obtain the second virial coefficient:
$$B(T) \approx -\frac{1}{2}\left[ -\frac{4\pi}{3}\sigma^3 + 4\pi \int_\sigma^\infty dr\, r^2\, \frac{(-v(r))}{kT} \right] = b - \frac{a}{kT}\,, \tag{5.3.14}$$
where
$$b = \frac{2\pi}{3}\sigma^3 = 4\,\frac{4\pi}{3}\, r_0^3 \tag{5.3.15a}$$
denotes the fourfold molecular volume. For hard spheres of radius $r_0$, $\sigma = 2r_0$ and
$$a = -2\pi \int_\sigma^\infty dr\, r^2\, v(r) = -\frac{1}{2} \int d^3x\; v(\mathbf{x})\,\Theta(r - \sigma)\,. \tag{5.3.15b}$$

The result (5.3.14) for B(T ) is drawn in Fig. 5.8. In fact, B(T ) decreases again at higher temperatures, since the potential in Nature, unlike the artiﬁcial case of inﬁnitely hard spheres, is not inﬁnitely high (see Fig. 5.9). Remark: From the experimental determination of the temperature dependence of the virial coeﬃcients, we can gain information about the potential.
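The quality of the estimate (5.3.14) can be illustrated with a simple model potential: a hard core of radius σ with an attractive tail $v(r) = -\varepsilon(\sigma/r)^6$ for $r > \sigma$. This particular tail is an assumption made only for the example. For it, (5.3.15b) gives $a = \frac{2\pi}{3}\varepsilon\sigma^3$, so that $B/b \approx 1 - \varepsilon/kT$, while the exact classical value follows from (5.3.12) by numerical integration. Function names, the cutoff `r_max` and the number of integration steps are illustrative.

```python
import math

def b_exact(t_star, r_max=50.0, n=100_000):
    """B / b from (5.3.12) for a hard core plus tail -eps (sigma/r)^6.

    t_star = k T / eps; lengths are in units of sigma. The hard core
    contributes exactly 1; the tail is integrated with the midpoint rule.
    """
    h = (r_max - 1.0) / n
    s = 0.0
    for i in range(n):
        x = 1.0 + (i + 0.5) * h
        s += x * x * math.expm1(x ** -6 / t_star)   # r^2 * f(r) for r > sigma
    return 1.0 - 3.0 * s * h

def b_estimate(t_star):
    """High-temperature estimate (5.3.14), B / b = 1 - a / (b k T)."""
    return 1.0 - 1.0 / t_star

if __name__ == "__main__":
    for t_star in (1.0, 2.0, 5.0, 10.0):
        print(t_star, b_exact(t_star), b_estimate(t_star))
```

The two values approach each other as $kT/\varepsilon$ grows, as expected from the assumption $|v|/kT \ll 1$; at $kT \approx \varepsilon$ the linearized form visibly underestimates the attraction.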


Fig. 5.7. A typical pair potential v(r) (solid curve) and the associated f (r) (dashed).

Fig. 5.8. The second virial coeﬃcient from the approximate relation (5.3.14)

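Examples:

Lennard–Jones potential ((12-6)-potential):
$$v(r) = 4\varepsilon\left[ \left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6} \right]. \tag{5.3.16}$$

exp-6-potential:
$$v(r) = \varepsilon\left[ \exp\left(\frac{a - r}{\sigma_1}\right) - \left(\frac{\sigma_2}{r}\right)^{6} \right]. \tag{5.3.17}$$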

The exp-6-potential is a special case of the so called Buckingham potential, which also contains a term $\propto -r^{-8}$.

5.3.2.2 The Lennard–Jones Potential

We will now discuss the second virial coefficient in the case of a Lennard–Jones potential
$$v(r) = 4\varepsilon\left[ \left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6} \right].$$
It proves expedient to introduce the dimensionless variables $r^* = r/\sigma$ and $T^* = kT/\varepsilon$. Integrating (5.3.12) by parts yields
$$B(T) = \frac{2\pi}{3}\sigma^3\,\frac{4}{T^*} \int dr^*\, r^{*2}\left[ \frac{12}{r^{*12}} - \frac{6}{r^{*6}} \right] e^{-\frac{4}{T^*}\left[ \frac{1}{r^{*12}} - \frac{1}{r^{*6}} \right]}\,. \tag{5.3.18}$$
Expansion of the factor $\exp\left(\frac{4}{T^* r^{*6}}\right)$ in terms of $\frac{4}{T^* r^{*6}}$ leads to
$$B(T) = -\frac{2\pi}{3}\sigma^3 \sum_{j=0}^{\infty} \frac{2^{\,j-3/2}}{j!}\;\Gamma\!\left(\frac{2j-1}{4}\right) T^{*\,-(2j+1)/4} = \frac{2\pi}{3}\sigma^3\left[ \frac{1.73}{T^{*1/4}} - \frac{2.56}{T^{*3/4}} - \frac{0.87}{T^{*5/4}} - \ldots \right] \tag{5.3.19}$$
(see Hirschfelder et al.⁶ Eq. (3.63)); the series converges quickly at large $T^*$. In Fig. 5.9, the reduced second virial coefficient is shown as a function of $T^*$.

Fig. 5.9. The reduced second virial coefficient $B^* = 3B/2\pi L\sigma^3$ for the Lennard–Jones potential. L denotes the Loschmidt number (Avogadro’s number, L = 6.0221367 × 10²³ mol⁻¹); after Hirschfelder et al.⁶ and R. J. Lunbeck, Dissertation, Amsterdam 1950

Remarks:
(i) The agreement for the noble gases Ne, Ar, Kr, Xe after adjustment of σ and ε is good.
(ii) At $T^* > 100$, the decrease in B(T) is experimentally somewhat greater than predicted by the Lennard–Jones interaction (i.e. the repulsion is weaker).
(iii) An improved fit to the experimental values is obtained with the exp-6-potential (5.3.17).
(iv) The possibility of representing the second virial coefficients for classical gases in a unified form by introducing dimensionless quantities is an expression of the so called law of corresponding states (see Sect. 5.4.3).
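The series (5.3.19) for the reduced second virial coefficient $B^* = B/(\frac{2\pi}{3}\sigma^3)$ of the Lennard–Jones potential can be summed numerically and compared with a direct quadrature of (5.3.12). The following is a sketch under stated assumptions: the number of series terms, the integration cutoff `r_max`, the step count and the analytic tail correction $-4/(T^* r_{\max}^3)$ are choices made for this example, and the function names are illustrative.

```python
import math

def b_star_series(t_star, terms=60):
    """Reduced second virial coefficient from the series (5.3.19)."""
    total = 0.0
    for j in range(terms):
        coeff = 2.0 ** (j - 1.5) / math.factorial(j) * math.gamma((2 * j - 1) / 4.0)
        total -= coeff * t_star ** (-(2 * j + 1) / 4.0)
    return total

def b_star_quadrature(t_star, r_max=10.0, n=20_000):
    """B* = -3 * integral of r^2 (exp(-v/kT) - 1) dr, midpoint rule on (0, r_max).

    The region r > r_max is approximated by the leading attractive term,
    f ~ 4 / (T* r^6), which contributes -4 / (T* r_max^3).
    """
    h = r_max / n
    s = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        u = 4.0 * (r ** -12 - r ** -6) / t_star    # v(r) / kT in reduced units
        s += r * r * math.expm1(-u)
    return -3.0 * s * h - 4.0 / (t_star * r_max ** 3)

if __name__ == "__main__":
    for t_star in (1.0, 2.0, 3.418, 10.0, 100.0):
        print(t_star, b_star_series(t_star), b_star_quadrature(t_star))
```

The output shows the sign change of $B^*$ (the Boyle temperature of the model) and the slow decrease at large $T^*$ that is characteristic of a finite repulsive potential.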

5.3.3 Quantum Corrections to the Virial Coefficients

The quantum-mechanical expression for the second virial coefficient B(T) is given by (5.3.7), where the partition functions occurring there are to be computed quantum mechanically. The quantum corrections to B(T) and the other virial coefficients are of two kinds: There are corrections which result from statistics (Bose or Fermi statistics). In addition, there are corrections which arise from the non-commutativity of quantum mechanical observables. The corrections due to statistics are of the order of
$$B = \mp\frac{\lambda^3}{2^{5/2}} \propto \hbar^3 \quad \text{for} \quad \begin{cases} \text{bosons} \\ \text{fermions} \end{cases} \tag{5.3.20}$$
as one can see from Sect. 4.2 or Eq. (B.43). The interaction quantum corrections, according to Eq. (B.46), take the form
$$B_{\rm qm} = \int d^3y\; e^{-v(\mathbf{y})/kT}\,\frac{\hbar^2\left(\nabla v(\mathbf{y})\right)^2}{24\,m\,(kT)^3}\,, \tag{5.3.21}$$
and are thus of the order of $\hbar^2$. The lowest-order correction given in (5.3.21) results from the non-commutativity of $\mathbf{p}^2$ and $v(\mathbf{x})$. We show in Appendix B.33 that the second virial coefficient can be related to the time which the colliding particles spend within their mutual potential. The shorter this time, the more closely the gas obeys the classical equation of state for an ideal gas.

⁶ J. O. Hirschfelder, Ch. F. Curtiss and R. B. Bird, Molecular Theory of Gases and Liquids, John Wiley and Sons, Inc., New York 1954

5.4 The Van der Waals Equation of State

5.4.1 Derivation

We now turn to the derivation of the equation of state of a classical, real (i.e. interacting) gas. We assume that the interactions of the gas atoms (molecules) consist only of a two-particle potential, which can be decomposed into a hard-core (H.C.) part, $v_{\rm H.C.}(\mathbf{y})$ for $|\mathbf{y}| \leq \sigma$, and an attractive part, $w(\mathbf{y})$ (see Fig. 5.7):
$$v(\mathbf{y}) = v_{\rm H.C.}(\mathbf{y}) + w(\mathbf{y})\,. \tag{5.4.1}$$

The expression “hard core” means that the gas molecules repel each other at short distances like impenetrable hard spheres, which is in fact approximately the case in Nature. Our task is now to determine the partition function, for which after carrying out the integrations over momenta we obtain
$$Z(T,V,N) = \frac{1}{\lambda^{3N} N!} \int d^3x_1 \ldots d^3x_N\; e^{-\sum_{i<j} v(\mathbf{x}_i - \mathbf{x}_j)/kT}\,. \tag{5.4.2}$$
We still have to compute the configurational part. This can of course not be carried out exactly, but instead contains some intuitive approximations. Let us first ignore the attractive interaction and consider only the hard-core potential. This yields in the partition function for many particles:
$$\int d^3x_1 \ldots d^3x_N\; e^{-\sum_{i<j} v_{\rm H.C.}(\mathbf{x}_{ij})/kT} \approx (V - V_0)^N\,. \tag{5.4.3}$$

This result can be made plausible as follows: if the hard-core radius were zero, σ = 0, then the integration in (5.4.3) would give simply $V^N$; for a finite σ, each particle has only $V - V_0$ available, where $V_0$ is the volume occupied by the other $N - 1$ particles. This is not exact, since the size of the free volume $(V - V_0)$ depends on the configuration, as can be seen from Fig. 5.10. In (5.4.3), $V_0$ is to be understood as the occupied volume for typical configurations which have a large statistical weight. Then, one can imagine carrying out the integrations in (5.4.3) successively, obtaining a factor $V - V_0$ for each particle. Referring to Fig. 5.10, we can find the following bounds for $V_0$ with a particle number N: the smallest $V_0$ is obtained for spherical closest packing, $V_0^{\min} = 4\sqrt{2}\, r_0^3 N = 5.66\, r_0^3 N$. The largest $V_0$ is found when the spheres of radius $2r_0$ do not overlap, i.e. $V_0^{\max} = 8\,\frac{4\pi}{3}\, r_0^3 N = 33.51\, r_0^3 N$. The actual $V_0$ will lie between these extremes and can be determined as below from the comparison with the virial expansion, namely $V_0 = bN = 4\,\frac{4\pi}{3}\, r_0^3 N = 16.76\, r_0^3 N$.

Using (5.4.3), we can cast the partition function (5.4.2) in the form P (V − V0 )N d3 x1 . . . d3 xN e−H.C. e− i<j w(xi −xj )/kT . Z(T, V, N ) = λ3N N ! d3 x1 . . . d3 xN e−H.C. (5.4.4) Here, H.C. stands for the sum of all contributions from the hard-core potential divided / by kT . The second 0 fraction can be interpreted as the average of exp − i<j w(xi − xj )/kT in a gas which experiences only hard-core interactions. Before we treat this in more detail, we want to consider the second exponent more closely. For potentials whose range is much greater than σ and the distance between particles, it follows approximately that the potential acting on j due to the other particles,

Fig. 5.10. Two conﬁgurations of three atoms within the volume V . In the ﬁrst conﬁguration, V0 is larger than in the second. The center of gravity of an additional atom must be located outside the dashed circles. In the second conﬁguration (closer packing), there will be more space for an additional atom (spheres of radius r0 are represented by solid circles, spheres of radius σ = 2r0 by dashed circles)

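$$\sum_{i\neq j} w(\mathbf{x}_i-\mathbf{x}_j) \approx (N-1)\,\frac{1}{V}\int d^3x\; w(\mathbf{x}) \equiv (N-1)\,\bar w ,$$

i.e. the sum over all pairs

$$\sum_{i<j} w(\mathbf{x}_i-\mathbf{x}_j) \equiv \frac{1}{2}\sum_i \sum_{j\neq i} w(\mathbf{x}_i-\mathbf{x}_j) \approx \frac{1}{2} N(N-1)\,\bar w \approx \frac{1}{2} N^2\, \bar w \qquad (5.4.5a)$$

with

$$\bar w = \frac{1}{V}\int_V d^3x\; w(\mathbf{x}) \equiv -\frac{2a}{V} . \qquad (5.4.5b)$$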

Thus we find for the partition function

$$Z(T,V,N) = \frac{(V-V_0)^N}{\lambda^{3N} N!}\; e^{-\frac{N(N-1)}{2}\frac{\bar w}{kT}} = \frac{(V-V_0)^N}{\lambda^{3N} N!}\; e^{\frac{N^2 a}{V kT}} . \qquad (5.4.6)$$

In this calculation, the attractive part of the potential was replaced by its average value. Here, as in the molecular field theory for ferromagnetism which will be treated in the next chapter, we are using an “average-potential approximation”. Before we discuss the thermodynamic consequences of (5.4.6), we return once more to (5.4.4) and the note which followed it. The last factor can be written using a cumulant expansion, Eq. (1.2.16), in the form

$$\Big\langle e^{-\sum_{i<j} w(\mathbf{x}_i-\mathbf{x}_j)/kT} \Big\rangle_{\rm H.C.} = \exp\Bigg\{ -\Big\langle \sum_{i<j} w(\mathbf{x}_i-\mathbf{x}_j)/kT \Big\rangle_{\rm H.C.} + \frac{1}{2}\Bigg[ \Big\langle \Big(\sum_{i<j} w(\mathbf{x}_i-\mathbf{x}_j)/kT\Big)^2 \Big\rangle_{\rm H.C.} - \Big\langle \sum_{i<j} w(\mathbf{x}_i-\mathbf{x}_j)/kT \Big\rangle_{\rm H.C.}^2 \Bigg] + \dots \Bigg\} .$$

(5.4.7)

The average values $\langle\;\rangle_{\rm H.C.}$ are taken with respect to the canonical distribution function of the total hard-core potential. Therefore, $\langle \sum_{i<j} w(\mathbf{x}_i-\mathbf{x}_j)\rangle_{\rm H.C.}$ refers to the average of the attractive potential in the “free” volume allowed by the interaction of hard spheres. Under the assumption made earlier that the range is much greater than the hard-core radius $\sigma$ and the particle distance, we again find (5.4.5a,b) and (5.4.6). The second term in the cumulant series (5.4.7) represents the mean square deviation of the attractive interactions. The higher the temperature, the more dominant the term $\bar w/kT$ becomes.

From (5.4.6), using $N! \approx N^N e^{-N}\sqrt{2\pi N}$, we obtain the free energy,

$$F = -kTN \log\frac{e\,(V-V_0)}{\lambda^3 N} - \frac{N^2 a}{V} , \qquad (5.4.8)$$

the pressure (the thermal equation of state),

$$P = -\left(\frac{\partial F}{\partial V}\right)_{T,N} = \frac{kTN}{V-V_0} - \frac{N^2 a}{V^2} , \qquad (5.4.9)$$

and, with $E = -T^2\left(\frac{\partial}{\partial T}\frac{F}{T}\right)_{V,N}$, the internal energy (caloric equation of state),

$$E = \frac{3}{2} NkT - \frac{N^2 a}{V} . \qquad (5.4.10)$$

Finally, we can relate $V_0$ to the second virial coefficient. To do this, we expand (5.4.9) in terms of $1/V$ and identify the result with the virial expansion (5.3.6) and (5.3.14):

$$P = \frac{kTN}{V}\left[1 + \frac{V_0}{V} - \frac{aN}{kTV} + \dots\right] \equiv \frac{kTN}{V}\left[1 + \left(b - \frac{a}{kT}\right)\frac{N}{V} + \dots\right] .$$

From this, we obtain

$$V_0 = N b , \qquad (5.4.11)$$

where $b$ is the contribution to the second virial coefficient which results from the repulsive part of the potential. Inserting in (5.4.9), we find

$$P = \frac{kT}{v-b} - \frac{a}{v^2} , \qquad (5.4.12)$$

where on the right-hand side, the specific volume $v = V/N$ was introduced. Equation (5.4.12) or equivalently (5.4.9) is the van der Waals equation of state for real gases,7 and (5.4.10) is the associated caloric equation of state.

Remarks:

(i) The van der Waals equation (5.4.12) has, in comparison to the ideal gas equation $P = kT/v$, the following properties: the volume $v$ is replaced by $v - b$, the free volume. For $v = b$, the pressure would become infinite. This modification with respect to the ideal gas is caused by the repulsive part of the potential.

(ii) The attractive interaction causes a reduction in the pressure via the term $-a/v^2$. This reduction becomes relatively more important as the temperature is lowered.

(iii) We make another comparison of the van der Waals equation to the ideal gas equation by writing (5.4.12) in the form

$$\left(P + \frac{a}{v^2}\right)(v - b) = kT .$$

Compared to $Pv = kT$, the specific volume $v$ has been decreased by $b$, because the molecules are not pointlike, but instead occupy their own finite volumes. The mutual attraction of the molecules leads at a given pressure to a reduction of the volume; it thus acts like an additional pressure term. One can also readily understand the proportionality of this term to $1/v^2$. If one considers the surface layer of a liquid, it experiences a kind of attractive force from the deeper-lying layers, which must be proportional to the square of the density, since if the density were increased, the number of molecules in each layer would increase in proportion to the density, and the attractive force per unit area would thus increase proportionally to $1/v^2$.

7 Johannes Diderik van der Waals, 1837-1923: equation of state formulated 1873, Nobel prize 1910
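The identification of the second virial coefficient $b - a/kT$ with the expansion of the van der Waals equation (5.4.12) can be illustrated numerically. The sketch below is not taken from the text: the function names and the parameter values $a$, $b$, $kT$ are arbitrary illustrative assumptions in consistent units. It compares the exact compressibility factor $Pv/kT$ with its first-order virial approximation at low density.

```python
def vdw_pressure(v, kT, a, b):
    """Van der Waals pressure, Eq. (5.4.12), for specific volume v > b."""
    return kT / (v - b) - a / v**2


def compressibility_factor(v, kT, a, b):
    """Exact P v / kT from the van der Waals equation."""
    return vdw_pressure(v, kT, a, b) * v / kT


def virial_approximation(v, kT, a, b):
    """First two terms of the virial expansion, 1 + (b - a/kT) / v."""
    return 1.0 + (b - a / kT) / v


# Illustrative (assumed) parameters; v >> b so that higher virial terms are small.
a, b, kT, v = 1.0, 0.1, 2.0, 100.0
exact = compressibility_factor(v, kT, a, b)
approx = virial_approximation(v, kT, a, b)
print(exact, approx, exact - approx)
```

The difference between the two values is of order $(b/v)^2$, i.e. the size of the neglected third virial term.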


The combined action of the two terms in the van der Waals equation results in qualitatively diﬀerent shapes for the isotherms at low (T1 , T2 ) and at high (T3 , T4 ) temperatures. The family of van der Waals isotherms is shown in Fig. 5.11. For T > Tc , the isotherms are monotonic, while for T < Tc , they are S-shaped; the signiﬁcance of this will be discussed below.

Fig. 5.11. The van der Waals isotherms in dimensionless units P/Pc and v/vc

We see immediately that on the so-called critical isotherm, there is a critical point, at which the first and second derivatives vanish, i.e. a horizontal point of inflection. The critical point $T_c$, $P_c$, $V_c$ thus follows from $\frac{\partial P}{\partial V} = \frac{\partial^2 P}{\partial V^2} = 0$. This leads to the two conditions

$$-\frac{kT}{(v-b)^2} + \frac{2a}{v^3} = 0 , \qquad \frac{kT}{(v-b)^3} - \frac{3a}{v^4} = 0 ,$$

from which the values

$$v_c = 3b , \qquad kT_c = \frac{8}{27}\frac{a}{b} , \qquad P_c = \frac{a}{27 b^2} \qquad (5.4.13)$$

are obtained. The dimensionless ratio

$$\frac{kT_c}{P_c v_c} = \frac{8}{3} = 2.\dot{6} \qquad (5.4.14)$$

follows from this. The experimental value is found to be somewhat larger. Note: It is apparent even from the derivation that the van der Waals equation can have only approximate validity. This is true of both the reduction of the repulsion eﬀects to an eﬀective molecular volume b, and of the replacement of the attractive (negative) part of the potential by its average value. The latter approximation improves as the range of the interactions increases. In the derivation, correlation eﬀects were neglected, which is questionable especially in the neighborhood of the critical point, where strong density ﬂuctuations will occur (see below). However, the van der Waals equation, in part with empirically modiﬁed van der Waals constants a and b, is able to give a qualitative description of condensation and of the behavior in the neighborhood of the critical point. There are numerous variations on the van der Waals equation; e.g. Clausius suggested the equation

$$P = \frac{kT}{v-a} - \frac{c}{T\,(v+b)^2} .$$

Fig. 5.12. Isotherms for carbonic acid obtained from Clausius’ equation of state. From M. Planck, Thermodynamik, Veit & Comp, Leipzig, 1897, page 14

The plot of its isotherms shown in Fig. 5.12 is similar to that obtained from the van der Waals theory.
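As a cross-check of the critical values (5.4.13) and the ratio (5.4.14) of the van der Waals equation, one can evaluate the first and second volume derivatives of (5.4.12) at the critical point. This is a minimal sketch under assumed, arbitrary values of $a$ and $b$; the function names are illustrative only.

```python
def vdw_pressure(v, kT, a, b):
    """Van der Waals pressure, Eq. (5.4.12)."""
    return kT / (v - b) - a / v**2


def vdw_dp_dv(v, kT, a, b):
    """First derivative of P with respect to v at fixed T."""
    return -kT / (v - b) ** 2 + 2.0 * a / v**3


def vdw_d2p_dv2(v, kT, a, b):
    """Second derivative of P with respect to v at fixed T."""
    return 2.0 * kT / (v - b) ** 3 - 6.0 * a / v**4


def critical_point(a, b):
    """Critical values (v_c, kT_c, P_c) from Eq. (5.4.13)."""
    return 3.0 * b, 8.0 * a / (27.0 * b), a / (27.0 * b**2)


a, b = 2.0, 0.5                      # arbitrary illustrative constants
vc, kTc, Pc = critical_point(a, b)
print(vdw_dp_dv(vc, kTc, a, b), vdw_d2p_dv2(vc, kTc, a, b))  # both vanish up to rounding
print(kTc / (Pc * vc))                                        # ratio of Eq. (5.4.14)
```

Because the ratio $kT_c/(P_c v_c)$ does not depend on $a$ and $b$, any positive choice of the two constants gives the same value $8/3$.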

5.4.2 The Maxwell Construction

At temperatures below $T_c$, the van der Waals isotherms have a typical S-shape (Fig. 5.12). The regions in which $(\partial P/\partial V)_T > 0$, i.e. the free energy is not convex and therefore the stability criterion (3.6.48b) is not obeyed, are particularly disturbing. The equation of state definitely requires modification in these regions. We now wish to consider the free energy within the van der Waals theory. As we finally shall see, an inhomogeneous state containing liquid and gaseous phases has a lower free energy. In Fig. 5.13, a van der Waals isotherm and below it the associated free energy $f(T,v) = F(T,V)/N$ are plotted. Although the lower figure can be directly read off from Eq. (5.4.8), it is instructive and useful for further discussion to determine the typical shape of the specific free energy from the isotherms $P(T,v)$ by integration of $P = -\left(\frac{\partial f}{\partial v}\right)_T$ over volume:

$$f(T,v) = f(T,v_a) - \int_{v_a}^{v} dv'\, P(T,v') . \qquad (5.4.15)$$


Fig. 5.13. A van der Waals isotherm and the corresponding free energy in the dimensionless units P/Pc , v/vc and f /kTc . The free energy of the heterogeneous state (dashed) is lower than the van der Waals free energy (solid curve)

The integration is carried out from an arbitrary initial value $v_a$ of the specific volume up to $v$. We now draw in a horizontal line intersecting the van der Waals isotherm in such a way that the two shaded areas are equal. The pressure which corresponds to this line is denoted by $P_0$. This construction yields the two volume values $v_1$ and $v_2$. The values of the free energy at the volumes $v_{1,2}$ will be denoted by $f_{1,2} = f(T, v_{1,2})$. At the volumes $v_1$ and $v_2$, the pressure assumes the value $P_0$ and therefore the slope of $f(T,v)$ at these points has the value $-P_0$. As a reference for the graphical determination of the free energy, we draw a straight line through $(v_1, f_1)$ with its slope equal to $-P_0$ (shown as a dashed line). If the pressure had the value $P_0$ throughout the whole interval between $v_1$ and $v_2$, then the free energy would be $f_1 - P_0(v - v_1)$. We can now readily see that the free energy which is shown in Fig. 5.13 follows from $P(T,v)$, since the van der Waals isotherm to the right of $v_1$ initially falls below the horizontal line $P = P_0$. Thus the negative integral, i.e. the free energy which corresponds to the van der Waals isotherm, lies above the dashed line. Only when the volume $v_2$ has been reached is $f_2 \equiv f(T,v_2) = f_1 - P_0(v_2 - v_1)$, owing to the equal areas which were presupposed in drawing the horizontal line, and the two curves meet again. Due to $P_0 = -\left.\frac{\partial f}{\partial v}\right|_{v_1} = -\left.\frac{\partial f}{\partial v}\right|_{v_2}$, the (dashed) line with slope $-P_0$ is precisely the double tangent to the curve $f(T,v)$. Since $P > P_0$ for $v < v_1$ and $P < P_0$ for $v > v_2$, $f$ in these regions also lies above the double tangent. In Fig. 5.13 we can see that the free energy calculated in the van der Waals theory is not convex everywhere, $0 > \frac{\partial^2 f}{\partial v^2} = -\frac{\partial P}{\partial v} = \frac{1}{v\,\kappa_T}$; this violates the thermodynamic inequality (3.3.5).

For comparison, we next consider a two-phase, heterogeneous system, whose entire material content is divided into a fraction $c_1 = \frac{v_2 - v}{v_2 - v_1}$ in the state $(v_1, T)$ and a fraction $c_2 = \frac{v - v_1}{v_2 - v_1}$ in the state $(v_2, T)$. These states have the same pressure and temperature and can exist in mutual equilibrium. Since the free energy of this inhomogeneous state is given by the linear combination $c_1 f_1 + c_2 f_2$ of $f_1$ and $f_2$, it lies on the dashed line.8 Thus, the free energy of this inhomogeneous state is lower than that from the van der Waals theory. In the interval $[v_1, v_2]$ (two-phase region), the substance divides into two phases, the liquid phase with temperature and volume $(T, v_1)$, and the gas phase with $(T, v_2)$. The pressure in this interval is $P_0$. The real isotherm is obtained from the van der Waals isotherm by replacing the S-shaped portion by the horizontal line at $P = P_0$, which divides the area equally. Outside the interval $[v_1, v_2]$, the van der Waals isotherm is unchanged. This construction of the equation of state from the van der Waals theory is called the Maxwell construction. The values of $v_1$ and $v_2$ depend on the temperature of the isotherm considered, i.e. $v_1 = v_1(T)$ and $v_2 = v_2(T)$. As $T$ approaches $T_c$, the interval $[v_1(T), v_2(T)]$ becomes smaller and smaller; as the temperature decreases below $T_c$, the interval becomes larger. Correspondingly, the pressure $P_0(T)$ increases or decreases. In Fig. 5.14, the Maxwell construction for a family of van der Waals isotherms is shown. The points $(P_0(T), v_1(T))$ and

Fig. 5.14. Van der Waals isotherms, showing the Maxwell construction and the resulting coexistence curve (heavy curve) in the dimensionless units $P/P_c$ and $v/v_c$, as well as the free energy $f$

8 $c_1 + c_2 = 1$, $v_1 c_1 + v_2 c_2 = v$, $c_1 f_1 + c_2 f_2 = c_1 f_1 + c_2\,(f_1 - P_0(v_2 - v_1)) = f_1 - P_0(v - v_1)$.


(P0 (T ), v2 (T )) form the liquid branch and the gas branch of the coexistence curve (heavy curves in Fig. 5.14). The region within the coexistence curve is called the coexistence region or two-phase region. In this region the isotherms are horizontal, the state is heterogeneous, and it consists of both the liquid and gaseous phases from the two limiting points of the coexistence region. Remarks: (i) In Fig. 5.15, the P V T -surface which follows from the Maxwell construction is shown schematically. The van der Waals equation of state and the conclusions which can be drawn from it are in accord with the general considerations concerning the liquid-gas phase transition in the framework of thermodynamics which we gave in Sect. 3.8.1.

Fig. 5.15. The surface of the equation of state from the van der Waals theory with the Maxwell equal-area construction (schematic). Along with three isotherms at temperatures T1 < Tc < T2 , the coexistence curve (surface) and its projection on the T -V plane are shown

(ii) The chemical potentials $\mu = f + Pv$ of the two coexisting liquid and gaseous phases are equal.

(iii) Kac, Uhlenbeck and Hemmer9 calculated the partition function exactly for a one-dimensional model with an infinite-range potential

$$v(x) = \begin{cases} \infty & |x| < x_0 \\ -\kappa\, e^{-\kappa |x|} & |x| > x_0 \end{cases} \qquad \text{and} \quad \kappa \to 0 .$$

The result is an equation of state which is qualitatively the same as in the van der Waals theory. In the coexistence region, instead of the S-shaped curve, horizontal isotherms are found immediately.

(iv) A derivation of the van der Waals equation for long-range potentials akin to L. S. Ornstein’s, in which the volume is divided up into cells and the most probable occupation number in each cell is calculated, was given by van Kampen10. The homogeneous and heterogeneous stable states were found. Within the coexistence region, the heterogeneous states, which are described by the horizontal line in the Maxwell construction, are absolutely stable. The two homogeneous states, represented by the S-shaped van der Waals isotherms, are metastable, as long as $\frac{\partial P}{\partial v} < 0$, and describe the superheated liquid and the supercooled vapor.

9 M. Kac, G. E. Uhlenbeck and P. C. Hemmer, J. Math. Phys. 4, 216 (1963)
10 N. G. van Kampen, Phys. Rev. 135, A362 (1964)
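The Maxwell equal-area construction of this section can be carried out numerically. The following sketch works in reduced units, $P^* = 8T^*/(3V^* - 1) - 3/V^{*2}$, and uses plain bisection; the function names (`pressure`, `bisect`, `maxwell`) and the bracketing intervals are choices made here for illustration, not part of the text. For a given reduced temperature below 1 it returns the coexistence pressure and the liquid and gas volumes.

```python
import math


def pressure(v, t):
    """Reduced van der Waals pressure P*(V*, T*)."""
    return 8.0 * t / (3.0 * v - 1.0) - 3.0 / v**2


def bisect(f, lo, hi, steps=200):
    """Root of f in [lo, hi], assuming f(lo) >= 0 >= f(hi)."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def maxwell(t):
    """Return (P0*, v1*, v2*) of the equal-area construction for t < 1."""
    # Spinodal volumes: dP*/dV* = 0 is equivalent to (3v - 1)^2 = 4 t v^3.
    g = lambda v: (3.0 * v - 1.0) ** 2 - 4.0 * t * v**3
    v_min = bisect(lambda v: -g(v), 1.0 / 3.0, 1.0)   # local pressure minimum
    v_max = bisect(g, 1.0, 3.0 / t)                    # local pressure maximum

    def branches(p0):
        # Liquid root left of the minimum, gas root right of the maximum.
        v1 = bisect(lambda v: pressure(v, t) - p0, 1.0 / 3.0 + 1e-12, v_min)
        v_far = (8.0 * t / p0 + 1.0) / 3.0 + 1.0       # here P* is surely below p0
        v2 = bisect(lambda v: pressure(v, t) - p0, v_max, v_far)
        return v1, v2

    def area(p0):
        # Integral of (P* - p0) from v1 to v2, using the analytic antiderivative.
        v1, v2 = branches(p0)
        integral = (8.0 * t / 3.0) * math.log((3.0 * v2 - 1.0) / (3.0 * v1 - 1.0))
        integral += 3.0 / v2 - 3.0 / v1
        return integral - p0 * (v2 - v1)

    p_lo = max(pressure(v_min, t), 1e-9)   # keep the trial pressure positive
    p_hi = pressure(v_max, t)
    p0 = bisect(area, p_lo, p_hi)          # area decreases monotonically with p0
    return (p0, *branches(p0))


p0, v1, v2 = maxwell(0.9)
print(p0, v1, v2)
```

The bisection on the pressure relies on the fact that the signed area between the isotherm and the horizontal line decreases monotonically as the line is raised, so the equal-area pressure is unique.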


5.4.3 The Law of Corresponding States

If one divides the van der Waals equation by $P_c = \frac{a}{27b^2}$ and uses the reduced variables $P^* = \frac{P}{P_c}$, $V^* = \frac{v}{v_c}$, $T^* = \frac{T}{T_c}$, then a dimensionless form of the equation is obtained:

$$P^* = \frac{8T^*}{3V^* - 1} - \frac{3}{V^{*2}} . \qquad (5.4.16)$$

In these units, the equation of state is the same for all substances. Substances with the same $P^*$, $V^*$ and thus $T^*$ are in corresponding states. Eq. (5.4.16) is called the “law of corresponding states”; it can also be cast in the form

$$\frac{P^* V^*}{T^*} = \frac{8}{3 - \frac{P^*}{T^*}\cdot\frac{T^*}{P^* V^*}} - \frac{3P^*}{T^{*2}}\,\frac{T^*}{P^* V^*} .$$

This means that P ∗ V ∗ /T ∗ as a function of P ∗ yields a family of curves with the parameter T ∗ . All the data from a variety of liquids at ﬁxed T ∗ lie on a single curve (Fig. 5.16). This holds even beyond the validity range of the van der Waals equation. Experiments show that liquids behave similarly when P, V and T are measured in units of Pc , Vc and Tc . This is illustrated for a series of diﬀerent substances in Fig. 5.16.

Fig. 5.16. The law of corresponding states.11

11 G. J. Su, Ind. Engng. Chem. analyt. Edn. 38, 803 (1946)

5.4.4 The Vicinity of the Critical Point

We now want to discuss the van der Waals equation in the vicinity of its critical point. To do this, we write the results in a form which makes the analogy to other phase transitions transparent. The usefulness of this form will become completely clear in connection with the treatment of ferromagnets in the next chapter. The equation of state in the neighborhood of the critical point can be obtained by introducing the variables

$$\Delta P = P - P_c , \qquad \Delta v = v - v_c , \qquad \Delta T = T - T_c \qquad (5.4.17)$$

and expanding the van der Waals equation (5.4.12) in terms of $\Delta v$ and $\Delta T$:

$$P = \frac{k(T_c + \Delta T)}{2b + \Delta v} - \frac{a}{(3b + \Delta v)^2}$$
$$\;\; = \frac{k(T_c + \Delta T)}{2b}\left(1 - \frac{\Delta v}{2b} + \Big(\frac{\Delta v}{2b}\Big)^2 - \Big(\frac{\Delta v}{2b}\Big)^3 + \Big(\frac{\Delta v}{2b}\Big)^4 \mp \dots\right) - \frac{a}{9b^2}\left(1 - 2\,\frac{\Delta v}{3b} + 3\Big(\frac{\Delta v}{3b}\Big)^2 - 4\Big(\frac{\Delta v}{3b}\Big)^3 + 5\Big(\frac{\Delta v}{3b}\Big)^4 \mp \dots\right) .$$

From this expansion, we find the equation of state in the immediate neighborhood of its critical point12

$$\Delta P^* = 4\,\Delta T^* - 6\,\Delta T^* \Delta v^* - \frac{3}{2}(\Delta v^*)^3 + \dots ; \qquad (5.4.18)$$

it is in this approximation antisymmetric with respect to $\Delta v^*$, see Fig. 5.17.

Fig. 5.17. The coexistence curve in the vicinity of the critical point. Due to the term $4\,\Delta T^*$ in the equation of state (5.4.18), the coexistence region is inclined with respect to the V-T plane. The isotherm shown is already so far from the critical point that it is no longer strictly antisymmetric

12 The term $\Delta T\,(\Delta v)^2$ and especially higher-order terms can be neglected in the leading calculation of the coexistence curve, since it is effectively of order $(\Delta T)^2$ in comparison to $\sim (\Delta T)^{3/2}$ for the terms which were taken into account. The corrections to the leading critical behavior will be summarized at the end of this section. In Eq. (5.4.18), for clarity we use the reduced variables defined just before Eq. (5.4.16): $\Delta P^* = \Delta P/P_c$ etc.
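The coefficients of the expansion around the critical point can be generated exactly with rational arithmetic. The sketch below assumes the reduced equation (5.4.16) with $V^* = 1 + \Delta v^*$ and $T^* = 1 + \Delta T^*$, so that $8T^*/(3V^*-1) = 4T^*/(1 + \tfrac{3}{2}\Delta v^*)$ is a geometric series and $3/V^{*2}$ is a binomial series; the function name `expansion_coefficients` is an illustrative choice.

```python
from fractions import Fraction


def expansion_coefficients(order):
    """Taylor coefficients of the reduced van der Waals pressure at the critical point.

    Returns lists (c, d) such that
        P* = sum_n (c[n] + d[n] * dT) * dv**n,
    where dT = T* - 1 and dv = V* - 1.
    """
    c, d = [], []
    for n in range(order + 1):
        geometric = 4 * Fraction(-3, 2) ** n      # from 4 T* / (1 + 3 dv / 2)
        attractive = 3 * (n + 1) * (-1) ** n      # from 3 / (1 + dv)^2
        c.append(geometric - attractive)          # part independent of dT
        d.append(geometric)                       # part multiplying dT
    return c, d


c, d = expansion_coefficients(5)
for n in range(6):
    print(n, c[n], d[n])
```

The zeroth-order entries reproduce $P^* = 1 + 4\,\Delta T^*$ at the critical volume, the linear and quadratic terms survive only in combination with $\Delta T^*$, and the cubic term carries the coefficient $-3/2$ of Eq. (5.4.18). Because both series alternate in sign, the odd-order coefficients come out negative.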


The Vapor-Pressure Curve: We obtain the vapor-pressure curve by projecting the coexistence region onto the P-T plane. Owing to the antisymmetry of the van der Waals isotherms with respect to $\Delta v^*$ in the neighborhood of $T_c$ (Eq. 5.4.18), we can easily determine the location of the two-phase region by setting $\Delta v^* = 0$ (cf. Fig. 5.17),

$$\Delta P^* = 4\,\Delta T^* . \qquad (5.4.19)$$

The Coexistence Curve: The coexistence curve is the projection of the coexistence region onto the V-T plane. Inserting (5.4.19) into (5.4.18), we obtain the equation $0 = 6\,\Delta T^* \Delta v^* + \frac{3}{2}(\Delta v^*)^3$ with the solutions

$$\Delta v_G^* = -\Delta v_L^* = 2\sqrt{-\Delta T^*} + O(\Delta T^*) \qquad (5.4.20)$$

for $T < T_c$. For $T < T_c$, the substance can no longer occur with a single density, but instead splits up into a less dense gaseous phase and a denser liquid phase (cf. Sect. 3.8). $\Delta v_G^*$ and $\Delta v_L^*$ represent the two values of the order parameter for this phase transition (see Chap. 7).

The Specific Heat: $T > T_c$: From Eq. (5.4.10), the internal energy is found to be $E = \frac{3}{2}NkT - \frac{aN^2}{V}$. Therefore, the specific heat at constant volume outside the coexistence region is

$$C_V = \frac{3}{2} Nk , \qquad (5.4.21a)$$

as for an ideal gas. We now imagine that we can cool a system with precisely the critical density. Above $T_c$ it has the homogeneous density $1/v_c$, while below $T_c$, it divides into the two fractions (as in (5.4.20)) $c_G = \frac{v_c - v_L}{v_G - v_L}$ and $c_L = \frac{v_G - v_c}{v_G - v_L}$ with a gaseous phase and a liquid phase. $T < T_c$: below $T_c$, the internal energy is given by

$$\frac{E}{N} = \frac{3}{2}kT - a\left(\frac{c_G}{v_G} + \frac{c_L}{v_L}\right) = \frac{3}{2}kT - a\,\frac{v_c + \Delta v_G + \Delta v_L}{(v_c + \Delta v_G)(v_c + \Delta v_L)} . \qquad (5.4.21b)$$

If we insert (5.4.20), or, anticipating later results, (5.4.29),13 we obtain

$$E = N\left[\frac{3}{2}kT - \frac{a}{v_c} + \frac{9}{2}k(T - T_c) + \frac{56}{25}\,\frac{a}{v_c}\left(\frac{T - T_c}{T_c}\right)^2 + O\big((\Delta T)^{5/2}\big)\right] .$$

13 With (5.4.20), one finds only the jump in the specific heat; in order to determine the linear term in (5.4.21b) as well, one must continue the expansion of $\Delta v_G$ and $\Delta v_L$, Eq. (5.4.27). Including these higher terms, the coexistence curve is not symmetric.


Fig. 5.18. The speciﬁc heat in the neighborhood of the critical point of the van der Waals liquid

The specific heat

$$C_V = \frac{3}{2}Nk + \frac{9}{2}Nk\left[1 + \frac{28}{25}\,\frac{T - T_c}{T_c} + \dots\right] \quad \text{for } T < T_c \qquad (5.4.21c)$$

exhibits a discontinuity (see Fig. 5.18).

The Critical Isotherm: In order to determine the critical isotherm, we set $\Delta T^* = 0$ in (5.4.18). The critical isotherm

$$\Delta P^* = -\frac{3}{2}(\Delta v^*)^3 \qquad (5.4.22)$$

is a parabola of third order; it passes through the critical point horizontally, which implies divergence of the isothermal compressibility.

The Compressibility: To calculate the isothermal compressibility $\kappa_T = -\frac{1}{V}\left(\frac{\partial V}{\partial P}\right)_T$, we determine

$$N\left(\frac{\partial P^*}{\partial V^*}\right)_T = -6\,\Delta T^* - \frac{9}{2}(\Delta v^*)^2 \qquad (5.4.23)$$

from the van der Waals equation (5.4.18). For $T > T_c$, we find along the critical isochore ($\Delta v^* = 0$)

$$\kappa_T = \frac{1}{6P_c}\,\frac{1}{\Delta T^*} = \frac{T_c}{6P_c}\,\frac{1}{\Delta T} . \qquad (5.4.24a)$$

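For $T < T_c$, along the coexistence curve (i.e. $\Delta v^* = \Delta v_G^* = -\Delta v_L^*$), using Eq. (5.4.20), we obtain the result $N\left(\frac{\partial P^*}{\partial V^*}\right)_T = -6\,\Delta T^* - \frac{9}{2}(\Delta v_G^*)^2 = 12\,\Delta T^*$, that is

$$\kappa_T = \frac{1}{12P_c}\,\frac{T_c}{(-\Delta T)} . \qquad (5.4.24b)$$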
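The two compressibility amplitudes in (5.4.24a) and (5.4.24b) differ by a factor of two. This can be made explicit with a small sketch based on the leading-order slope (5.4.23) and the leading-order coexistence volume $\Delta v_G^* = 2\sqrt{-\Delta T^*}$ from (5.4.20). The function names are illustrative, and the result holds only to leading order in $\Delta T^*$.

```python
import math


def reduced_slope(dt, dv):
    """Leading-order slope of the reduced isotherm, Eq. (5.4.23)."""
    return -6.0 * dt - 4.5 * dv**2


def kappa_times_pc(dt):
    """Leading-order kappa_T * P_c as a function of dt = (T - Tc)/Tc, dt != 0.

    Above Tc the critical isochore (dv = 0) is used; below Tc the gas branch
    of the coexistence curve, dv = 2 sqrt(-dt), is used.
    """
    dv = 0.0 if dt > 0.0 else 2.0 * math.sqrt(-dt)
    return -1.0 / reduced_slope(dt, dv)


above = kappa_times_pc(+0.01)   # equals 1 / (6 * 0.01)
below = kappa_times_pc(-0.01)   # equals 1 / (12 * 0.01)
print(above, below, above / below)
```

Both amplitudes diverge as $|T - T_c|^{-1}$; only the prefactor differs on the two sides of the transition.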


The isothermal compressibility diverges in the van der Waals theory above and below the critical temperature as $(T - T_c)^{-1}$. The accompanying long-range density fluctuations lead to an increase in light scattering in the forward direction (critical opalescence; see (9.4.51)).

Summary: Comparison with experiments shows that liquids in the neighborhood of their critical points exhibit singular behavior, similar to the results described above. The coexistence line obeys a power law; however the exponent is not 1/2, but instead $\beta \approx 0.326$; the specific heat is in fact divergent, and is characterized by a critical exponent $\alpha$. The critical isotherm obeys $\Delta P \sim (\Delta v)^{\delta}$ and the isothermal compressibility is $\kappa_T \sim |T - T_c|^{-\gamma}$. Table 5.2 contains a summary of the results of the van der Waals theory and the power laws which are in general observed in Nature. The exponents $\beta$, $\alpha$, $\delta$, and $\gamma$ are called critical exponents. The specific heat shows a discontinuity according to the van der Waals theory, as shown in Fig. 5.18. It is thus of the order of $(T - T_c)^0$ just to the left and to the right of the transition. The index d of the exponent 0 in Table 5.2 refers to this discontinuity. Compare Eq. (7.1.1).

Table 5.2. Critical Behavior according to the van der Waals Theory

Physical quantity     van der Waals           Critical behavior     Temperature range
Δv_G = −Δv_L          ∼ (Tc − T)^(1/2)        (Tc − T)^β            T < Tc
c_V                   ∼ (T − Tc)^(0_d)        |Tc − T|^(−α)         T ≷ Tc
ΔP                    ∼ (Δv)^3                (Δv)^δ                T = Tc
κ_T                   ∼ |T − Tc|^(−1)         |T − Tc|^(−γ)         T ≷ Tc

The Latent Heat: Finally, we will determine the latent heat just below the critical temperature. The latent heat can be written using the Clausius–Clapeyron equation (3.8.8) in the form:

$$q = T(s_G - s_L) = T\,\frac{\partial P_0}{\partial T}\,(v_G - v_L) = T\,\frac{\partial P_0}{\partial T}\,(\Delta v_G - \Delta v_L) .$$

Here, $s_G$ and $s_L$ refer to the entropies per particle of the gas and liquid phases and $\frac{\partial P_0}{\partial T}$ is the slope of the vaporization curve at the corresponding point. In the vicinity of the critical point, to leading order we can set $T\,\frac{\partial P_0}{\partial T} \approx T_c \left(\frac{\partial P_0}{\partial T}\right)_{\rm c.p.}$, where $(\partial P_0/\partial T)_{\rm c.p.}$ is the slope of the evaporation curve at the critical point.

$$q = 2T_c \left(\frac{\partial P_0}{\partial T}\right)_{\rm c.p.} \Delta v_G . \qquad (5.4.25)$$


The slope of the vapor-pressure curve at $T_c$ is finite (cf. Fig. 5.17 and Eq. (5.4.19)). Thus the latent heat decreases on approaching $T_c$ according to the same power law as the order parameter, i.e. $q \propto (T_c - T)^{\beta}$; in the van der Waals theory, $\beta = \frac{1}{2}$.

By means of the thermodynamic relation (3.2.24), $C_P - C_V = -T\left(\frac{\partial P}{\partial T}\right)_V^2 \frac{1}{\left(\frac{\partial P}{\partial V}\right)_T}$, we can also determine the critical behavior of the specific heat at constant pressure. Since $\left(\frac{\partial P}{\partial T}\right)_V$ is finite, the right-hand side behaves like the isothermal compressibility $\kappa_T$, and because $C_V$ is only discontinuous or at most weakly singular, it follows in general that

$$C_P \sim \kappa_T \propto (T - T_c)^{-\gamma} ; \qquad (5.4.26)$$

for a van der Waals liquid, $\gamma = 1$.

∗ Higher-Order Corrections to Eq. (5.4.18)

For clarity, we use the reduced quantities defined in (5.4.16). Then the van der Waals equation becomes

$$\Delta P^* = 4\Delta T^* - 6\Delta T^* \Delta v^* + 9\Delta T^* (\Delta v^*)^2 - \Big(\frac{3}{2} + \frac{27}{2}\Delta T^*\Big)(\Delta v^*)^3 + \Big(\frac{21}{4} + \frac{81}{4}\Delta T^*\Big)(\Delta v^*)^4 - \Big(\frac{99}{8} + \frac{243}{8}\Delta T^*\Big)(\Delta v^*)^5 + O\big((\Delta v^*)^6\big) . \qquad (5.4.27)$$

The coexistence curve $\Delta v^*_{G/L}$ and the vapor-pressure curve, which we denote here by $\Delta P_0^*(\Delta T^*)$, are found from the van der Waals equation:

$$\Delta P^*(\Delta T^*, \Delta v_G^*) = \Delta P^*(\Delta T^*, \Delta v_L^*) = \Delta P_0^*(\Delta T^*)$$

with the Maxwell construction

$$\int_{\Delta v_L^*}^{\Delta v_G^*} d(\Delta v^*)\,\big(\Delta P^* - \Delta P_0^*(\Delta T^*)\big) = 0 .$$

For the vapor-pressure curve in the van der Waals theory, we obtain

$$\Delta P_0^* = 4\Delta T^* + \frac{24}{5}(-\Delta T^*)^2 + O\big((-\Delta T^*)^{5/2}\big) , \qquad (5.4.28)$$

and for the coexistence curve:

$$\Delta v_G^* = 2\sqrt{-\Delta T^*} + \frac{18}{5}(-\Delta T^*) + X\,(-\Delta T^*)^{3/2} + O\big((\Delta T^*)^2\big)$$
$$\Delta v_L^* = -2\sqrt{-\Delta T^*} + \frac{18}{5}(-\Delta T^*) + Y\,(-\Delta T^*)^{3/2} + O\big((\Delta T^*)^2\big) \qquad (5.4.29)$$

(with $X - Y = \frac{294}{25}$, see problem 5.6). In contrast to the ferromagnetic phase transition, the order parameter is not exactly symmetric; instead, it is symmetric only near $T_c$, compare Eq. (5.4.20).

The internal energy is:

$$\frac{E}{N} = \frac{3}{2}kT - \frac{a}{v_c}\left(1 - 4\Delta T^* - \frac{56}{25}(\Delta T^*)^2 + O\big(|\Delta T^*|^{5/2}\big)\right) \qquad (5.4.30)$$

and the heat capacity is:

$$C_V = \frac{3}{2}Nk + \frac{9}{2}Nk\left(1 - \frac{28}{25}|\Delta T^*| + O\big(|\Delta T^*|^{3/2}\big)\right) . \qquad (5.4.31)$$

For the calculation of the speciﬁc heat, only the diﬀerence X − Y = 294/25 enters. The vapor-pressure curve is no longer linear in ∆T ∗ , and the coexistence curve is no longer symmetric with respect to the critical volume.

5.5 Dilute Solutions

5.5.1 The Partition Function and the Chemical Potentials

We consider a solution where the solvent consists of $N$ particles and the solute of $N'$ atoms (molecules), so that the concentration is given by

$$c = \frac{N'}{N} \ll 1 .$$

We shall calculate the properties of such a solution by employing the grand partition function14

$$Z_G(T,V,\mu,\mu') = \sum_{n'=0}^{\infty} Z_{n'}(T,V,\mu)\, z'^{\,n'} = Z_0(T,V,\mu) + z' Z_1(T,V,\mu) + O\big(z'^2\big) . \qquad (5.5.1)$$

14 Here, $Z_{n'}(T,V,\mu) = \sum_{n=0}^{\infty} \mathrm{Tr}_n \mathrm{Tr}_{n'}\, e^{-\beta(H_n + H'_{n'} + W_{n'n} - \mu n)}$, where $\mathrm{Tr}_n$ and $\mathrm{Tr}_{n'}$ refer to the traces over n- and n′-particle states of the solvent and the solute, respectively. The Hamiltonians of these subsystems and their interactions are denoted by $H_n$, $H'_{n'}$ and $W_{n'n}$.

It depends upon the chemical potentials of the solvent, $\mu$, and of the solute, $\mu'$. Since the solute is present only at a very low concentration, we have $\mu' \ll 0$ and therefore the fugacity $z' = e^{\mu'/kT} \ll 1$. In (5.5.1), $Z_0(T,V,\mu)$ means the grand partition function of the pure solvent and $Z_1(T,V,\mu)$ that of the solvent and a dissolved molecule. From these expressions we find for the total pressure

$$-P = \frac{\Phi}{V} = -\frac{kT}{V}\log Z_G = \varphi_0(T,\mu) + z'\varphi_1(T,\mu) + O\big(z'^2\big) , \qquad (5.5.2)$$

where $\varphi_0 = -\frac{kT}{V}\log Z_0$ and $\varphi_1 = -\frac{kT}{V}\frac{Z_1}{Z_0}$. In (5.5.2), $\varphi_0(T,\mu)$ is the contribution of the pure solvent and the second term is the correction due to


the dissolved solute. Here, $Z_1$ and therefore $\varphi_1$ depend on the interactions of the dissolved molecules with the solvent, but not however on the mutual interactions of the dissolved molecules. We shall now express the chemical potential $\mu$ in terms of the pressure. To this end, we use the inverse function $\varphi_0^{-1}$ at fixed $T$, i.e. $\varphi_0^{-1}(T, \varphi_0(T,\mu)) = \mu$, obtaining

$$\mu = \varphi_0^{-1}\big(T, -P - z'\varphi_1(T,\mu)\big) = \varphi_0^{-1}(T,-P) - z'\,\frac{\varphi_1\big(T, \varphi_0^{-1}(T,-P)\big)}{\left.\frac{\partial \varphi_0}{\partial \mu}\right|_{\mu = \varphi_0^{-1}(T,-P)}} + O\big(z'^2\big) . \qquad (5.5.3)$$

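The (mean) particle numbers are

$$N = -\frac{\partial \Phi}{\partial \mu} = -V\,\frac{\partial \varphi_0(T,\mu)}{\partial \mu} + O(z') \qquad (5.5.4a)$$
$$N' = -\frac{\partial \Phi}{\partial \mu'} = -\frac{z' V}{kT}\,\varphi_1(T,\mu) + O\big(z'^2\big) . \qquad (5.5.4b)$$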

Inserting this into (5.5.3), we finally obtain

$$\mu(T,P,c) = \mu_0(T,P) - kTc + O(c^2) , \qquad (5.5.5)$$

where $\mu_0(T,P) \equiv \varphi_0^{-1}(T,-P)$ is the chemical potential of the pure solvent as a function of $T$ and $P$. From (5.5.4b) and (5.5.4a), we find for the chemical potential of the solute:

$$\mu' = kT\log z' = kT\log\frac{-N'kT}{V\varphi_1(T,\mu)} + O\big(z'^2\big) = kT\log\left(\frac{N'kT}{N}\,\frac{\frac{\partial \varphi_0(T,\mu)}{\partial \mu}}{\varphi_1(T,\mu)}\right) + O(z') ; \qquad (5.5.6)$$

and finally, using (5.5.5),

$$\mu'(T,P,c) = kT\log c + g(T,P) + O(c) . \qquad (5.5.7)$$

In the function $g(T,P) = kT\log\big(kT/v_0(T,P)\,\varphi_1(T,\mu_0(T,P))\big)$, which depends only on the thermodynamic variables $T$ and $P$, the interactions of the dissolved molecules with the solvent also enter. The simple dependences of the chemical potentials on the concentration are valid so long as one chooses $T$ and $P$ as independent variables. From (5.5.5), we can calculate the pressure as a function of $T$ and $\mu$. To do this, we use $P_0(T,\mu)$, the inverse function of $\mu_0(T,P)$, and rewrite (5.5.5) as follows: $\mu = \mu_0\big(T, P_0(T,\mu) + (P - P_0(T,\mu))\big) - kTc$; we then expand in terms of $P - P_0(T,\mu)$ and use the fact that $\mu_0(T, P_0(T,\mu)) = \mu$ holds for the pure solvent:

$$\mu = \mu + \left(\frac{\partial \mu_0}{\partial P}\right)_T \big(P - P_0(T,\mu)\big) - kTc .$$

From the Gibbs-Duhem relation, we know that $\left(\frac{\partial \mu_0}{\partial P}\right)_T = v_0(P,T) = v + O(c^2)$, from which it follows that

$$P = P_0(T,\mu) + \frac{c}{v}\,kT + O(c^2) , \qquad (5.5.8)$$

where $v$ is the specific volume of the solvent. The interactions of the dissolved atoms with the solvent do not enter into $P(T,\mu,c)$ and $\mu(T,P,c)$ to the order we are considering, although we have not made any constraining assumptions about the nature of the interactions.

∗ An Alternate Derivation of (5.5.6) and (5.5.7) in the Canonical Ensemble

We again consider a system with two types of particles which are present in the amounts (particle numbers) $N$ and $N'$, where the concentration of the latter type, $c = \frac{N'}{N} \ll 1$, is very small. The mutual interactions of the dissolved atoms can be neglected in dilute solutions. The interaction of the solvent with the solute is denoted by $W_{N'N}$. Furthermore, the solute is treated classically. We initially make no assumptions regarding the solvent; in particular, it can be in any phase (solid, liquid, gaseous). The partition function of the overall system then takes the form

$$Z = \mathrm{Tr}\, e^{-H_N/kT} \int \frac{d\Gamma_{N'}}{N'!\, h^{3N'}}\; e^{-(H'_{N'} + W_{N'N})/kT} = \big(\mathrm{Tr}\, e^{-H_N/kT}\big)\,\frac{1}{N'!\,\lambda'^{3N'}} \left\langle \int d^3x'_1 \dots d^3x'_{N'}\; e^{-(V_{N'} + W_{N'N})/kT} \right\rangle , \qquad (5.5.9a)$$

where $\lambda'$ is the thermal wavelength of the dissolved substance. $H_N$ and $H'_{N'}$ are the Hamiltonians for the solvent and the solute molecules, $V_{N'}$ denotes the interactions of the solute molecules, and $W_{N'N}$ is the interaction of the solvent with the solute. A configurational contribution also enters into (5.5.9a):

$$Z_{\rm conf} = \left\langle \int d^3x'_1 \dots d^3x'_{N'}\; e^{-(V_{N'} + W_{N'N})/kT} \right\rangle \equiv \frac{\int d^3x'_1 \dots d^3x'_{N'}\; \mathrm{Tr}\, e^{-H_N/kT}\, e^{-(V_{N'} + W_{N'N})/kT}}{\mathrm{Tr}\, e^{-H_N/kT}} . \qquad (5.5.9b)$$

The trace runs over all the degrees of freedom of the solvent. When the latter must be treated quantum-mechanically, $W_{N'N}$ also contains an additional contribution due to the nonvanishing commutator of $H_N$ and the interactions. $V_{N'}$ depends on the $\{x'\}$ and $W_{N'N}$ on the $\{x'\}$ and $\{x\}$ (coordinates of the solute molecules and the solvent). We assume that the interactions are short-ranged; then $V_{N'}$ can be neglected for all the typical configurations of the dissolved solute molecules:

$$\Big\langle e^{-(V_{N'} + W_{N'N})/kT} \Big\rangle \approx \Big\langle e^{-W_{N'N}/kT} \Big\rangle = \exp\Big\{ -\langle W_{N'N}\rangle/kT + \tfrac{1}{2}\big\langle (W_{N'N} - \langle W_{N'N}\rangle)^2 \big\rangle/(kT)^2 \pm \dots \Big\}$$
$$\;\; = \exp\Big\{ -\sum_{n'=1}^{N'} \Big( \frac{\langle W_{n'N}\rangle}{kT} - \frac{1}{2(kT)^2}\big\langle (W_{n'N} - \langle W_{n'N}\rangle)^2 \big\rangle \pm \dots \Big) \Big\} = e^{-N'\psi(T,V/N)} . \qquad (5.5.9c)$$


Here, $W_{n'N}$ denotes the interaction of molecule $n'$ with the $N$ molecules of the solvent. In Eq. (5.5.9c), a cumulant expansion was carried out and we have taken into account that the overlap of the interactions of different molecules vanishes for all of the typical configurations. Owing to translational invariance, the expectation values $\langle W_{n'N}\rangle$ etc. are furthermore independent of $x'$ and are the same for all $n'$. We thus find for each of the dissolved molecules a factor $e^{-\psi(T,V/N)}$, where $\psi$ depends on the temperature and the specific volume of the solvent. It follows from (5.5.9c) that the partition function (5.5.9a) is

$$Z = \big(\mathrm{Tr}\, e^{-H_N/kT}\big)\,\frac{1}{N'!}\left(\frac{V}{\lambda'^3}\,\psi(T,V/N)\right)^{N'} . \qquad (5.5.10)$$

This result has the following physical meaning: the dissolved molecules behave like an ideal gas. They are subject at every point to the same potential from the surrounding solvent atoms, i.e. they are moving in a position-independent effective potential $kT\psi(T,V/N)$, whose value depends on the interactions, the temperature, and the density. The free energy therefore assumes the form

$$F(T,V,N,N') = F_0(T,V,N) - kTN'\log\frac{eV}{N'\lambda'^3} - N'\gamma(T,V/N) , \qquad (5.5.11)$$

where F0 (T, V ) = −kT log Tr e−HN /kT is the free energy of the pure solvent and γ(T, V /N ) = kT log ψ(T, V /N ) is due to the interactions of the dissolved atoms with the solvent. From (5.5.11), we ﬁnd for the pressure « « „ „ ∂ kT N ∂F + N γ = P0 (T, V /N ) + P =− ∂V T,N,N V ∂V T,N (5.5.12) „ « kT c ∂ = P0 (T, v) + , +c γ(T, v) v ∂v T

V were employed. where c = NN and v = N We could calculate the chemical potentials from (5.5.11) as functions of T and v. In practice, however, one is usually dealing with physical conditions which ﬁx the pressure instead of the speciﬁc volume. In order to obtain the chemical potentials as functions of the pressure, it is expedient to use the free enthalpy (Gibbs free energy). It is found from (5.5.11) and (5.5.12) to be « „ „ « eV ∂γ , (5.5.13) G = F +P V = G0 (T, P, N )−kT N log 3 − 1 −N γ − V ∂V N λ

where $P_0(T,v)$ and $G_0(T,P,N)$ are the corresponding quantities for the pure solvent. From Equation (5.5.12), one can compute $v$ as a function of $P$, $T$ and $c$, $v = v_0(T,P) + O(N'/N)$. If we insert this in (5.5.13), we find an expression for the free enthalpy of the form

$$G(T,P,N,N') = G_0(T,P,N) - kT N'\left(\log\frac{N}{N'} - 1\right) + N' g(T,P) + O\!\left(\frac{N'^2}{N}\right) , \qquad (5.5.14)$$

where $g(T,P) = \left[-kT\log\frac{v}{\lambda'^3} - \left(\gamma - V\frac{\partial\gamma}{\partial V}\right)\right]_{v=v_0(T,P)}$. Now we can compute the two chemical potentials as functions of $T$, $P$ and $c$. For the chemical potential of the solvent, $\mu(T,P,c) = \left(\frac{\partial G}{\partial N}\right)_{T,P,N'}$, the result to leading order in the concentration is

$$\mu(T,P,c) = \mu_0(T,P) - kT c + O(c^2) . \qquad (5.5.15)$$

For the chemical potential of the solute, we find from (5.5.14)

$$\mu'(T,P,c) = \left(\frac{\partial G}{\partial N'}\right)_{N,P,T} = -kT\log\frac{1}{c} + g(T,P) + O(c) . \qquad (5.5.16)$$

The results (5.5.15) and (5.5.16) agree with those found in the framework of the grand canonical ensemble (5.5.5) and (5.5.7).

5.5.2 Osmotic Pressure

We let two solutions of the same substances (e.g. salt in water) be separated by a semipermeable membrane (Fig. 5.19). An example of a semipermeable membrane is a cell membrane.

Fig. 5.19. A membrane which allows only the solvent to pass through (= semipermeable) separates the two solutions. · = solvent, • = solute; concentrations c1 and c2

The semipermeable membrane allows only the solvent to pass through. Therefore, in chambers 1 and 2, there will be different concentrations $c_1$ and $c_2$. In equilibrium, the chemical potentials of the solvent on both sides of the membrane are equal, but not those of the solute. The osmotic pressure is defined by the pressure difference $\Delta P = P_1 - P_2$. From (5.5.8), we can calculate the pressure on both sides of the membrane, and since in equilibrium the chemical potentials of the solvent are equal, $\mu_1 = \mu_2$, it follows that the pressure difference is

$$\Delta P = \frac{c_1 - c_2}{v}\,kT . \qquad (5.5.17)$$

The van't Hoff formula is obtained as a special case for $c_2 = 0$, $c_1 = c$, when only the pure solvent is present on one side of the membrane:

$$\Delta P = \frac{c}{v}\,kT = \frac{N'}{V}\,kT . \qquad (5.5.17')$$

Here, $N'$ refers to the number of dissolved molecules in chamber 1 and $V$ to its volume.
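The van't Hoff formula lends itself to a quick numerical estimate. The following is a minimal sketch, assuming water as the solvent with a molar mass of 18.015 g/mol, a density of 1 g/cm³ and a temperature of 293.15 K; the function names are illustrative only and the input values are assumptions, not data from the text.

```python
# Van't Hoff estimate of the osmotic pressure, Delta P = c k T / v.
# Assumed inputs (not from the text): molar mass of water, T = 293.15 K.

K_B = 1.380649e-23      # Boltzmann constant in J/K
N_A = 6.02214076e23     # Avogadro constant in 1/mol


def volume_per_molecule(density_kg_m3, molar_mass_kg_mol):
    """Volume per solvent molecule, v = M / (rho N_A), in m^3."""
    return molar_mass_kg_mol / (density_kg_m3 * N_A)


def osmotic_pressure(c, temperature, v_solvent):
    """Osmotic pressure Delta P = c k T / v in pascal (dilute limit)."""
    return c * K_B * temperature / v_solvent


v_water = volume_per_molecule(1000.0, 18.015e-3)   # rho = 1 g/cm^3
delta_p = osmotic_pressure(0.01, 293.15, v_water)
print(f"Delta P = {delta_p / 1e5:.1f} bar")
```

With these assumed inputs the result is of the order of 13 bar; the exact digits depend on the temperature and molar mass chosen.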


Notes:

(i) Equation (5.5.17′) holds for small concentrations independently of the nature of the solvent and the solute. We point out the formal similarity between the van't Hoff formula (5.5.17′) and the ideal gas equation. The osmotic pressure of a dilute solution of n moles of the dissolved substance is equal to the pressure that n moles of an ideal gas would exert on the walls of the overall volume V of solution and solvent.

(ii) One can gain a physical understanding of the origin of the osmotic pressure as follows: the concentrated part of the solution has a tendency to expand into the less concentrated region, and thus to equalize the concentrations.

(iii) For an aqueous solution of concentration c = 0.01, the osmotic pressure at room temperature amounts to ∆P = 13.3 bar.

∗5.5.3 Solutions of Hydrogen in Metals (Nb, Pd, ...)

We now apply the results of Sect. 5.5.1 to an important practical example, the solution of hydrogen in metals such as Nb, Pd,. . . (Fig. 5.20). In the gas phase, hydrogen occurs in molecular form as H2 , while in metals, it dissociates. We thus have a case of chemical equilibrium, see Sect. 3.9.3.

Fig. 5.20. Solution of hydrogen in metals: atomic hydrogen in a metal is represented by a dot, while molecular hydrogen in the surrounding gas phase is represented by a pair of dots.

The chemical potential of molecular hydrogen gas is

$$\mu_{H_2} = -kT\left[\log\frac{V}{N\lambda_{H_2}^3} + \log Z_i\right] = -kT\left[\log\frac{kT}{P\lambda_{H_2}^3} + \log Z_i\right] , \qquad (5.5.18)$$

where $Z_i$ also contains the electronic contribution to the partition function (Eq. (5.1.5c)). The chemical potential of atomic hydrogen dissolved in a metal is, according to Eq. (5.5.7), given by

$$\mu_H = kT\log c + g(T,P) . \qquad (5.5.19)$$

The metals mentioned can be used for hydrogen storage. The condition for chemical equilibrium (3.9.26) is in this case $2\mu_H = \mu_{H_2}$; this yields the equilibrium concentration:

$$c = e^{(\mu_{H_2}/2 - g(T,P))/kT} = \left(\frac{P\lambda_{H_2}^3}{kT}\right)^{1/2} Z_i^{-1/2}\exp\left(\frac{-2g(T,P) + \varepsilon_{el}}{2kT}\right) . \qquad (5.5.20)$$

Since $g(T,P)$ depends only weakly on $P$, the concentration of dissolved hydrogen is $c \sim P^{1/2}$. This dependence is known as Sieverts' law.

5.5.4 Freezing-Point Depression, Boiling-Point Elevation, and Vapor-Pressure Reduction

Before we turn to a quantitative treatment of freezing-point depression, boiling-point elevation, and vapor-pressure reduction, we begin with a qualitative discussion of these phenomena. The free enthalpy of the liquid phase of a solution is lowered, according to Eq. (5.5.5), relative to its value in the pure solvent, an effect which can be interpreted in terms of an increase in entropy. The free enthalpies of the solid and gaseous phases remain unchanged. In Fig. 5.21, G(T, P) is shown qualitatively as a function of the temperature and the pressure, keeping in mind its convexity, and assuming that the dissolved substance is soluble only in the liquid phase. The solid curve describes the pure solvent, while the change due to the dissolved substance is described by the chain curve. As a rule, the concentration of the solute in the liquid phase is largest and the associated entropy increase leads to a reduction of the free enthalpy. From these two diagrams, the depression of the freezing point, the elevation of the boiling point, and the reduction in the vapor pressure can be read off.

Fig. 5.21. The change in the free enthalpy on solution of a substance which dissolves to a notable extent only in the liquid phase. The solid curve is for the pure solvent, the chain curve for the solution. We can recognize the freezing-point depression, the boiling-point elevation, and the vapor-pressure reduction

Next we turn to the analytic treatment of these phenomena. We first consider the melting process. The concentrations of the dissolved substance in the liquid and solid phases are $c_L$ and $c_S$.$^{15}$ The chemical potentials of the solvent in the liquid and the solid phase are denoted by $\mu^L$ and $\mu^S$, and correspondingly in the pure system by $\mu_0^L$ and $\mu_0^S$. From Eq. (5.5.5), we find that $\mu^L = \mu_0^L(P,T) - kT c_L$ and $\mu^S = \mu_0^S(P,T) - kT c_S$. In equilibrium, the chemical potentials of the solvent must be equal, $\mu^L = \mu^S$, from which it follows that$^{16}$

$$\mu_0^L(P,T) - kT c_L = \mu_0^S(P,T) - kT c_S . \qquad (5.5.21)$$

For the pure solvent, we obtain the melting curve, i.e. the relation between the melting pressure $P_0$ and the melting temperature $T_0$, from

$$\mu_0^L(P_0,T_0) = \mu_0^S(P_0,T_0) . \qquad (5.5.22)$$

Let $(P_0,T_0)$ be a point on the melting curve of the pure solvent. Then consider a point $(P,T)$ on the melting curve which obeys (5.5.21), and which is shifted relative to $(P_0,T_0)$ by $\Delta P$ and $\Delta T$, that is

$$P = P_0 + \Delta P , \qquad T = T_0 + \Delta T .$$

If we expand Eq. (5.5.21) in terms of $\Delta P$ and $\Delta T$, and use (5.5.22), we find the following relation

$$\left(\frac{\partial\mu_0^L}{\partial P}\right)_0\Delta P + \left(\frac{\partial\mu_0^L}{\partial T}\right)_0\Delta T - kT c_L = \left(\frac{\partial\mu_0^S}{\partial P}\right)_0\Delta P + \left(\frac{\partial\mu_0^S}{\partial T}\right)_0\Delta T - kT c_S . \qquad (5.5.23)$$

We now recall that $G = \mu N = E - TS + PV$; using this, we obtain

$$dG = -S\,dT + V\,dP + \mu\,dN = d(\mu N) = \mu\,dN + N\,d\mu ,$$

$$\left(\frac{\partial\mu}{\partial P}\right)_{T,N} = \frac{V}{N} = v , \qquad \left(\frac{\partial\mu}{\partial T}\right)_{P,N} = -\frac{S}{N} = -s .$$

The derivatives in (5.5.23) can therefore be expressed in terms of the volumes per molecule $v_L$ and $v_S$, and the entropies per molecule $s_L$ and $s_S$ in the liquid and solid phases of the pure solvent,

$$-(s_S - s_L)\,\Delta T + (v_S - v_L)\,\Delta P = (c_S - c_L)\,kT . \qquad (5.5.24)$$

15 Since two phases and two components are present, the number of degrees of freedom is two (Gibbs' phase rule). One can for example fix the temperature and one concentration; then the other concentration and the pressure are determined.

16 The chemical potentials of the solute must of course also be equal. From this fact, we can for example express the concentration in the solid phase, $c_S$, in terms of $T$ and $c_L$. We shall, however, not need the exact value of $c_S$, since $c_S \ll c_L$, so that $c_S$ is negligible.

Finally, we introduce the heat of melting $q = T(s_L - s_S)$, thus obtaining

$$\frac{q}{T}\,\Delta T + (v_S - v_L)\,\Delta P = (c_S - c_L)\,kT . \qquad (5.5.25)$$

The change in the transition temperature $\Delta T$ at a given pressure is obtained from (5.5.25), by setting $P = P_0$ or $\Delta P = 0$:

$$\Delta T = \frac{kT^2}{q}\,(c_S - c_L) . \qquad (5.5.26)$$

As a rule, the concentration in the solid phase is much lower than that in the liquid phase, i.e. $c_S \ll c_L$; then (5.5.26) simplifies to

$$\Delta T = -\frac{kT^2}{q}\,c_L < 0 . \qquad (5.5.26')$$

Since the entropy of the liquid is larger than that of the solid (equivalently, heat is absorbed on melting), it follows that $q > 0$. As a result, the dissolution of a substance gives rise to a freezing-point depression.

Note: On solidification of a liquid, (5.5.26′) holds at first, with the initial concentration $c_L$. Since, however, pure solvent precipitates out in solid form, the concentration $c_L$ increases, so that further cooling is required for the freezing process to continue. Freezing of a solution thus occurs over a finite temperature interval.
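To get a feeling for the size of the effect, (5.5.26′) can be evaluated per mole, where $kT^2/q$ becomes $RT^2/q_{mol}$. The sketch below assumes water with a molar heat of melting of about 6.01 kJ/mol and a melting temperature of 273.15 K; these numbers and the function name are illustrative assumptions, not values from the text.

```python
# Freezing-point depression from Eq. (5.5.26'), written per mole:
# Delta T = -R T^2 c_L / q, with q the molar heat of melting.

R_GAS = 8.314462618  # molar gas constant in J/(mol K)


def freezing_point_shift(c_liquid, melting_temperature, molar_heat_of_melting):
    """Return Delta T in kelvin, assuming c_S << c_L and Delta P = 0."""
    return -R_GAS * melting_temperature**2 * c_liquid / molar_heat_of_melting


# Assumed illustrative data for water (not taken from the text):
T_MELT_WATER = 273.15   # K
Q_MELT_WATER = 6.01e3   # J/mol

shift = freezing_point_shift(0.01, T_MELT_WATER, Q_MELT_WATER)
print(f"Delta T = {shift:.2f} K")
```

For a concentration of one percent the shift comes out at roughly one kelvin, and it scales linearly with $c_L$ as long as the solution remains dilute.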

The above results can be transferred directly to the evaporation process. To do this, we make the replacements L → G, S → L and obtain from (5.5.25) for the liquid phase (L) and the gas phase (G) the relation

$$\frac{q}{T}\,\Delta T + (v_L - v_G)\,\Delta P = (c_L - c_G)\,kT . \qquad (5.5.27)$$

Setting $\Delta P = 0$ in (5.5.27), we find

$$\Delta T = \frac{kT^2}{q}\,(c_L - c_G) \approx \frac{kT^2}{q}\,c_L > 0 , \qquad (5.5.28)$$

a boiling-point elevation. In the last equation, $c_L \gg c_G$ was assumed (this no longer holds near the critical point). Setting $\Delta T = 0$ in (5.5.27), we find

$$\Delta P = \frac{c_L - c_G}{v_L - v_G}\,kT \approx -\frac{c_L - c_G}{v_G}\,kT , \qquad (5.5.29)$$


a vapor-pressure reduction. When the gas phase contains only the vapor of the pure solvent, (5.5.29) simplifies to

$$\Delta P = -\frac{c_L}{v_G}\,kT . \qquad (5.5.30)$$

Inserting the ideal gas equation, $P v_G = kT$, we have $\Delta P = -c_L P = -c_L (P_0 + \Delta P)$. Rearrangement of the last equation yields the relative pressure change:

$$\frac{\Delta P}{P_0} = -\frac{c_L}{1 + c_L} \approx -c_L , \qquad (5.5.31)$$

known as Raoult's law. The relative vapor-pressure reduction increases linearly with the concentration of the dissolved substance. The results derived here are in agreement with the qualitative considerations given at the beginning of this subsection.
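The boiling-point elevation (5.5.28) and Raoult's law (5.5.31) can be evaluated in the same spirit. The following sketch assumes water with a molar heat of evaporation of about 40.65 kJ/mol at 373.15 K; these inputs and the function names are assumptions chosen for illustration.

```python
# Boiling-point elevation (5.5.28) per mole and Raoult's law (5.5.31).

R_GAS = 8.314462618  # molar gas constant in J/(mol K)


def boiling_point_shift(c_liquid, boiling_temperature, molar_heat_of_evaporation):
    """Return Delta T = R T^2 c_L / q in kelvin, assuming c_G << c_L."""
    return R_GAS * boiling_temperature**2 * c_liquid / molar_heat_of_evaporation


def relative_vapor_pressure_change(c_liquid):
    """Solve Delta P = -c_L (P_0 + Delta P) for Delta P / P_0."""
    return -c_liquid / (1.0 + c_liquid)


# Assumed illustrative data for water (not taken from the text):
T_BOIL_WATER = 373.15   # K
Q_EVAP_WATER = 40.65e3  # J/mol

elevation = boiling_point_shift(0.01, T_BOIL_WATER, Q_EVAP_WATER)
reduction = relative_vapor_pressure_change(0.01)
print(f"Delta T = {elevation:.3f} K, Delta P / P0 = {reduction:.5f}")
```

The difference between the exact ratio $-c_L/(1+c_L)$ and the linear form $-c_L$ is of second order in $c_L$, which is why the linear statement of Raoult's law is adequate for dilute solutions.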

Problems for Chapter 5

5.1 The rotational motion of a diatomic molecule is described by the angular variables $\vartheta$ and $\varphi$ and the canonically conjugate momenta $p_\vartheta$ and $p_\varphi$, with the Hamilton function $H = \frac{p_\vartheta^2}{2I} + \frac{1}{2I\sin^2\vartheta}\,p_\varphi^2$. Calculate the classical partition function for the canonical ensemble (see footnote 4 to Eq. (5.1.15)). Result: $Z_{rot} = 2T/\Theta_r$.

5.2 Confirm the formulas (5.4.13) for the critical pressure, the critical volume, and the critical temperature of a van der Waals gas and the expansion (5.4.18) of $P(T,V)$ around the critical point up to the third order in $\Delta v$.

5.3 The expansion of the van der Waals equation in the vicinity of the critical point:
(a) Why is it permissible in the determination of the leading order to leave off the term $\Delta T(\Delta V)^2$ in comparison to $(\Delta V)^3$?
(b) Calculate the correction $O(\Delta T)$ to the coexistence curve.
(c) Calculate the correction $O\big((T - T_c)^2\big)$ to the internal energy.

5.4 The equation of state for a van der Waals gas is given in terms of reduced variables in Eq. (5.4.16). Calculate the position of the inversion points (Chap. 3) in the $p^*$, $T^*$ diagram. Where is the maximum of the curve?

5.5 Calculate the jump in the specific heat $c_v$ for a van der Waals gas at a specific volume of $v = v_c$.

5.6 Show in general and for the van der Waals equation that $\kappa_s$ and $c_v$ exhibit the same behavior for $T \to T_c$.

5.7 Consider two metals 1 and 2 (with melting points $T_1$, $T_2$ and temperature-independent heats of melting $q_1$, $q_2$), which form ideal mixtures in the liquid phase (i.e. as for small concentrations over the whole concentration range). In the solid phase these metals are not miscible. Calculate the eutectic point $T_E$ (see also Sect. 3.9.2).
Hint: Set up the equilibrium conditions between pure solid phase 1 or 2 and the liquid phase. From these, the concentrations are determined:

$$c_i = e^{\lambda_i} , \quad \text{where} \quad \lambda_i = \frac{q_i}{kT_i}\left(1 - \frac{T_i}{T}\right) ; \quad i = 1, 2 ,$$

using

$$\frac{\partial (G/T)}{\partial T} = -H/T^2 , \qquad q_i = \Delta H_i , \qquad G = \mu N .$$

5.8 Apply the van't Hoff formula (5.5.17′) to the following simple example: the concentration of the dissolved substance is taken to be c = 0.01, the solvent is water (at 20 °C); use $\rho_{H_2O}$ = 1 g/cm³ (20 °C). Find the osmotic pressure ∆P.

6. Magnetism

In this chapter, we will deal with the fundamental phenomenon of magnetism. We begin the ﬁrst section by setting up the density matrix, starting from the Hamiltonian, and using it to derive the thermodynamic relations for magnetic systems. Then we continue with the treatment of diamagnetic and paramagnetic substances (Curie and Pauli paramagnetism). Finally, in Sect. 6.5.1, we investigate ferromagnetism. The basic properties of magnetic phase transitions will be studied in the molecular-ﬁeld approximation (Curie–Weiss law, Ornstein-Zernike correlation function, etc.). The results obtained will form the starting point for the renormalization group theory of critical phenomena which is dealt with in the following chapter.

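6.1 The Density Matrix and Thermodynamics

6.1.1 The Hamiltonian and the Canonical Density Matrix

We first summarize some facts about magnetic properties as known from electrodynamics and quantum mechanics. The Hamiltonian for $N$ electrons in a magnetic field $\mathbf{H} = \operatorname{curl}\mathbf{A}$ is:

$$\mathcal{H} = \sum_{i=1}^{N}\left[\frac{1}{2m}\left(\mathbf{p}_i - \frac{e}{c}\mathbf{A}(\mathbf{x}_i)\right)^2 - \boldsymbol{\mu}_i^{spin}\cdot\mathbf{H}(\mathbf{x}_i)\right] + W_{Coul} . \qquad (6.1.1)$$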

The index $i$ enumerates the electrons. The canonical momentum of the $i$th electron is $\mathbf{p}_i$ and the kinetic momentum is $m\mathbf{v}_i = \mathbf{p}_i - \frac{e}{c}\mathbf{A}(\mathbf{x}_i)$. The charge and the magnetic moment are given by$^{1}$

$$e = -e_0 , \qquad \boldsymbol{\mu}_i^{spin} = -\frac{g_e\mu_B}{\hbar}\,\mathbf{S}_i , \qquad (6.1.2a)$$

where, along with the elementary charge $e_0$, the Bohr magneton

$$\mu_B = \frac{e_0\hbar}{2mc} = 0.927\cdot 10^{-20}\,\frac{\text{erg}}{\text{Gauss}} = 0.927\cdot 10^{-23}\,\frac{\text{J}}{\text{T}} \qquad (6.1.2b)$$

as well as the Landé g-factor or the spectroscopic splitting factor of the electron

$$g_e = 2.0023 \qquad (6.1.2c)$$

were introduced. The quantity $\gamma = \frac{e g_e}{2mc} = -\frac{g_e\mu_B}{\hbar}$ is called the magnetomechanical ratio or gyromagnetic ratio. The last term in (6.1.1) stands for the Coulomb interaction of the electrons with each other and with the nuclei. The dipole-dipole interaction of the spins is neglected here. Its consequences, such as the demagnetizing field, will be considered in Sect. 6.6; see also remark (ii) at the end of Sect. 6.6.3. We assume that the magnetic field $\mathbf{H}$ is produced by some external sources. In vacuum, $\mathbf{B} = \mathbf{H}$ holds. We use here the magnetic field $\mathbf{H}$, corresponding to the more customary practice in the literature on magnetism. The current-density operator is thus given by$^{2}$

$$\mathbf{j}(\mathbf{x}) \equiv -c\,\frac{\delta\mathcal{H}}{\delta\mathbf{A}(\mathbf{x})} = \sum_{i=1}^{N}\left\{\frac{e}{2m}\left[\mathbf{p}_i - \frac{e}{c}\mathbf{A}(\mathbf{x}_i),\ \delta(\mathbf{x} - \mathbf{x}_i)\right]_+ + c\,\operatorname{curl}\left(\boldsymbol{\mu}_i^{spin}\,\delta(\mathbf{x} - \mathbf{x}_i)\right)\right\} \qquad (6.1.3)$$

with $[A,B]_+ = AB + BA$. The current density contains a contribution from the electronic orbital motion and a spin contribution. For the total magnetic moment, one obtains$^{3,4}$:

$$\boldsymbol{\mu} \equiv \frac{1}{2c}\int d^3x\ \mathbf{x}\times\mathbf{j}(\mathbf{x}) = \sum_{i=1}^{N}\left\{\frac{e}{2mc}\,\mathbf{x}_i\times\left(\mathbf{p}_i - \frac{e}{c}\mathbf{A}(\mathbf{x}_i)\right) + \boldsymbol{\mu}_i^{spin}\right\} . \qquad (6.1.4)$$

1 QM I, p. 186

When $\mathbf{H}$ is uniform, Eq. (6.1.4) can also be written in the form

$$\boldsymbol{\mu} = -\frac{\partial\mathcal{H}}{\partial\mathbf{H}} . \qquad (6.1.5)$$

The magnetic moment of the $i$th electron for a uniform magnetic field (see Remark (iv) in Sect. 6.1.3) is, according to Eq. (6.1.4), given by

$$\boldsymbol{\mu}_i = \boldsymbol{\mu}_i^{spin} + \frac{e}{2mc}\,\mathbf{L}_i - \frac{e^2}{2mc^2}\,\mathbf{x}_i\times\mathbf{A}(\mathbf{x}_i) = \frac{e}{2mc}\left(\mathbf{L}_i + g_e\mathbf{S}_i\right) - \frac{e^2}{4mc^2}\left(\mathbf{H}\,\mathbf{x}_i^2 - \mathbf{x}_i\,(\mathbf{x}_i\cdot\mathbf{H})\right) . \qquad (6.1.6)$$

2 The intermediate steps which lead to (6.1.3)–(6.1.5) will be given at the end of this section.

3 J. D. Jackson, Classical Electrodynamics, 2nd edition, John Wiley and Sons, New York, 1975, p. 18.

4 Magnetic moments are denoted throughout by $\boldsymbol{\mu}$, except for the spin magnetic moments of elementary particles, which are termed $\boldsymbol{\mu}^{spin}$.
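The numerical value of the Bohr magneton quoted in (6.1.2b) can be reproduced from $\mu_B = e_0\hbar/2mc$ in Gaussian units. The sketch below uses constants that are assumed inputs (rounded CODATA-style values typed in by hand), and the unit conversion relies on 1 erg = $10^{-7}$ J and 1 G = $10^{-4}$ T.

```python
# Bohr magneton mu_B = e0 * hbar / (2 m c) in Gaussian (CGS) units.
# The constants below are assumed, rounded input values.

E0_ESU = 4.80320471e-10        # elementary charge in statcoulomb
HBAR_ERG_S = 1.054571817e-27   # reduced Planck constant in erg s
M_E_G = 9.1093837e-28          # electron mass in g
C_CM_S = 2.99792458e10         # speed of light in cm/s

# 1 erg/G = 1e-7 J / 1e-4 T = 1e-3 J/T
ERG_PER_GAUSS_TO_JOULE_PER_TESLA = 1e-3


def bohr_magneton_cgs():
    """Return mu_B in erg/G."""
    return E0_ESU * HBAR_ERG_S / (2.0 * M_E_G * C_CM_S)


mu_b_cgs = bohr_magneton_cgs()
mu_b_si = mu_b_cgs * ERG_PER_GAUSS_TO_JOULE_PER_TESLA
print(f"mu_B = {mu_b_cgs:.3e} erg/G = {mu_b_si:.3e} J/T")
```

Both numbers agree with the rounded values $0.927\cdot10^{-20}$ erg/Gauss and $0.927\cdot10^{-23}$ J/T given above.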


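If $\mathbf{H} = H\mathbf{e}_z$, then (for a single particle) it follows that

$$\mu_{iz} = \frac{e}{2mc}\left(\mathbf{L}_i + g_e\mathbf{S}_i\right)_z - \frac{e^2 H}{4mc^2}\left(x_i^2 + y_i^2\right) = -\frac{\partial\mathcal{H}}{\partial H} ,$$

and the Hamiltonian is$^{5}$

$$\mathcal{H} = \sum_{i=1}^{N}\left[\frac{\mathbf{p}_i^2}{2m} - \frac{e}{2mc}\left(\mathbf{L}_i + 2\mathbf{S}_i\right)_z H + \frac{e^2H^2}{8mc^2}\left(x_i^2 + y_i^2\right)\right] + W_{Coul} . \qquad (6.1.7)$$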

Here, we have used $g_e = 2$.

We now wish to set up the density matrices for magnetic systems; we can follow the steps in Chap. 2 to do this. An isolated magnetic system is described by a microcanonical ensemble,

$$\rho_{MC} = \delta(\mathcal{H} - E)/\Omega(E,\mathbf{H}) \quad \text{with} \quad \Omega(E,\mathbf{H}) = \operatorname{Tr}\,\delta(\mathcal{H} - E) ,$$

where, for the Hamiltonian, (6.1.1) is to be inserted. If the magnetic system is in contact with a heat bath, with which it can exchange energy, then one finds for the magnetic subsystem, just as in Chap. 2, the canonical density matrix

$$\rho = \frac{1}{Z}\,e^{-\mathcal{H}/kT} . \qquad (6.1.8)$$

The normalization factor is given by the partition function

$$Z = \operatorname{Tr}\,e^{-\mathcal{H}/kT} . \qquad (6.1.9a)$$

The canonical parameters (natural variables) are here the temperature, whose reciprocal is defined as in Chap. 2 in the microcanonical ensemble as the derivative of the entropy of the heat bath with respect to its energy, and the external magnetic field $\mathbf{H}$.$^{6}$ Correspondingly, the canonical free energy,

$$F(T,\mathbf{H}) = -kT\log Z , \qquad (6.1.9b)$$

is a function of $T$ and $\mathbf{H}$. The entropy $S$ and the internal energy $E$ are, by definition, calculated from

$$S = -k\,\langle\log\rho\rangle = \frac{1}{T}\,(E - F) \qquad (6.1.10)$$

and

$$E = \langle\mathcal{H}\rangle . \qquad (6.1.11)$$

5 See e.g. QM I, Sect. 7.2.

6 In this chapter we limit our considerations to magnetic effects. Therefore, the particle number and the volume are treated as fixed. For phenomena such as magnetostriction, it is necessary to consider also the dependence of the free energy on the volume and more generally on the deformation tensor of the solid (see also the remark in 6.1.2.4).

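The magnetic moment of the entire body, written here as upright $\mathbf{M}$, is defined as the thermal average of the total quantum-mechanical magnetic moment

$$\mathbf{M} \equiv \langle\boldsymbol{\mu}\rangle = -\left\langle\frac{\partial\mathcal{H}}{\partial\mathbf{H}}\right\rangle . \qquad (6.1.12)$$

The magnetization, written as italic $\boldsymbol{M}$, is defined as the magnetic moment per unit volume, i.e. for a uniformly magnetized body

$$\boldsymbol{M} = \frac{1}{V}\,\mathbf{M} \qquad (6.1.13a)$$

and, in general,

$$\mathbf{M} = \int d^3x\ \boldsymbol{M}(\mathbf{x}) . \qquad (6.1.13b)$$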

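For the differential of $F$, we find from (6.1.9)–(6.1.10)

$$dF = (F - E)\,\frac{dT}{T} - \mathbf{M}\cdot d\mathbf{H} \equiv -S\,dT - \mathbf{M}\cdot d\mathbf{H} , \qquad (6.1.14a)$$

that is

$$\left(\frac{\partial F}{\partial T}\right)_{\mathbf{H}} = -S \quad \text{and} \quad \left(\frac{\partial F}{\partial\mathbf{H}}\right)_{T} = -\mathbf{M} . \qquad (6.1.14b)$$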

Using equation (6.1.10), one can express the internal energy $E$ in terms of $F$ and $S$ and obtain from (6.1.14a) the First Law for magnetic systems:

$$dE = T\,dS - \mathbf{M}\cdot d\mathbf{H} . \qquad (6.1.15)$$

The internal energy $E$ contains the interaction of the magnetic moments with the magnetic field (see (6.1.7)). Compared to a gas, we have to make the following formal replacements in the First Law: V → H, P → M. Along with the (canonical) free energy $F(T,\mathbf{H})$, we introduce also the Helmholtz free energy$^{7}$

$$A(T,\mathbf{M}) = F(T,\mathbf{H}) + \mathbf{M}\cdot\mathbf{H} . \qquad (6.1.16)$$

Its differential is

$$dA = -S\,dT + \mathbf{H}\cdot d\mathbf{M} , \qquad (6.1.17a)$$

i.e.

$$\left(\frac{\partial A}{\partial T}\right)_{\mathbf{M}} = -S \quad \text{and} \quad \left(\frac{\partial A}{\partial\mathbf{M}}\right)_{T} = \mathbf{H} . \qquad (6.1.17b)$$

7 The notation of the magnetic potentials is not uniform in the literature. This is true not only of the choice of symbols; even the potential $F(T,\mathbf{H})$, which depends on $\mathbf{H}$, is sometimes referred to as the Helmholtz free energy.


6.1.2 Thermodynamic Relations

∗6.1.2.1 Thermodynamic Potentials

At this point, we summarize the definitions of the two potentials introduced in the preceding subsection. The following compilation, which indicates the systematic structure of the material, can be skipped over in a first reading:

$$F = F(T,\mathbf{H}) = E - TS , \qquad dF = -S\,dT - \mathbf{M}\cdot d\mathbf{H} , \qquad (6.1.18a)$$

$$A = A(T,\mathbf{M}) = E - TS + \mathbf{M}\cdot\mathbf{H} , \qquad dA = -S\,dT + \mathbf{H}\cdot d\mathbf{M} . \qquad (6.1.18b)$$

In comparison to liquids, the thermodynamic variables here are T, H and M instead of T, P and V. The thermodynamic relations listed can be read off from the corresponding relations for liquids by making the substitutions V → −M and P → H. There is also another analogy between magnetic systems and liquids: the density matrix of the grand potential contains the term −µN, which in a magnetic system corresponds to −H·M. Particularly in the low-temperature region, where the properties of a magnetic system can be described in terms of spin waves (magnons), this analogy is useful. There, the value of the magnetization is determined by the number of thermally-excited spin waves. Therefore, we find the correspondence M ↔ N and H ↔ µ. Of course the Maxwell relations follow from (6.1.15) and (6.1.18a,b):

$$\left(\frac{\partial T}{\partial H}\right)_{S} = -\left(\frac{\partial \mathrm{M}}{\partial S}\right)_{H} , \qquad \left(\frac{\partial S}{\partial H}\right)_{T} = \left(\frac{\partial \mathrm{M}}{\partial T}\right)_{H} . \qquad (6.1.19)$$

∗6.1.2.2 Magnetic Response Functions, Specific Heats, and Susceptibilities

Analogously to the specific heats of liquids, we define here the specific heats $C_M$ and $C_H$ (at constant M and H) as$^{8}$

$$C_M \equiv T\left(\frac{\partial S}{\partial T}\right)_{M} = -T\left(\frac{\partial^2 A}{\partial T^2}\right)_{M} , \qquad (6.1.20a)$$

$$C_H \equiv T\left(\frac{\partial S}{\partial T}\right)_{H} = \left(\frac{\partial E}{\partial T}\right)_{H} = -T\left(\frac{\partial^2 F}{\partial T^2}\right)_{H} . \qquad (6.1.20b)$$

Instead of the compressibilities as for liquids, in the magnetic case one has the isothermal susceptibility

$$\chi_T \equiv \left(\frac{\partial M}{\partial H}\right)_{T} = -\frac{1}{V}\left(\frac{\partial^2 F}{\partial H^2}\right)_{T} \qquad (6.1.21a)$$

8 To keep the notation simple, we will often write the vectors $\mathbf{H}$ and $\boldsymbol{M}$ as scalars $H$ and $M$, making the assumption that $\boldsymbol{M}$ is parallel to $\mathbf{H}$ and that $H$ and $M$ are the components in the direction of $\mathbf{H}$.


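and the adiabatic susceptibility

$$\chi_S \equiv \left(\frac{\partial M}{\partial H}\right)_{S} = -\frac{1}{V}\left(\frac{\partial^2 E}{\partial H^2}\right)_{S} , \qquad (6.1.21b)$$

where the sign follows from $(\partial E/\partial H)_S = -\mathrm{M}$ in (6.1.15). In analogy to Chap. 3, one finds that

$$C_H - C_M = TV\alpha_H^2/\chi_T , \qquad (6.1.22a)$$

$$\chi_T - \chi_S = TV\alpha_H^2/C_H \qquad (6.1.22b)$$

and

$$\frac{C_H}{C_M} = \frac{\chi_T}{\chi_S} . \qquad (6.1.22c)$$

Here, we have defined

$$\alpha_H \equiv \left(\frac{\partial M}{\partial T}\right)_{H} . \qquad (6.1.23)$$

Eq. (6.1.22a) can also be rewritten as

$$C_H - C_M = TV\alpha_M^2\,\chi_T , \qquad (6.1.22d)$$

where

$$\alpha_M = \left(\frac{\partial H}{\partial T}\right)_{M} = -\frac{\alpha_H}{\chi_T} \qquad (6.1.22e)$$

was used.

∗6.1.2.3 Stability Criteria and the Convexity of the Free Energy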

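One can also derive inequalities of the type (3.3.5) and (3.3.6) for the magnetic susceptibilities and the specific heats:

$$\chi_T \ge 0 , \qquad C_H \ge 0 \quad \text{and} \quad C_M \ge 0 . \qquad (6.1.24a,b,c)$$

To derive these inequalities on a statistical-mechanical basis, we assume that the Hamiltonian has the form

$$\mathcal{H} = \mathcal{H}_0 - \boldsymbol{\mu}\cdot\mathbf{H} , \qquad (6.1.25)$$

where $\mathbf{H}$ thus enters only linearly and $\boldsymbol{\mu}$ commutes with $\mathcal{H}$. It then follows that

$$\chi_T = \frac{1}{V}\left(\frac{\partial\langle\mu\rangle}{\partial H}\right)_{T} = \frac{1}{V}\left(\frac{\partial}{\partial H}\,\frac{\operatorname{Tr}\,e^{-\beta\mathcal{H}}\mu}{\operatorname{Tr}\,e^{-\beta\mathcal{H}}}\right)_{T} = \frac{\beta}{V}\left\langle(\mu - \langle\mu\rangle)^2\right\rangle \ge 0 \qquad (6.1.26a)$$

and

$$C_H = \left(\frac{\partial\langle\mathcal{H}\rangle}{\partial T}\right)_{H} = \left(\frac{\partial}{\partial T}\,\frac{\operatorname{Tr}\,e^{-\beta\mathcal{H}}\mathcal{H}}{\operatorname{Tr}\,e^{-\beta\mathcal{H}}}\right)_{H} = \frac{1}{kT^2}\left\langle(\mathcal{H} - \langle\mathcal{H}\rangle)^2\right\rangle \ge 0 , \qquad (6.1.26b)$$

with which we have demonstrated (6.1.24a) and (6.1.24b). Eq. (6.1.24c) can be shown by taking the second derivative of $A(T,M) = F(T,H) + HM$ with respect to the temperature at constant $M$ (problem 6.1). As a result,$^{9}$ $F(T,H)$ is concave in $T$ and in $H$, while $A(T,M)$ is concave in $T$ and convex in $M$:

$$\left(\frac{\partial^2 A}{\partial T^2}\right)_{M} = -\frac{C_M}{T} \le 0 , \qquad \left(\frac{\partial^2 A}{\partial M^2}\right)_{T} = \left(\frac{\partial H}{\partial M}\right)_{T} = 1/\chi_T \ge 0 .$$

In this derivation, we have used the fact that the Hamiltonian $\mathcal{H}$ has the general form (6.1.25), i.e. we have assumed that diamagnetic effects (proportional to $H^2$) are negligible.

Remark: In analogy to the extremal properties treated in Sect. 3.6.4, the canonical free energy $F$ for fixed $T$ and $H$ in magnetic systems strives towards a minimal value, as does the Helmholtz free energy $A$ for fixed $T$ and $M$; i.e. $dF < 0$ when $T$ and $H$ are fixed, and $dA < 0$ when $T$ and $M$ are fixed. At these minima, the stationarity conditions $\delta F = 0$ and $\delta A = 0$ hold.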
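The fluctuation formula (6.1.26a) can be checked on the simplest possible case. The sketch below assumes a single two-level moment with $\mathcal{H} = -\mu\sigma H$, $\sigma = \pm 1$ (a toy model introduced here only for illustration, with $\mathcal{H}_0 = 0$ and $V$ set to one), for which $\langle\mu\rangle = \mu\tanh(\beta\mu H)$, and compares a numerical derivative with $\beta\langle(\mu - \langle\mu\rangle)^2\rangle$.

```python
import math

# Toy model (assumption): one moment with values +mu and -mu along the field,
# Hamiltonian = -mu * sigma * H, sigma = +1 or -1, volume V = 1.


def mean_moment(mu, field, beta):
    """Thermal average <mu> = mu tanh(beta mu H)."""
    return mu * math.tanh(beta * mu * field)


def moment_variance(mu, field, beta):
    """Variance <(mu - <mu>)^2> = mu^2 - <mu>^2, since mu^2 is constant."""
    return mu**2 - mean_moment(mu, field, beta) ** 2


def susceptibility_numeric(mu, field, beta, step=1e-6):
    """Central-difference estimate of d<mu>/dH at fixed temperature."""
    upper = mean_moment(mu, field + step, beta)
    lower = mean_moment(mu, field - step, beta)
    return (upper - lower) / (2.0 * step)


chi_from_derivative = susceptibility_numeric(1.0, 0.3, 2.0)
chi_from_fluctuations = 2.0 * moment_variance(1.0, 0.3, 2.0)
print(chi_from_derivative, chi_from_fluctuations)
```

Both numbers agree to within the accuracy of the finite-difference step, and both are non-negative, as required by (6.1.24a).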

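6.1.2.4 Internal Energy

$E \equiv \langle\mathcal{H}\rangle$ is the internal energy, which is found in a natural manner from statistical mechanics. It contains the energy of the material including the effects of the electromagnetic field, but not the field energy itself. It is usual to introduce a second internal energy as well, which we denote by $U$ and which is defined as

$$U = E + \mathbf{M}\cdot\mathbf{H} ; \qquad (6.1.27a)$$

it thus has the complete differential

$$dU = T\,dS + \mathbf{H}\cdot d\mathbf{M} . \qquad (6.1.27b)$$

From this, we derive

$$T = \left(\frac{\partial U}{\partial S}\right)_{\mathbf{M}} , \qquad \mathbf{H} = \left(\frac{\partial U}{\partial\mathbf{M}}\right)_{S} \qquad (6.1.27c)$$

and the Maxwell relation

$$\left(\frac{\partial T}{\partial \mathrm{M}}\right)_{S} = \left(\frac{\partial H}{\partial S}\right)_{\mathrm{M}} . \qquad (6.1.28)$$

9 See also R. B. Griffiths, J. Math. Phys. 5, 1215 (1964). In fact, it is sufficient for the proof of (6.1.24a) to show that $\boldsymbol{\mu}$ enters $\mathcal{H}$ linearly. Cf. M. E. Fisher, Rep. Progr. Phys. XXX, 615 (1967), p. 644.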


Remarks:

(i) As was emphasized in footnote 6, throughout this chapter the particle number and the volume are treated as fixed. In the case of variable volume and variable particle number, the generalization of the First Law takes on the form

$$dU = T\,dS - P\,dV + \mu\,dN + \mathbf{H}\cdot d\mathbf{M} \qquad (6.1.29)$$

and, correspondingly,

$$dE = T\,dS - P\,dV + \mu\,dN - \mathbf{M}\cdot d\mathbf{H} . \qquad (6.1.30)$$

The grand potential

$$\Phi(T,V,\mu,\mathbf{H}) = -kT\log\operatorname{Tr}\,e^{-\beta(\mathcal{H} - \mu N)} \qquad (6.1.31a)$$

then has the differential

$$d\Phi = -S\,dT - P\,dV - N\,d\mu - \mathbf{M}\cdot d\mathbf{H} , \qquad (6.1.31b)$$

where the chemical potential $\mu$ is not to be confused with the microscopic magnetic moment $\boldsymbol{\mu}$.

(ii) We note that the free energies of the crystalline solid are not rotationally invariant, but instead are invariant only with respect to rotations of the corresponding point group. Therefore, the susceptibility $\chi_{ij} = \frac{\partial M_i}{\partial H_j}$ is a second-rank tensor. In this textbook, we present the essential statistical methods, but we forgo a discussion of the details of solid-state physics or element-specific aspects. The methods presented here should permit the reader to master the complications which arise in treating real, individual problems.

6.1.3 Supplementary Remarks

(i) The Bohr–van Leeuwen Theorem. The content of the Bohr–van Leeuwen theorem is the nonexistence of magnetism in classical statistics. The classical partition function for a charged particle in the electromagnetic field is given by

$$Z_{cl} = \int\frac{d^{3N}p\ d^{3N}x}{(2\pi\hbar)^{3N}N!}\ e^{-H(\{\mathbf{p}_i - \frac{e}{c}\mathbf{A}(\mathbf{x}_i)\},\{\mathbf{x}_i\})/kT} . \qquad (6.1.32)$$

Making the substitution $\mathbf{p}_i' = \mathbf{p}_i - \frac{e}{c}\mathbf{A}(\mathbf{x}_i)$, we can see that $Z_{cl}$ becomes independent of $\mathbf{A}$ and thus also of $\mathbf{H}$. Then we have $\mathrm{M} = -\frac{\partial F}{\partial H} = 0$, and $\chi = -\frac{1}{V}\frac{\partial^2 F}{\partial H^2} = 0$. Since the spin is also a quantum-mechanical phenomenon, dia-, para-, and ferromagnetism are likewise quantum phenomena. One might ask how this statement can be reconciled with the 'classical' Langevin paramagnetism which will be discussed below. In the latter, a large but fixed value of the angular momentum is assumed, so that a non-classical feature is introduced into the theory. In classical physics, angular momenta, atomic radii, etc. vary continuously and without limits.$^{10}$

(ii) Here, we append the simple intermediate computations leading to (6.1.3)–(6.1.5). In (6.1.3), we need to evaluate $-c\,\frac{\delta\mathcal{H}}{\delta\mathbf{A}(\mathbf{x})}$. The first term in (6.1.1) evidently leads to the first term in (6.1.3). In the component $j_\alpha$ of the current, taking the derivative of the second term leads to

$$c\,\frac{\delta}{\delta A_\alpha(\mathbf{x})}\sum_{i=1}^{N}\boldsymbol{\mu}_i^{spin}\cdot\operatorname{curl}\mathbf{A}(\mathbf{x}_i) = c\sum_{i=1}^{N}\mu^{spin}_{i\beta}\,\epsilon_{\beta\gamma\delta}\,\frac{\delta}{\delta A_\alpha(\mathbf{x})}\frac{\partial}{\partial x_{i\gamma}}A_\delta(\mathbf{x}_i) = c\sum_{i=1}^{N}\mu^{spin}_{i\beta}\,\epsilon_{\beta\gamma\delta}\,\frac{\partial}{\partial x_{i\gamma}}\delta(\mathbf{x} - \mathbf{x}_i)\,\delta_{\alpha\delta} = c\left[\operatorname{curl}\sum_{i=1}^{N}\boldsymbol{\mu}_i^{spin}\,\delta(\mathbf{x} - \mathbf{x}_i)\right]_\alpha .$$

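Pairs of Greek indices imply a summation. Since the derivative of the third term in (6.1.1) yields zero, we have demonstrated (6.1.3).

(iii) In (6.1.4), the first term is obtained in a readily-apparent manner from the first term in (6.1.3). For the second term, we carry out an integration by parts and use $\partial_\delta x_\beta = \delta_{\delta\beta}$, obtaining

$$\frac{1}{2}\sum_{i=1}^{N}\left(\int d^3x\ \mathbf{x}\times\operatorname{curl}\left[\boldsymbol{\mu}_i^{spin}\,\delta(\mathbf{x} - \mathbf{x}_i)\right]\right)_\alpha = \frac{1}{2}\sum_{i=1}^{N}\int d^3x\ \epsilon_{\alpha\beta\gamma}\,x_\beta\,\epsilon_{\gamma\delta\rho}\,\partial_\delta\left[\mu^{spin}_{i\rho}\,\delta(\mathbf{x} - \mathbf{x}_i)\right]$$

$$= -\frac{1}{2}\sum_{i=1}^{N}\int d^3x\ \epsilon_{\alpha\beta\gamma}\,\epsilon_{\gamma\delta\rho}\,\delta_{\delta\beta}\,\mu^{spin}_{i\rho}\,\delta(\mathbf{x} - \mathbf{x}_i) = -\frac{1}{2}\sum_{i=1}^{N}\int d^3x\ (-2\delta_{\alpha\rho})\,\mu^{spin}_{i\rho}\,\delta(\mathbf{x} - \mathbf{x}_i) = \sum_{i=1}^{N}\mu^{spin}_{i\alpha} ,$$

with which we have demonstrated (6.1.4).

(iv) Finally, we show the validity of (6.1.5). We can write the vector potential of a uniform magnetic field in the form $\mathbf{A} = \frac{1}{2}\mathbf{H}\times\mathbf{x}$, since $\operatorname{curl}\mathbf{A} = \frac{1}{2}\left(\mathbf{H}\,(\nabla\cdot\mathbf{x}) - (\mathbf{H}\cdot\nabla)\,\mathbf{x}\right)$ yields $\mathbf{H}$. To obtain the derivative, we use $\frac{1}{2}\epsilon_{\sigma\alpha\tau}x_{i\tau}$ for the derivative of $A_\sigma(\mathbf{x}_i)$ with respect to $H_\alpha$ after the second equals sign below, finding

$$-\frac{\partial\mathcal{H}}{\partial H_\alpha} = -\sum_{i=1}^{N}\left[\frac{2}{2m}\left(\mathbf{p}_i - \frac{e}{c}\mathbf{A}\right)_\sigma\left(-\frac{e}{c}\right)\frac{\partial}{\partial H_\alpha}\left(\frac{1}{2}\epsilon_{\sigma\rho\tau}H_\rho x_{i\tau}\right) - \mu^{spin}_{i\alpha}\right] = \sum_{i=1}^{N}\left[\frac{e}{2mc}\left(\mathbf{x}_i\times\left(\mathbf{p}_i - \frac{e}{c}\mathbf{A}\right)\right)_\alpha + \mu^{spin}_{i\alpha}\right] , \qquad (6.1.33)$$

which is in fact the right-hand side of (6.1.4).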

A detailed discussion of this theorem and the original literature citations are to be found in J. H. van Vleck, The Theory of Electric and Magnetic Susceptibility, Oxford, University Press, 1932.


In the Hamiltonian (6.1.1), $W_{Coul}$ contains the mutual Coulomb interaction of the electrons and their interactions with the nuclei. The thermodynamic relations derived in Sect. 6.1.2 are thus generally valid; in particular, they apply to ferromagnets, since there the decisive exchange interaction is merely a consequence of the Coulomb interactions together with Fermi–Dirac statistics. In addition to the interactions included in (6.1.1), there are also the magnetic dipole interaction between magnetic moments and the spin-orbit interaction,$^{11}$ which lead among other things to anisotropy effects. The derived thermodynamic relations also hold for these more general cases, whereby the susceptibilities and specific heats become shape-dependent owing to the long-range dipole interactions. In Sect. 6.6, we will take up the effects of the dipole interactions in more detail. For elliptical samples, the internal magnetic field is uniform, $\mathbf{H}_i = \mathbf{H} - D\boldsymbol{M}$, where $D$ is the demagnetizing tensor (or simply the appropriate demagnetizing factor, if the field is applied along one of the principal axes). We will see that instead of the susceptibility with respect to the external field $\mathbf{H}$, one can employ the susceptibility with respect to the macroscopic internal field, and that this susceptibility is shape-independent.$^{12}$ In the following four sections, which deal with basic statistical-mechanical aspects, we leave the dipole interactions out of consideration; this is indeed quantitatively justified in many situations. In the next two sections, 6.2 and 6.3, we deal with the magnetic properties of noninteracting atoms and ions; these can be situated within solids. The angular momentum quantum numbers of individual atoms in their ground states are determined by Hund's rules.$^{13}$

6.2 The Diamagnetism of Atoms

We consider atoms or ions with closed electronic shells, such as for example helium and the other noble gases or the alkali halides. In this case, the quantum numbers of the orbital angular momentum and the total spin in the ground state are zero, $S = 0$ and $L = 0$, and as a result the total angular momentum $\mathbf{J} = \mathbf{L} + \mathbf{S}$ is also $J = 0$.$^{14}$ Therefore, we have $\mathbf{L}\,|0\rangle = \mathbf{S}\,|0\rangle = \mathbf{J}\,|0\rangle = 0$, where $|0\rangle$ designates the ground state. The paramagnetic contribution to the Hamiltonian (6.1.7) thus vanishes in every order of perturbation theory. It suffices to treat the remaining diamagnetic term in (6.1.7) in first-order perturbation theory, since all the excited states lie at much higher energies. Owing to the rotational symmetry of the wavefunctions of closed shells, we find $\langle 0|\sum_i (x_i^2 + y_i^2)|0\rangle = \frac{2}{3}\langle 0|\sum_i r_i^2|0\rangle$ and, for the energy shift of the ground state,

$$E_1 = \frac{e^2H^2}{12mc^2}\,\langle 0|\sum_i r_i^2|0\rangle . \qquad (6.2.1)$$

From this it follows for the magnetic moment and the susceptibility of a single atom:

$$\mu_z = -\frac{\partial E_1}{\partial H} = -\frac{e^2\langle 0|\sum_i r_i^2|0\rangle}{6mc^2}\,H , \qquad \chi = \frac{\partial\mu_z}{\partial H} = -\frac{e^2\langle 0|\sum_i r_i^2|0\rangle}{6mc^2} , \qquad (6.2.2)$$

where the sums run over all the electrons in the atom. The magnetic moment is directed oppositely to the applied field and the susceptibility is negative. We can estimate the magnitude of this so-called Langevin diamagnetism using the Bohr radius:

$$\chi = -\frac{25\times 10^{-20}\times 10^{-16}}{6\times 10^{-27}\times 10^{21}}\ \text{cm}^3 \approx -5\times 10^{-30}\ \text{cm}^3 ,$$

$$\chi\ \text{per mole} = -5\times 10^{-30}\times 6\times 10^{23}\ \frac{\text{cm}^3}{\text{mole}} \approx -3\times 10^{-6}\ \frac{\text{cm}^3}{\text{mole}} .$$

11 The spin-orbit interaction ∝ $\mathbf{L}\cdot\mathbf{S}$ leads in effective spin models to anisotropic interactions. The orbital angular momentum is influenced by the crystal field of the lattice, transferring the anisotropy of the lattice to the spin.

12 For non-elliptical samples, the magnetization is not uniform. In this case, $\frac{\partial M}{\partial H}$ depends on position within the sample and has only a local significance. It is then expedient to introduce a total susceptibility $\chi^{tot}_{T,S} = \left(\frac{\partial\mathrm{M}}{\partial H}\right)_{T,S}$, which differs from (6.1.33) in the homogeneous case only by a factor of $V$.

13 See e.g. QM I, Chap. 13 and Table I.12

14 The diamagnetic contribution is also present in other atoms, but in the magnetic fields which are available in the laboratory, it is negligible compared to the paramagnetic contribution.
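The order-of-magnitude estimate of the Langevin diamagnetism can be repeated with less heavily rounded constants. The sketch below evaluates (6.2.2) in Gaussian units, assuming a single electron with $\langle r^2\rangle = (1\ \text{Å})^2$, which is the value implied by the factor $10^{-16}$ in the estimate; the constants are assumed, rounded inputs.

```python
# Langevin diamagnetic susceptibility per atom, chi = -e^2 <sum r^2> / (6 m c^2),
# in Gaussian units. All constants are assumed, rounded input values.

E0_ESU = 4.8032e-10    # elementary charge in statcoulomb
M_E_G = 9.1094e-28     # electron mass in g
C_CM_S = 2.9979e10     # speed of light in cm/s
N_AVOGADRO = 6.022e23  # particles per mole


def chi_per_atom(sum_r_squared_cm2):
    """Return chi in cm^3 for a given <0| sum_i r_i^2 |0> in cm^2."""
    return -E0_ESU**2 * sum_r_squared_cm2 / (6.0 * M_E_G * C_CM_S**2)


# Assumption: one electron with <r^2> = (1 Angstrom)^2 = 1e-16 cm^2.
chi_atom = chi_per_atom(1e-16)
chi_mole = chi_atom * N_AVOGADRO
print(f"chi = {chi_atom:.2e} cm^3 per atom, {chi_mole:.2e} cm^3 per mole")
```

The result reproduces the rounded figures above; for a real atom the sum over all electrons and their actual mean square radii enters, which is why the measured values differ between the noble gases.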

The experimental values of the molar susceptibility of the noble gases are collected in Table 6.1.

Table 6.1. Molar susceptibilities of the noble gases

                            He      Ne      Ar       Kr       Xe
χ in 10^−6 cm^3/mole       −1.9    −7.2    −15.4    −28.0    −43.0

An intuitively apparent interpretation of this diamagnetic susceptibility runs as follows: the field H induces an additional current $\Delta j = -e r \Delta\omega$, whereby the orbital frequency of the electronic motion increases by the Larmor frequency $\Delta\omega = \frac{eH}{2mc}$. The sign of this change corresponds to Lenz's law, so that both the magnetic moment $\mu_z$ and the induced magnetic field are opposite to the applied field H:

$$\mu_z \sim \frac{r\,\Delta j}{2c} \sim -\frac{r^2\,\Delta\omega\, e}{2c} \sim -\frac{e^2 r^2 H}{4mc^2} .$$

We also note that the result (6.2.2) is proportional to the square of the Bohr radius and therefore to the fourth power of $\hbar$, confirming the quantum nature of magnetic phenomena.
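The order-of-magnitude estimate following Eq. (6.2.2) can be reproduced numerically. The following is a minimal sketch in Gaussian units; the function names and the choice $\langle 0|\sum_i r_i^2|0\rangle \approx 10^{-16}\ {\rm cm^2}$ (the round value entering the estimate above) are assumptions made for illustration, not values taken from a calculation of a specific atom.

```python
# Rounded constants in Gaussian (CGS) units.
E_CHARGE = 4.803e-10     # elementary charge in esu
M_ELECTRON = 9.109e-28   # electron mass in g
C_LIGHT = 2.998e10       # speed of light in cm/s
AVOGADRO = 6.022e23      # atoms per mole


def langevin_chi_per_atom(sum_r2_cm2):
    """Eq. (6.2.2): chi = -e^2 <sum_i r_i^2> / (6 m c^2), in cm^3 per atom."""
    return -E_CHARGE**2 * sum_r2_cm2 / (6.0 * M_ELECTRON * C_LIGHT**2)


def langevin_chi_per_mole(sum_r2_cm2):
    """Molar susceptibility in cm^3/mole for the same <sum_i r_i^2>."""
    return AVOGADRO * langevin_chi_per_atom(sum_r2_cm2)


# Assumption: <sum_i r_i^2> ~ 1e-16 cm^2, i.e. of the order of (1 Angstrom)^2.
SUM_R2 = 1.0e-16
chi_atom = langevin_chi_per_atom(SUM_R2)
chi_mole = langevin_chi_per_mole(SUM_R2)
print(f"chi per atom: {chi_atom:.2e} cm^3")
print(f"chi per mole: {chi_mole:.2e} cm^3/mole")
```

Since the susceptibility scales linearly with $\langle 0|\sum_i r_i^2|0\rangle$, the growth of the measured values from He to Xe in Table 6.1 is qualitatively what one expects for atoms with more electrons and larger shells.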


6.3 The Paramagnetism of Non-coupled Magnetic Moments

Atoms and ions with an odd number of electrons, e.g. Na, as well as atoms and ions with partially filled inner shells, e.g. Mn2+, Gd3+, or U4+ (transition elements, ions which are isoelectronic with transition elements, rare-earth and actinide elements) have nonvanishing magnetic moments even when H = 0,

$$\boldsymbol{\mu} = \frac{e}{2mc}\,(\mathbf{L} + g_e \mathbf{S}) = \frac{e}{2mc}\,(\mathbf{J} + \mathbf{S}) \qquad (g_e = 2) . \tag{6.3.1}$$

Here, J = L + S is the total angular momentum operator. For relatively low external magnetic fields (i.e. eH/mc ≪ spin-orbit coupling) with H applied along the z-axis, the theory of the Zeeman effect 15 gives the energy-level shifts

$$\Delta E_{M_J} = g \mu_B M_J H , \tag{6.3.2}$$

where $M_J$ runs over the values $M_J = -J, \ldots, J$ 16 and the Landé factor

$$g = 1 + \frac{J(J+1) + S(S+1) - L(L+1)}{2J(J+1)} \tag{6.3.3}$$

was used. Familiar special cases are L = 0: g = 2, $M_J \equiv M_S = \pm\frac{1}{2}$ and S = 0: g = 1, $M_J \equiv M_L = -L, \ldots, L$. The Landé factor can be made plausible in the classical picture where L and S precess independently around the spatially fixed direction of the constant of the motion J. Then we find:

$$(\mathbf{L} + 2\mathbf{S})_z = \frac{\mathbf{J}\cdot(\mathbf{L} + 2\mathbf{S})}{|\mathbf{J}|}\,\frac{J_z}{|\mathbf{J}|} = J_z\,\frac{\mathbf{J}^2 + \mathbf{J}\cdot\mathbf{S}}{\mathbf{J}^2} = J_z\left(1 + \frac{\mathbf{S}^2 + \frac{1}{2}\left(\mathbf{J}^2 - \mathbf{L}^2 - \mathbf{S}^2\right)}{\mathbf{J}^2}\right) .$$

The partition function then becomes

$$Z = \left(\sum_{m=-J}^{J} e^{-\eta m}\right)^{N} = \left(\frac{\sinh \eta(2J+1)/2}{\sinh \eta/2}\right)^{N} , \tag{6.3.4}$$

with the abbreviation

$$\eta = \frac{g\mu_B H}{kT} . \tag{6.3.5}$$

Here, we have used the fact that

$$\sum_{m=-J}^{J} e^{-\eta m} = e^{-\eta J}\sum_{r=0}^{2J} e^{\eta r} = e^{-\eta J}\,\frac{e^{\eta(2J+1)} - 1}{e^{\eta} - 1} = \frac{\sinh \eta(2J+1)/2}{\sinh \eta/2} .$$

15 Cf. e.g. QM I, Sect. 14.2
16 $\mathbf{J} = \mathbf{L} + \mathbf{S}$, $J_z |m_j\rangle = m_j |m_j\rangle$

For the free energy, we find from (6.3.4)

$$F(T, H) = -kTN \log\left[\frac{\sinh \eta(2J+1)/2}{\sinh \eta/2}\right] , \tag{6.3.6}$$

from which we obtain the magnetization

$$M = -\frac{1}{V}\frac{\partial F}{\partial H} = n g \mu_B J B_J(\eta) \tag{6.3.7}$$

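(n = N/V). The magnetization is oriented parallel to the magnetic field H. In Eq. (6.3.7) we have introduced the Brillouin function $B_J$, which is defined as

$$B_J(\eta) = \frac{1}{J}\left[\left(J + \frac{1}{2}\right)\coth \eta\left(J + \frac{1}{2}\right) - \frac{1}{2}\coth\frac{\eta}{2}\right] \tag{6.3.8}$$

(Fig. 6.1). We now consider the asymptotic limiting cases:

$$\eta \to 0: \quad \coth\eta = \frac{1}{\eta} + \frac{\eta}{3} + O(\eta^3) , \qquad B_J(\eta) = \frac{J+1}{3}\,\eta + O(\eta^3) \tag{6.3.9a}$$

and

$$\eta \to \infty: \quad B_J(\infty) = 1 . \tag{6.3.9b}$$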

Fig. 6.1. The Brillouin function for J = 1/2, 1, 3/2, 2, ∞ as a function of $x = \frac{g\mu_B J H}{kT} = \eta J$. For classical moments, $B_\infty$ is identical to the Langevin function
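The limiting cases (6.3.9a) and (6.3.9b), as well as the curves sketched in Fig. 6.1, can be checked with a few lines of code. The following is a minimal sketch; the function names `brillouin` and `langevin` are chosen here for illustration, and the direct evaluation is not intended for extremely small η, where the two coth terms nearly cancel.

```python
import math


def coth(x):
    """Hyperbolic cotangent, written via tanh so that large x does not overflow."""
    return 1.0 / math.tanh(x)


def brillouin(J, eta):
    """Brillouin function B_J(eta) of Eq. (6.3.8), with eta = g mu_B H / kT."""
    if eta == 0.0:
        return 0.0
    a = J + 0.5
    return (a * coth(a * eta) - 0.5 * coth(0.5 * eta)) / J


def langevin(x):
    """Langevin function coth(x) - 1/x, the classical limit J -> infinity."""
    return coth(x) - 1.0 / x


# Tabulate B_J as a function of x = eta * J, the variable used in Fig. 6.1.
for J in (0.5, 1.0, 1.5, 2.0):
    row = [brillouin(J, x / J) for x in (0.5, 1.0, 2.0, 4.0)]
    print(J, [round(value, 4) for value in row])
print("Langevin", [round(langevin(x), 4) for x in (0.5, 1.0, 2.0, 4.0)])
```

For J = 1/2 the function reduces to tanh(η/2), and for large J at fixed x = ηJ it approaches the Langevin function, in line with the caption of Fig. 6.1.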


Inserting (6.3.9a) into (6.3.7), we obtain for low applied fields ($H \ll kT/Jg\mu_B$)

$$M = n\,(g\mu_B)^2\,\frac{J(J+1)H}{3kT} , \tag{6.3.10a}$$

while from (6.3.9b), for high fields ($H \gg kT/Jg\mu_B$), we find

$$M = n g \mu_B J . \tag{6.3.10b}$$

This signifies complete alignment (saturation) of the magnetic moments. An important special case is represented by spin-1/2 systems. Setting J = 1/2 in (6.3.8), we find

$$B_{1/2}(\eta) = 2\coth\eta - \coth\frac{\eta}{2} = \frac{\cosh^2\frac{\eta}{2} + \sinh^2\frac{\eta}{2}}{\sinh\frac{\eta}{2}\cosh\frac{\eta}{2}} - \frac{\cosh\frac{\eta}{2}}{\sinh\frac{\eta}{2}} = \tanh\frac{\eta}{2} . \tag{6.3.11}$$

This result can be more directly obtained by using the fact that for spin S = 1/2, the partition function of a spin is given by $Z = 2\cosh \eta/2$ and the average value of the magnetization by $M = n g \mu_B Z^{-1}\sinh \eta/2$. Letting J = ∞, while at the same time $g\mu_B \to 0$, so that $\mu = g\mu_B J$ remains finite, we find

$$B_\infty(\eta) = \coth \eta J - \frac{1}{\eta J} = \coth\frac{\mu H}{kT} - \frac{kT}{\mu H} . \tag{6.3.12a}$$

$B_\infty(\eta)$ is called the Langevin function for classical magnetic moments μ; together with (6.3.7), it determines the magnetization

$$M = n\mu\left(\coth\frac{\mu H}{kT} - \frac{kT}{\mu H}\right) \tag{6.3.12b}$$

of "classical" magnetic moments of magnitude μ. A classical magnetic moment μ can be oriented in any direction in space; its energy is $E = -\mu H\cos\vartheta$, where ϑ is the angle between the field H and the magnetic moment μ. The classical partition function for one particle is $Z = \int d\Omega\, e^{-E/kT}$ and leads via (6.1.9b) once again to (6.3.12b). Finally, for the susceptibility we obtain

$$\chi = n\,(g\mu_B)^2\,\frac{J}{kT}\,B_J'(\eta) . \tag{6.3.13}$$

In small magnetic fields $H \ll kT/Jg\mu_B$, this gives the Curie law

$$\chi_{\rm Curie} = n\,(g\mu_B)^2\,\frac{J(J+1)}{3kT} . \tag{6.3.14}$$
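The statement that the classical partition function $Z = \int d\Omega\, e^{-E/kT}$ leads back to the Langevin function can be verified numerically. The sketch below integrates over $u = \cos\vartheta$ with a simple midpoint rule; the function names and the number of integration points are assumptions made for this illustration.

```python
import math


def mean_cos_theta(x, n=20000):
    """<cos(theta)> for E = -mu H cos(theta), with x = mu H / kT.

    The solid-angle integral reduces to an integral over u = cos(theta)
    from -1 to 1, evaluated here with the midpoint rule.
    """
    du = 2.0 / n
    numerator = 0.0
    denominator = 0.0
    for i in range(n):
        u = -1.0 + (i + 0.5) * du
        weight = math.exp(x * u)   # Boltzmann factor exp(-E/kT)
        numerator += u * weight
        denominator += weight
    return numerator / denominator


def langevin(x):
    """Closed form coth(x) - 1/x from Eq. (6.3.12a)."""
    return 1.0 / math.tanh(x) - 1.0 / x


for x in (0.1, 1.0, 5.0):
    print(x, mean_cos_theta(x), langevin(x))
```

For small x the ratio $\langle\cos\vartheta\rangle / x$ tends to 1/3, which is the classical counterpart of the Curie law (6.3.14) with $(g\mu_B)^2 J(J+1)$ replaced by $\mu^2$.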

The magnetic behavior of non-coupled moments characterized by (6.3.7), (6.3.13), and (6.3.14) is termed paramagnetism. The Curie law is typical of preexisting elementary magnetic moments which need only be oriented by the applied field, in contrast to the polarization of harmonic oscillators, whose moments are induced by the field (cf. problem 6.4). We include a remark about the magnitudes. The diamagnetic susceptibility per mole, from the estimate which follows Eq. (6.2.2), is equal to about $\chi_{\rm mole} \approx -10^{-5}\ {\rm cm^3/mole}$. The paramagnetic susceptibility at room temperature is roughly 500 times larger, i.e. $\chi_{\rm mole} \approx 10^{-2}$–$10^{-3}\ {\rm cm^3/mole}$. The entropy of a paramagnet is

$$S = -\left(\frac{\partial F}{\partial T}\right)_H = Nk\left(\log\left[\frac{\sinh\frac{\eta(2J+1)}{2}}{\sinh\frac{\eta}{2}}\right] - \eta J B_J(\eta)\right) . \tag{6.3.15}$$

For spin 1/2, (6.3.15) simplifies to

$$S = Nk\left[\log\left(2\cosh\frac{\mu_B H}{kT}\right) - \frac{\mu_B H}{kT}\tanh\frac{\mu_B H}{kT}\right] \tag{6.3.16}$$

with the limiting case

$$S = Nk\log 2 \quad \text{for} \quad H \to 0 . \tag{6.3.16$'$}$$

Fig. 6.2. The entropy, the internal energy, and the specific heat of a spin-1/2 paramagnet

The entropy, the internal energy, and the speciﬁc heat of the paramagnet are reproduced in Figs. 6.2a,b,c. The bump in the speciﬁc heat is typical of 2-level systems and is called a Schottky anomaly in connection with defects.
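The bump in the specific heat can be located explicitly. From (6.3.16) one finds, with $x = \mu_B H/kT$, the specific heat at constant field $C_H = T(\partial S/\partial T)_H = Nk\,x^2/\cosh^2 x$, whose maximum satisfies $x\tanh x = 1$. The following sketch evaluates these expressions; the function names and the bisection bracket are assumptions made for illustration.

```python
import math


def entropy_per_spin(x):
    """S/(N k) from Eq. (6.3.16), with x = mu_B H / (k T)."""
    return math.log(2.0 * math.cosh(x)) - x * math.tanh(x)


def specific_heat_per_spin(x):
    """C_H/(N k) = x^2 / cosh^2(x), obtained from T dS/dT at fixed H."""
    return (x / math.cosh(x)) ** 2


def schottky_peak(lo=0.5, hi=3.0, tol=1e-12):
    """Position of the specific-heat maximum: root of x tanh(x) - 1 by bisection."""
    def g(x):
        return x * math.tanh(x) - 1.0

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


x_peak = schottky_peak()
print("peak at mu_B H / kT =", x_peak)
print("maximum C_H / (N k)  =", specific_heat_per_spin(x_peak))
```

In terms of the level splitting $2\mu_B H$, the maximum lies at a temperature of roughly 0.4 times the splitting divided by k, which is the familiar position of a Schottky anomaly for a two-level system.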


Van Vleck paramagnetism: The quantum number of the total angular momentum also becomes zero, J = 0, when a shell has just one electron less than half full. In this case, according to Eq. (6.3.2) we have indeed $\langle 0|\,\mathbf{J} + \mathbf{S}\,|0\rangle = 0$, but the paramagnetic term in (6.1.7) yields a nonzero contribution in second order perturbation theory. Together with the diamagnetic term, one obtains for the energy shift of the ground state

$$\Delta E_0 = -\sum_n \frac{|\langle 0|\,(\mathbf{L} + 2\mathbf{S})\cdot\mathbf{H}\,|n\rangle|^2}{E_n - E_0} + \frac{e^2 H^2}{8mc^2}\sum_i \langle 0|\,(x_i^2 + y_i^2)\,|0\rangle . \tag{6.3.17}$$

The first, paramagnetic term, named for van Vleck 17, which also plays a role in the magnetism of molecules 18, competes with the diamagnetic term.

6.4 Pauli Spin Paramagnetism

We consider now a free, three-dimensional electron gas in a magnetic field and restrict ourselves initially to the coupling of the magnetic field to the electron spins. The energy eigenvalues are then given by Eq. (6.1.7):

$$\epsilon_{\mathbf{p}\pm} = \frac{p^2}{2m} \pm \frac{1}{2}\,g_e\mu_B H . \tag{6.4.1}$$

The energy levels are split by the magnetic field. Electrons whose spins are aligned parallel to the field have higher energies, and these states are therefore less occupied (see Fig. 6.3).

Fig. 6.3. Orientation (a) of the spins, and (b) of the magnetic moments. (c) The energy as a function of p (on the left for positive spins and on the right for negative spins)

17 J. H. van Vleck, The Theory of Magnetic and Electric Susceptibilities, Oxford University Press, 1932.
18 Ch. Kittel, Introduction to Solid State Physics, Third edition, John Wiley, New York, 1967


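The number of electrons in the two states is found to be

$$N_\pm = \frac{V}{(2\pi\hbar)^3}\int d^3p\; n\!\left(\frac{p^2}{2m} \pm \frac{1}{2}g_e\mu_B H\right) = \frac{1}{2}\int_0^\infty d\epsilon\,\nu(\epsilon)\, n\!\left(\epsilon \pm \frac{1}{2}g_e\mu_B H\right) , \tag{6.4.2}$$

where the density of states has been introduced:

$$\nu(\epsilon) = \frac{gV}{(2\pi\hbar)^3}\int d^3p\; \delta(\epsilon - \epsilon_{\mathbf{p}}) = N\,\frac{3}{2}\,\frac{\epsilon^{1/2}}{\epsilon_F^{3/2}} ; \tag{6.4.3}$$

it fulfills the normalization condition $\int_0^{\epsilon_F} d\epsilon\,\nu(\epsilon) = N$. In the case that $g_e\mu_B H \ll \mu \approx \epsilon_F$, we can expand in terms of H:

$$N_\pm = \frac{1}{2}\int_0^\infty d\epsilon\,\nu(\epsilon)\left[n(\epsilon) \pm n'(\epsilon)\,\frac{1}{2}g_e\mu_B H + O(H^2)\right] . \tag{6.4.4}$$

For the magnetization, using the above result we obtain:

$$M = -\mu_B (N_+ - N_-)/V = -\frac{\mu_B^2 H}{V}\int_0^\infty d\epsilon\,\nu(\epsilon)\, n'(\epsilon) + O(H^3) , \tag{6.4.5}$$

where we have set $g_e = 2$. For T → 0, we find from (6.4.5) the magnetization

$$M = \mu_B^2\,\nu(\epsilon_F) H/V + O(H^3) = \frac{3}{2}\,\mu_B^2\,\frac{N H}{V\epsilon_F} + O(H^3) \tag{6.4.6}$$

and the magnetic susceptibility

$$\chi_P = \frac{3}{2}\,\mu_B^2\,\frac{N}{V\epsilon_F} + O(H^2) . \tag{6.4.7}$$

This result describes the phenomenon of Pauli spin paramagnetism.

Supplementary remarks:

(i) For T ≠ 0, we must take the change of the chemical potential into account, making use of the Sommerfeld expansion:

$$N = \int_0^\infty d\epsilon\,\nu(\epsilon) n(\epsilon) + O(H^2) = \int_0^{\mu} d\epsilon\,\nu(\epsilon) + \frac{\pi^2 (kT)^2}{6}\,\nu'(\mu) + O(H^2, T^4)$$

$$\phantom{N} = \int_0^{\epsilon_F} d\epsilon\,\nu(\epsilon) + (\mu - \epsilon_F)\,\nu(\epsilon_F) + \frac{\pi^2 (kT)^2}{6}\,\nu'(\epsilon_F) + O(H^2, T^4) . \tag{6.4.8}$$

Since the first term on the right-hand side is equal to N, we write

$$\mu - \epsilon_F = -\frac{\pi^2 (kT)^2}{6}\,\frac{\nu'(\epsilon_F)}{\nu(\epsilon_F)} + O(H^2, T^4) . \tag{6.4.9}$$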


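Integrating by parts, we obtain from (6.4.5) and (6.4.9)

$$M = \frac{\mu_B^2 H}{V}\int_0^\infty d\epsilon\,\nu'(\epsilon)\, n(\epsilon) + O(H^3)$$

$$\phantom{M} = \frac{\mu_B^2 H}{V}\left[\nu(\mu) + \frac{\pi^2 (kT)^2}{6}\,\nu''(\mu)\right] + O(H^3, T^4) \tag{6.4.10}$$

$$\phantom{M} = \frac{\mu_B^2 H}{V}\left[\nu(\epsilon_F) - \frac{\pi^2 (kT)^2}{6}\left(\frac{\nu'(\epsilon_F)^2}{\nu(\epsilon_F)} - \nu''(\epsilon_F)\right)\right] + O(H^3, T^4) .$$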

(ii) The Pauli susceptibility (6.4.7) can be interpreted similarly to the linear specific heat of a Fermi gas (see Sect. 4.3.2):

$$\chi_P = \chi_{\rm Curie}\,\frac{\nu(\epsilon_F)\,kT}{N} = \mu_B^2\,\nu(\epsilon_F)/V . \tag{6.4.11}$$

Naively, one might expect that the susceptibility of N electrons would be equal to the Curie susceptibility $\chi_{\rm Curie}$ from Eq. (6.3.14) and therefore would diverge as 1/T. It was Pauli's accomplishment to realize that not all of the electrons contribute, but instead only those near the Fermi energy. The number of thermally excitable electrons is $kT\,\nu(\epsilon_F)$.

(iii) The Landau quasiparticle interaction (see Sect. 4.3.3e, Eq. 4.3.29c) yields

$$\chi_P = \frac{\mu_B^2\,\nu(\epsilon_F)}{V(1 + F_a)} . \tag{6.4.12}$$

In this expression, $F_a$ is an antisymmetric combination of the interaction parameters.19

(iv) In addition to Pauli spin paramagnetism, the electronic orbital motions give rise to Landau diamagnetism 20

$$\chi_L = -\frac{e^2 k_F}{12\pi^2 mc^2} . \tag{6.4.13}$$

For a free electron gas, $\chi_L = -\frac{1}{3}\chi_P$. The lattice effects in a crystal have differing consequences for $\chi_L$ and $\chi_P$. Eq. (6.4.13) holds for free electrons neglecting the Zeeman term. The magnetic susceptibility for free spin-1/2 fermions is composed of three parts: it is the sum $\chi = \chi_P + \chi_L + \chi_{\rm Osc}$.

19 D. Pines and Ph. Nozières, The Theory of Quantum Liquids Vol. I: Normal Fermi Liquids, W. A. Benjamin, New York 1966, p. 25
20 See e.g. D. Wagner, Introduction to the Theory of Magnetism, Pergamon Press, Oxford, 1972.


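$\chi_{\rm Osc}$ is an oscillatory part, which becomes important at high magnetic fields H and is responsible for de Haas–van Alphen oscillations.

(v) Fig. 6.3c can also be read differently from the description given above. If one introduces the densities of states for spin $\pm\hbar/2$

$$\nu_\pm(\epsilon) = \frac{V}{(2\pi\hbar)^3}\int d^3p\; \delta(\epsilon - \epsilon_{\mathbf{p}\pm}) = \frac{V}{(2\pi\hbar)^3}\int d^3p\; \delta\!\left(\epsilon - \left(\frac{p^2}{2m} \pm \frac{1}{2}g_e\mu_B H\right)\right)$$

$$\phantom{\nu_\pm(\epsilon)} = \frac{mV}{2\pi^2\hbar^3}\int_0^\infty dp\; p\;\Theta\!\left(\epsilon \mp \frac{1}{2}g_e\mu_B H\right)\delta\!\left(p - \sqrt{2m\left(\epsilon \mp \frac{1}{2}g_e\mu_B H\right)}\right)$$

$$\phantom{\nu_\pm(\epsilon)} = N\,\frac{3}{4\,\epsilon_F^{3/2}}\;\Theta\!\left(\epsilon \mp \frac{1}{2}g_e\mu_B H\right)\left(\epsilon \mp \frac{1}{2}g_e\mu_B H\right)^{1/2} ,$$

then the solid curves which are drawn on the left and the right also refer to $\nu_+(\epsilon)$ and $\nu_-(\epsilon)$.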
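The linear response (6.4.6) can be illustrated with the shifted densities of states $\nu_\pm(\epsilon)$. The following is a minimal sketch at T = 0 in the dimensionless variable $b = \mu_B H/\epsilon_F$. It assumes that the chemical potential is held fixed at $\epsilon_F$ (so that the particle number is conserved only up to terms of order $H^2$) and uses function names chosen for this illustration.

```python
def occupation_fractions(b):
    """N_+/N and N_-/N at T = 0 for g_e = 2, with b = mu_B H / eps_F (0 <= b < 1).

    Filling nu_+/- up to eps_F gives (1/2) (1 -/+ b)^(3/2); the chemical
    potential is kept at eps_F, which is an assumption of this sketch.
    """
    n_plus = 0.5 * (1.0 - b) ** 1.5
    n_minus = 0.5 * (1.0 + b) ** 1.5
    return n_plus, n_minus


def reduced_magnetization(b):
    """M / (n mu_B) = -(N_+ - N_-) / N, cf. Eq. (6.4.5)."""
    n_plus, n_minus = occupation_fractions(b)
    return n_minus - n_plus


def pauli_linear(b):
    """Linear term of Eq. (6.4.6): M / (n mu_B) = (3/2) mu_B H / eps_F."""
    return 1.5 * b


def pauli_to_curie_ratio(kT_over_eF):
    """chi_P / chi_Curie = nu(eps_F) k T / N = (3/2) k T / eps_F, cf. Eq. (6.4.11)."""
    return 1.5 * kT_over_eF


for b in (1e-3, 1e-2, 1e-1):
    print(b, reduced_magnetization(b), pauli_linear(b))
```

The deviation from the linear law is of third order in b, consistent with the $O(H^3)$ correction in (6.4.6), and the ratio to the Curie susceptibility is small as long as $kT \ll \epsilon_F$.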

6.5 Ferromagnetism

6.5.1 The Exchange Interaction

Ferromagnetism and antiferromagnetism are based on variations of the exchange interaction, which is a consequence of the Pauli principle and the Coulomb interaction (cf. the remark following Eq. (6.1.33)). In the simplest case of the exchange interaction of two electrons, two atoms or two molecules with the spins $\mathbf{S}_1$ and $\mathbf{S}_2$, the interaction has the form $\pm J\,\mathbf{S}_1\cdot\mathbf{S}_2$, where J is a positive constant which depends on the distance between the spins. The exchange constant ±J is determined by the overlap integrals, containing the Coulomb interaction.21 When the exchange energy is negative,

$$E = -J\,\mathbf{S}_1\cdot\mathbf{S}_2 , \tag{6.5.1a}$$

then a parallel spin orientation is favored. This leads in a solid to ferromagnetism (Fig. 6.4b); then below the Curie temperature $T_c$, a spontaneous magnetization occurs within the solid. When the exchange energy is positive,

$$E = J\,\mathbf{S}_1\cdot\mathbf{S}_2 , \tag{6.5.1b}$$

then an antiparallel spin orientation is preferred. In a suitable lattice structure, this can lead to an antiferromagnetic state: below the Néel temperature $T_N$, an alternating (staggered) magnetic order occurs (Fig. 6.4c). Above the respective transition temperature ($T_C$ or $T_N$), a paramagnetic state occurs (Fig. 6.4a). The exchange interaction is, to be sure, short-ranged; but owing to its electrostatic origin it is in general considerably stronger than the dipole-dipole interaction. Examples of ferromagnetic materials are Fe, Ni, EuO; and typical antiferromagnetic materials are MnF2 and RbMnF3.

Fig. 6.4. A crystal lattice of magnetic ions. The spin $\mathbf{S}_l$ is located at the position $\mathbf{x}_l$, and l denumerates the lattice sites. (a) the paramagnetic state; (b) the ferromagnetic state; (c) the antiferromagnetic state.

21 See Chaps. 13 and 15, QM I

In the rest of this section, we turn to the situation described by equation (6.5.1a), i.e. to ferromagnetism, and return to (6.5.1b) only in the discussion of phase transitions. We now imagine that the magnetic ions are located on a simple cubic lattice with lattice constant a, and that a negative exchange interaction (J > 0) acts between them (Fig. 6.4a). The lattice sites are enumerated by the index l. The position of the lth ion is denoted by $\mathbf{x}_l$ and its spin is $\mathbf{S}_l$. All the pairwise interaction energies of the form (6.5.1a) contribute to the total Hamiltonian 22 :

$$H = -\frac{1}{2}\sum_{l,l'} J_{ll'}\,\mathbf{S}_l\cdot\mathbf{S}_{l'} . \tag{6.5.2}$$

Here, we have denoted the exchange interaction between the spins at the lattice sites l and l′ by $J_{ll'}$. The sum runs over all l and l′, whereby the factor 1/2 guarantees that each pair of spins is counted only once in (6.5.2). The exchange interaction obeys $J_{ll'} = J_{l'l}$, and we set $J_{ll} = 0$ so that we do not need to exclude the occurrence of the same l-values in the sum. The Hamiltonian (6.5.2) represents the Heisenberg model 23. Since only scalar products of spin vectors occur, it has the following important property: H is invariant with respect to a common rotation of all the spin vectors. No direction is especially distinguished and therefore the ferromagnetic order which can occur may point in any arbitrary direction. Which direction is in fact chosen by the system is determined by small anisotropy energies or by an external magnetic field. In many substances, this rotational invariance is nearly ideally realized, e.g. in EuO, EuS, Fe and in the antiferromagnet RbMnF3. In other cases, the anisotropy of the crystal structure may have the effect that the magnetic moments orient in only two directions, e.g. along the positive and negative z-axis, instead of in an arbitrary spatial direction. This situation can be described by the Ising model

$$H = -\frac{1}{2}\sum_{l,l'} J_{ll'}\, S_l^z S_{l'}^z . \tag{6.5.3}$$

This model is considerably simpler than the Heisenberg model (6.5.2), since the Hamiltonian is diagonal in the spin eigenstates of $S_l^z$. But even for (6.5.3), the evaluation of the partition function is in general not trivial. As we shall see, the one-dimensional Ising model can be solved exactly in an elementary way for an interaction restricted to the nearest neighbors. The solution of the two-dimensional model, i.e. the calculation of the partition function, requires special algebraic or graph-theoretical methods, and in three dimensions the model has yet to be solved exactly. When the lattice contains N sites, then the partition function $Z = {\rm Tr}\, e^{-\beta H}$ has contributions from all together $2^N$ configurations (every spin can take on the two values $\pm\hbar/2$ independently of all the others). A naive summation over all these configurations is possible even for the Ising model only in one dimension. In order to understand the essential physical effects which accompany ferromagnetism, in the next section we will apply the molecular field approximation. It can be carried out for all problems related to ordering. We will demonstrate it using the Ising model as an example.

22 In fact, there are also interactions within a solid between more than just two spins, which we however neglect here.
23 The direct exchange described above occurs only when the moments are near enough so that their wavefunctions overlap. More frequently, one finds an indirect exchange, which couples more distant moments. The latter acts via an intermediate link, which can be a quasi-free electron in a metal or a bound electron in an insulator. The resulting interaction is called in the first case the RKKY (Rudermann, Kittel, Kasuya, Yosida) interaction and in the second, it is referred to as superexchange (see e.g. C. M. Hurd, Contemp. Phys. 23, 469 (1982)). Also in cases where direct exchange is not predominant and even for itinerant magnets (with 3d and 4s electrons which are not localized, but instead form bands), the magnetic phenomena, in particular their behavior near the phase transition, can be described using an effective Heisenberg model. A derivation of the Heisenberg model from the Hubbard model can be found in D. C. Mattis, The Theory of Magnetism, Harper and Row, New York, 1965.

6.5.2 The Molecular Field Approximation for the Ising Model

We consider the Hamiltonian of the Ising model in an external magnetic field

$$H = -\frac{1}{2}\sum_{l,l'} J(l - l')\,\sigma_l\sigma_{l'} - h\sum_l \sigma_l . \tag{6.5.4}$$


In comparison to (6.5.3), equation (6.5.4) contains some changes of notation. Instead of the spin operators $S_l^z$, we have introduced the Pauli spin matrices $\sigma_l^z$ and use the eigenstates of the $\sigma_l^z$ as basis functions; their eigenvalues are $\sigma_l = \pm 1$ for every l. The Hamiltonian becomes simply a function of (commuting) numbers. By writing the exchange interaction in the form $J(l - l')$ ($J(l - l') = J(l' - l) = J_{ll'}\hbar^2/4$, $J(0) = 0$), we express the fact that the system is translationally invariant, i.e. $J(l - l')$ depends only on the distance between the lattice sites. The effect of an applied magnetic field is represented by the term $-h\sum_l \sigma_l$. The factor $-\frac{1}{2}g\mu_B$ has been combined with the magnetic field H into $h = -\frac{1}{2}g\mu_B H$; the sign convention for h is chosen so that the $\sigma_l$ are aligned parallel to it. Due to the translational invariance of the Hamiltonian, it proves to be expedient for later use to introduce the Fourier transform of the exchange coupling,

$$\tilde{J}(\mathbf{k}) = \sum_l J(l)\, e^{-i\mathbf{k}\cdot\mathbf{x}_l} . \tag{6.5.5}$$

Frequently, we will require $\tilde{J}(\mathbf{k})$ for small wavenumbers k. Due to the finite range of $J(l - l')$, we can expand the exponential functions in (6.5.5)

$$\tilde{J}(\mathbf{k}) = \sum_l J(l) - \frac{1}{2}\sum_l (\mathbf{k}\cdot\mathbf{x}_l)^2 J(l) + \ldots . \tag{6.5.5$'$}$$

For cubic and square lattices, and in general when reflection symmetry is present, the linear term in k makes no contribution. We can interpret the Hamiltonian (6.5.4) in the following manner: for some configuration of all the spins $\sigma_{l'}$, a local field

$$h_l = h + \sum_{l'} J(l - l')\,\sigma_{l'} \tag{6.5.6}$$

acts on an arbitrarily chosen spin $\sigma_l$. If $h_l$ were a fixed applied field, we could immediately write down the partition function for the spin $\sigma_l$. Here, however, the field $h_l$ depends on the configuration of the spins and the value of $\sigma_l$ itself enters into the local fields which act upon its neighbors. In order to avoid this difficulty by means of an approximation, we replace the local field (6.5.6) by its average value, i.e. by the mean field

$$\langle h_l\rangle = h + \sum_{l'} J(l - l')\,\langle\sigma_{l'}\rangle = h + \tilde{J}(0)\, m . \tag{6.5.7}$$

In the second part of this equation, we have introduced the average value

$$m = \langle\sigma_l\rangle , \tag{6.5.8}$$


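which is position-independent, owing to the translational invariance of the Hamiltonian; thus m refers to the average magnetization per lattice site (per spin). Furthermore, we use the abbreviation

$$\tilde{J} \equiv \tilde{J}(0) \equiv \sum_l J(l) \tag{6.5.9}$$

for the Fourier transform at k = 0 (see (6.5.5′)). Eq. (6.5.7) contains, in addition to the external field, the molecular field $\tilde{J}m$. The density matrix then has the simplified form

$$\rho \propto \prod_l e^{\sigma_l (h + \tilde{J}m)/kT} .$$

Formally, we have reduced the problem to that of a paramagnet, where the molecular field must still be determined self-consistently from the magnetization (6.5.8). We still want to derive the molecular field approximation, justified above with intuitive arguments, in a more formal manner. We start with an arbitrary interaction term in (6.5.4), $-J(l - l')\,\sigma_l\sigma_{l'}$, and rewrite it up to a prefactor as follows:

$$\sigma_l\sigma_{l'} = \big(\langle\sigma_l\rangle + \sigma_l - \langle\sigma_l\rangle\big)\big(\langle\sigma_{l'}\rangle + \sigma_{l'} - \langle\sigma_{l'}\rangle\big)$$

$$\phantom{\sigma_l\sigma_{l'}} = \langle\sigma_l\rangle\langle\sigma_{l'}\rangle + \langle\sigma_l\rangle\big(\sigma_{l'} - \langle\sigma_{l'}\rangle\big) + \langle\sigma_{l'}\rangle\big(\sigma_l - \langle\sigma_l\rangle\big) + \big(\sigma_l - \langle\sigma_l\rangle\big)\big(\sigma_{l'} - \langle\sigma_{l'}\rangle\big) . \tag{6.5.10}$$

Here, we have ordered the terms in powers of the deviation from the mean value. We now neglect terms which are nonlinear in these fluctuations. This yields the following approximate replacements:

$$\sigma_l\sigma_{l'} \to -\langle\sigma_l\rangle\langle\sigma_{l'}\rangle + \sigma_l\langle\sigma_{l'}\rangle + \langle\sigma_l\rangle\sigma_{l'} , \tag{6.5.10$'$}$$

which lead from (6.5.4) to the Hamiltonian in the molecular field approximation

$$H_{\rm MFT} = \frac{1}{2}\, m^2 N \tilde{J}(0) - \sum_l \sigma_l\big(h + \tilde{J}(0)\, m\big) . \tag{6.5.11}$$

We refer to the Remarks for comments about the validity and admissibility of this approximation. With the simplified Hamiltonian (6.5.11), we obtain the density matrix

$$\rho_{\rm MFT} = Z_{\rm MFT}^{-1}\, e^{\beta\left[\sum_l \sigma_l (h + \tilde{J}m) - \frac{1}{2} m^2 \tilde{J} N\right]} \tag{6.5.12}$$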


and the partition function

$$Z_{\rm MFT} = {\rm Tr}\; e^{\beta\left[\sum_l \sigma_l (h + \tilde{J}m) - \frac{1}{2} m^2 \tilde{J} N\right]} = \left(\prod_l \sum_{\sigma_l = \pm 1} e^{\beta\sigma_l (h + \tilde{J}m)}\right) e^{-\frac{1}{2}\beta m^2 \tilde{J} N} \tag{6.5.13}$$

in the molecular field approximation, where ${\rm Tr} \equiv \sum_{\{\sigma_l = \pm 1\}}$. We thus find for (6.5.13)

$$Z_{\rm MFT} = \left[e^{-\frac{1}{2}\beta m^2 \tilde{J}}\; 2\cosh\beta\big(h + \tilde{J}m\big)\right]^N . \tag{6.5.13$'$}$$

Using $m = \frac{1}{N}\, kT\, \frac{\partial}{\partial h}\log Z_{\rm MFT}$, we obtain the equation of state in the molecular field approximation:

$$m = \tanh\beta(\tilde{J}m + h) , \tag{6.5.14}$$

which is an implicit equation for m. Compared to the equation of state of a paramagnet, the field h is amplified by the internal molecular field $\tilde{J}m$. As we shall see later, (6.5.14) can be solved analytically for h. It is however instructive to solve (6.5.14) first for limiting cases. To do this, it will prove expedient to introduce the following abbreviations:

$$T_c = \frac{\tilde{J}}{k} \quad \text{and} \quad \tau = \frac{T - T_c}{T_c} . \tag{6.5.15}$$

We will immediately see that $T_c$ has the significance of the transition temperature, the Curie temperature. Above $T_c$, the magnetization is zero in the absence of an applied field; below this temperature, it increases continuously with decreasing temperature to a finite value. We first determine the behavior in the neighborhood of $T_c$, where we can expand in terms of τ, h and m.

a) h = 0: For zero applied field and in the vicinity of $T_c$, (6.5.14) can be expanded in a Taylor series,

$$m = \tanh\beta\tilde{J}m = \frac{T_c}{T}\, m - \frac{1}{3}\left(\frac{T_c}{T}\, m\right)^3 + \ldots \tag{6.5.16}$$

which can be cut off at the third order so as to retain the leading term of the solution. The solutions of (6.5.16) are

$$m = 0 \quad \text{for} \quad T > T_c \tag{6.5.17a}$$

and

$$m = \pm m_0 , \qquad m_0 = \sqrt{3}\,(-\tau)^{1/2} \quad \text{for} \quad T < T_c . \tag{6.5.17b}$$


The ﬁrst solution, m = 0, is found for all temperatures, the second only for T ≤ Tc , i.e. τ ≤ 0. Since the free energy of the second solution is smaller (see below and in Fig. 6.9), it is the stable solution below Tc . From these considerations we ﬁnd the temperature ranges given in (6.5.17). For T ≤ Tc , the spontaneous magnetization, denoted as m0 , is observed (6.5.17b); it follows a square-root law (Fig. 6.5). This quantity is called the order parameter of the ferromagnetic phase transition.

Fig. 6.5. The spontaneous magnetization (solid curve), and the magnetization in an applied ﬁeld (dashed). The spontaneous magnetization in the Ising model has two possible orientations, +m0 or −m0

b) h and τ nonzero: for small h and τ and thus small m, the expansion of (6.5.14)

$$m\left(1 - \frac{T_c}{T}\right) = \frac{h}{kT} - \frac{1}{3}\left(\frac{h}{kT} + \frac{T_c}{T}\, m\right)^3 + \ldots ,$$

leads to the magnetic equation of state

$$\frac{h}{kT_c} = \tau m + \frac{1}{3}\, m^3 \tag{6.5.18}$$

in the neighborhood of $T_c$. An applied magnetic field produces a finite magnetization even above $T_c$ and leads qualitatively to the dashed curve in Fig. 6.5.

c) τ = 0: exactly at $T_c$, we find from (6.5.18) the critical isotherm:

$$m = \left(\frac{3h}{kT_c}\right)^{1/3} , \qquad h \sim m^3 . \tag{6.5.19}$$

d) Susceptibility for small τ: we now compute the isothermal magnetic susceptibility $\chi = \left(\frac{\partial m}{\partial h}\right)_T$, by differentiating the equation of state (6.5.18) with respect to h

$$\frac{1}{kT_c} = \tau\chi + m^2\chi . \tag{6.5.20}$$

In the limit h → 0, we can insert the spontaneous magnetization (6.5.17) into (6.5.20) and obtain for the isothermal magnetic susceptibility

$$\chi = \frac{1/kT_c}{\tau + m^2} = \begin{cases} \dfrac{1/k}{T - T_c} & T > T_c \\[2ex] \dfrac{1/k}{2(T_c - T)} & T < T_c \end{cases} ; \tag{6.5.21}$$

this is the Curie–Weiss law shown in Fig. 6.6.

Fig. 6.6. The magnetic susceptibility (6.5.21): the Curie–Weiss law

Remark: We can understand the divergent susceptibility at $T_c$ by starting from the Curie law for paramagnetic spins (6.3.10a), adding the internal molecular field $\tilde{J}m$ to the field h, and then determining the magnetization from it:

$$m = \frac{1}{kT}\,(h + \tilde{J}m) \quad\to\quad \frac{m}{h} = \frac{1/k}{T - T_c} . \tag{6.5.22}$$

Following these limiting cases, we solve (6.5.14) generally. We first discuss the graphical solution of this equation, referring to Fig. 6.7.

e) A graphical solution of the equation $m = \tanh\beta(h + \tilde{J}m)$

To find a graphical solution, it is expedient to introduce the auxiliary variable $y = m + \frac{h}{kT_c}$. Then one finds m as a function of h by determining the intersection of the line $y - \frac{h}{kT_c}$ with $\tanh\frac{T_c}{T}y$:

$$m = y - \frac{h}{kT_c} = \tanh\frac{T_c}{T}\, y .$$

For $T \geq T_c$, Fig. 6.7a exhibits exactly one intersection for each value of h. This yields the monotonically varying curve for $T \geq T_c$ in Fig. 6.8. For $T < T_c$, from Fig. 6.7b the slope of $\tanh\frac{T_c}{T}y$ at y = 0 is greater than 1 and therefore we find three intersections for small absolute values of h, while the solution for high fields remains unique. This leads to the function for $T < T_c$ which is shown in Fig. 6.8.
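The graphical construction can be mirrored numerically by searching for all sign changes of $m - \tanh[(m + h/kT_c)\,T_c/T]$ on the interval (−1, 1). The following is a minimal sketch; the function name `mean_field_roots`, the reduced variables $t = T/T_c$ and $h_{\rm red} = h/kT_c$, and the grid resolution are assumptions made for this illustration.

```python
import math


def mean_field_roots(t, h_red=0.0, n_grid=20001):
    """All solutions m of m = tanh((m + h_red) / t), i.e. of Eq. (6.5.14),
    with t = T/Tc and h_red = h/(k Tc).

    The interval [-1, 1] is scanned on a uniform grid; every sign change of
    f(m) = m - tanh((m + h_red)/t) is refined by bisection.
    """
    def f(m):
        return m - math.tanh((m + h_red) / t)

    grid = [-1.0 + 2.0 * i / (n_grid - 1) for i in range(n_grid)]
    roots = []
    for a, b in zip(grid[:-1], grid[1:]):
        fa, fb = f(a), f(b)
        if fa == 0.0:
            roots.append(a)          # root exactly on a grid point (e.g. m = 0 at h = 0)
        elif fa * fb < 0.0:
            for _ in range(60):      # bisection down to machine precision
                mid = 0.5 * (a + b)
                if f(a) * f(mid) <= 0.0:
                    b = mid
                else:
                    a = mid
            roots.append(0.5 * (a + b))
    return roots


print("T = 1.2 Tc, h = 0:   ", mean_field_roots(1.2))
print("T = 0.8 Tc, h = 0:   ", mean_field_roots(0.8))
print("T = 0.8 Tc, small h: ", mean_field_roots(0.8, 0.01))
print("T = 0.8 Tc, large h: ", mean_field_roots(0.8, 0.2))
```

Close to $T_c$, the largest root at h = 0 can be compared with the square-root law (6.5.17b); for example, at $T = 0.99\,T_c$ the two agree to within about one percent.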


Fig. 6.7. The graphical solution of Eq. (6.5.14).

Fig. 6.8. The magnetic equation of state in the molecular field approximation (6.5.23). The dotted vertical line on the m axis represents the inhomogeneous state (6.5.28)

For small h, m(h) is not uniquely determined. Particularly noticeable is the fact that the S-shaped curve m(h) contains a section with negative slope, i.e. negative susceptibility. In order to clarify the stability of the solution, we need to consider the free energy. We first note that for large h, the magnetization approaches its saturation value (Fig. 6.7). In fact, one can immediately compute the function h(m) from Eq. (6.5.14) analytically, since from

$$\beta(\tilde{J}m + h) = {\rm arctanh}\; m \equiv \frac{1}{2}\log\frac{1+m}{1-m}$$

the equation of state

$$h = -kT_c\, m + \frac{kT}{2}\log\frac{1+m}{1-m} \tag{6.5.23}$$

follows. Its shape is shown in Fig. 6.8 for $T \lessgtr T_c$ at the two values $T = 0.8\,T_c$ and $1.2\,T_c$ taken as examples, in agreement with the graphical construction.
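The explicit form (6.5.23) makes it easy to examine the S-shaped isotherm directly. The sketch below works with the reduced quantities $t = T/T_c$ and $h/kT_c$; the function names are chosen here for illustration. It evaluates h(m), its slope $\partial h/\partial m$ (the inverse susceptibility up to a factor), and can be used to confirm that (6.5.23) inverts (6.5.14).

```python
import math


def field_from_magnetization(m, t):
    """h/(k Tc) from Eq. (6.5.23): -m + (t/2) log((1+m)/(1-m)), for |m| < 1."""
    return -m + 0.5 * t * math.log((1.0 + m) / (1.0 - m))


def slope_dh_dm(m, t):
    """d(h/k Tc)/dm = -1 + t/(1 - m^2); a negative value marks the unstable branch."""
    return -1.0 + t / (1.0 - m * m)


for t in (0.8, 1.2):
    print(f"T = {t} Tc")
    for m in (-0.9, -0.5, -0.2, 0.0, 0.2, 0.5, 0.9):
        h = field_from_magnetization(m, t)
        print(f"  m = {m:+.1f}  h/kTc = {h:+.4f}  slope = {slope_dh_dm(m, t):+.4f}")
```

Below $T_c$ the slope at m = 0 equals $(T - T_c)/T_c < 0$, and it changes sign at $m^2 = 1 - T/T_c$, which bounds the section of negative susceptibility mentioned above. At $T = T_c$ the leading behavior for small m is $h/kT_c \approx m^3/3$, in agreement with the critical isotherm (6.5.19).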


As mentioned above, for a given field h, the value of the magnetization below $T_c$ is not everywhere unique; e.g. for h = 0, the three values 0 and $\pm m_0$ occur. In order to find out which parts of the equation of state are physically stable, we must investigate the free energy. The free energy in the molecular field approximation, $F = -kT\log Z_{\rm MFT}$, per lattice site and in units of the Boltzmann constant, is given from (6.5.13′) by

$$f(T, h) = \frac{F}{Nk} = \frac{1}{2}\, T_c m^2 - T\log\Big\{2\cosh\big[(T_c m + h/k)/T\big]\Big\} \tag{6.5.24}$$

$$\phantom{f(T, h)} \approx \frac{1}{2}\,(T - T_c)\, m^2 + \frac{T_c}{12}\, m^4 - mh/k - T\log 2 .$$

We give here in the first line the complete expression, and in the second line the expansion in terms of m, h and $T - T_c$, which applies in the neighborhood of the phase transition. Here, m = m(h) must still be inserted. From (6.5.24), the heat capacity at vanishing applied field (for $T \approx T_c$) can be found:

$$c_{h=0} = -NkT\left(\frac{\partial^2 f}{\partial T^2}\right)_{h=0} = \begin{cases} 0 & T > T_c \\[1ex] \dfrac{3}{2}\, Nk\,\dfrac{T}{T_c} & T < T_c \end{cases} ;$$

here, a jump of magnitude $\Delta c_{h=0} = \frac{3}{2}Nk$ is seen. We calculate directly the Helmholtz free energy

$$a(T, m) = f + mh/k = \frac{1}{2}\, T_c m^2 - T\log\Big\{2\cosh\big[(T_c m + h/k)/T\big]\Big\} + mh/k , \tag{6.5.25}$$

in which h = h(m) is to be inserted. From the determining equation for m (6.5.14), it follows that

$$T\log\Big\{2\cosh\big[(T_c m + h/k)/T\big]\Big\} = T\log 2 + T\log\left(\frac{1}{1 - \tanh^2\big[(T_c m + h/k)/T\big]}\right)^{1/2} = T\log 2 - \frac{T}{2}\log(1 - m^2) .$$

Combining this with (6.5.23) and inserting into (6.5.25), we obtain

$$a(T, m) = -\frac{1}{2}\, T_c m^2 - T\log 2 + \frac{T}{2}\log(1 - m^2) + \frac{Tm}{2}\log\frac{1+m}{1-m} \tag{6.5.26}$$

$$\phantom{a(T, m)} \approx -T\log 2 + \frac{1}{2}\,(T - T_c)\, m^2 + \frac{T_c}{12}\, m^4 ;$$


Fig. 6.9. The Helmholtz free energy in the molecular ﬁeld approximation above and below Tc , for T = 0.8 Tc and T = 1.2 Tc .

here, the second line holds near $T_c$. The Helmholtz free energy above and below $T_c$ is shown in Fig. 6.9. We first wish to point out the similarity of the free energy for $T < T_c$ with that of the van der Waals gas. For temperatures $T < T_c$, there is a region in a(T, m) which violates the stability criterion (6.1.24a). The magnetization can be read off from Fig. 6.9 using

$$h = k\left(\frac{\partial a}{\partial m}\right)_T , \tag{6.5.27}$$

by drawing a tangent with the slope h to the function a(T, m). Above $T_c$, this construction gives a unique answer; below $T_c$, however, it is unique only for a sufficiently strong applied field. We continue the discussion of the low-temperature phase and determine the reorientation of the magnetization on changing the direction of the applied magnetic field, starting with a magnetic field h for which only a single value of the magnetization results from the tangent construction. Lowering the field causes m to decrease until at h = 0, the value $m_0$ is obtained. Exactly the same tangent, namely that with slope zero, applies to the point $-m_0$. Regions of magnetization $m_0$ and $-m_0$ can therefore be present in equilibrium with each other. When a fraction c of the body has the magnetization $-m_0$ and a fraction 1 − c has the magnetization $m_0$, then for 0 ≤ c ≤ 1 the average magnetization is

$$m = -c\, m_0 + (1 - c)\, m_0 = (1 - 2c)\, m_0 \tag{6.5.28}$$

in the interval between $-m_0$ and $m_0$. The free energy of this inhomogeneously magnetized object is $a(m_0)$ (dotted line in Fig. 6.9), and is thus lower than the part of the molecular-field solution which arches upwards and which corresponds to a homogeneous state in the coexistence region of the two states $+m_0$ and $-m_0$. In the interval $[-m_0, m_0]$, the system does not enter the homogeneous state with its higher free energy, but instead breaks up into domains 24 which according to Eq. (6.5.28) yield all together the magnetization m. We remind the reader of the analogy to the Maxwell construction in the case of a van der Waals liquid. For completeness, we compare the free energies of the magnetization states belonging to a small but nonzero h. Without loss of generality we can assume that h is positive. Along with the positive magnetization, for small h there are also two solutions of (6.5.27) with negative magnetizations. It is clear from Fig. 6.9 that the latter two have higher free energies than the solution with positive magnetization. For a positive (negative) magnetic field, the state with positive (negative) magnetization is thermodynamically stable. The S-shaped part of the equation of state (for $T < T_c$) in Fig. 6.8 is thus replaced by the dotted vertical line. Finally, we give the entropy in the molecular field approximation:

$$s = \frac{S}{Nk} = -\left(\frac{\partial a}{\partial T}\right)_m = -\left[\frac{1+m}{2}\log\frac{1+m}{2} + \frac{1-m}{2}\log\frac{1-m}{2}\right] ; \tag{6.5.29}$$

it depends only on the average magnetization m. The internal energy is given by

$$e = \frac{E}{Nk} = a - mh/k + Ts = -\frac{1}{2}\, T_c m^2 - mh/k . \tag{6.5.30}$$

This can be more readily seen from (6.5.11) by taking an average value $\langle H\rangle$ with the density matrix (6.5.12). From $h = k\left(\frac{\partial a(T,m)}{\partial m}\right)_T$ it again follows that $m = \tanh\frac{T_c m + h/k}{T}$, i.e. we recover Eq. (6.5.14).

24 The number of domains can be greater than just two. When there are only a few domains, the interface energy is negligible in comparison to the gain in volume energy; see problem 7.6. In reality, the dipole interaction, anisotropies and inhomogeneities in the crystal play a role in the formation of domains. They form in such a way that the energy including that of the magnetic field is minimized.

Remarks:

(i) The molecular field approximation can also be applied to other models, for example the Heisenberg model, and also for quite different cooperative phenomena. The results are completely analogous.

(ii) The effect of the remaining spins on an arbitrarily chosen spin is replaced in molecular field theory by a mean field. In the case of a short-range interaction, the real field configuration will deviate considerably from this mean value. The more long-ranged the interaction, the more spins contribute to the local field, and the more closely it thus approaches the average field. The molecular field approximation is therefore exact in the limit of long-range interactions (see also problem 6.13, the Weiss model). We note here the analogy between the molecular field theory and the Hartree-Fock theory of atoms and other many-body systems.

(iii) We want to point out another aspect of the molecular field approximation: its results do not depend at all on the dimensionality. This contradicts intuition and also exact calculations. In the case of short-range interactions, one-dimensional systems in fact do not undergo a phase transition; there are too few neighbors to lead to a cooperative ordering phenomenon.

(iv) In the next chapter, we shall turn to a detailed comparison of the gas-liquid transition and the ferromagnetic transition. We point out here in anticipation that the van der Waals liquid and the ferromagnet show quite similar behavior in the immediate vicinity of their critical points in the molecular field approximation; e.g. $(\rho_G - \rho_c) \sim (T_c - T)^{1/2}$ and $M_0 \sim (T_c - T)^{1/2}$, and likewise, the isothermal compressibility and the magnetic susceptibility both diverge as $(T_c - T)^{-1}$. This similarity is not surprising; in both cases, the interactions with the other gas atoms or spins are replaced by a mean field which is determined self-consistently from the ensuing equation of state.

(v) If one compares the critical power laws (6.5.17), (6.5.19), and (6.5.21) with experiments, with the exact solution of the two-dimensional Ising model, and with numerical results from computer simulations or series expansions, it is found that in fact qualitatively similar power laws hold, but the critical exponents are different from those found in the molecular field theory. The lower the dimensionality, the greater the deviations found. Instead of (6.5.17), (6.5.19), and (6.5.21), one finds generalized power laws:

$$m_0 \sim |\tau|^{\beta} \qquad T < T_c , \tag{6.5.31a}$$

$$m \sim h^{1/\delta} \qquad T = T_c , \tag{6.5.31b}$$

$$\chi \sim |\tau|^{-\gamma} \qquad T \gtrless T_c , \tag{6.5.31c}$$

$$c_h \sim |\tau|^{-\alpha} \qquad T \gtrless T_c . \tag{6.5.31d}$$

The critical exponents β, δ, γ and α which occur in these expressions in general differ from their molecular field values 1/2, 3, 1 and 0 (the last corresponding to the jump in the specific heat). For instance, in the two-dimensional Ising model, β = 1/8, δ = 15, γ = 7/4, and α = 0 (logarithmic). Remarkably, the values of the critical exponents do not depend on the lattice structure, but only on the dimensionality of the system. All Ising systems with short-range forces have the same critical exponents in d dimensions. Here, we have an example of so-called universality. The critical behavior depends on only a very few quantities, such as the dimensionality of the system, the number of components of the order parameter and the symmetry of the Hamiltonian. Heisenberg ferromagnets have different critical exponents from Ising ferromagnets, but within these groups, they are all the same. With these remarks about the actual behavior in the neighborhood of a critical


6. Magnetism

point, we will close the discussion. In particular, we postpone the description of additional analogies between phase transitions to the next chapter. We now return to the molecular field approximation and use it to compute the magnetic susceptibility and the position-dependent spin correlation function.

6.5.3 Correlation Functions and Susceptibility

In this subsection, we shall consider the Ising model in the presence of a spatially varying applied magnetic field $h_l$. The Hamiltonian is then given by

$$ H = H_0 - \sum_l h_l \sigma_l = -\frac{1}{2} \sum_{l,l'} J(l - l')\, \sigma_l \sigma_{l'} - \sum_l h_l \sigma_l . \tag{6.5.32} $$

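The magnetization per spin now depends on the lattice site $l$:

$$ m_l = \langle \sigma_l \rangle \equiv \mathrm{Tr}\left( e^{-\beta H} \sigma_l \right) / \mathrm{Tr}\, e^{-\beta H} . \tag{6.5.33} $$

We first define the susceptibility

$$ \chi(x_l, x_{l'}) = \frac{\partial m_l}{\partial h_{l'}} , \tag{6.5.34} $$

which describes the response at the site $l$ to a change in the field at the site $l'$. The correlation function is defined as

$$ G(x_l, x_{l'}) \equiv \langle \sigma_l \sigma_{l'} \rangle - \langle \sigma_l \rangle \langle \sigma_{l'} \rangle = \big\langle (\sigma_l - \langle \sigma_l \rangle)(\sigma_{l'} - \langle \sigma_{l'} \rangle) \big\rangle . \tag{6.5.35} $$

The correlation function (6.5.35) is a measure of how strongly the deviations from the mean values at the sites $l$ and $l'$ are correlated with each other. Susceptibility and correlation function are related through the important fluctuation-response theorem

$$ \chi(x_l, x_{l'}) = \frac{1}{kT}\, G(x_l, x_{l'}) . \tag{6.5.36} $$

This theorem (6.5.36) can be derived by taking the derivative of (6.5.33) with respect to $h_{l'}$. For a translationally invariant system, we have

$$ \chi(x_l, x_{l'})\big|_{\{h_l = 0\}} = \chi(x_l - x_{l'}) \quad \text{and} \quad G(x_l, x_{l'})\big|_{\{h_l = 0\}} = G(x_l - x_{l'}) . \tag{6.5.37} $$

At small fields $h_l$, we find (with $\Delta m_l \equiv m_l - m$)

$$ \Delta m_l = \sum_{l'} \chi(x_l - x_{l'})\, h_{l'} . \tag{6.5.38} $$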
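The fluctuation-response theorem (6.5.36) can be checked numerically on a very small system. The following is only a sketch under stated assumptions: a ring of six Ising spins with nearest-neighbor coupling, evaluated by brute-force enumeration of all states. The function names and parameter values are illustrative and do not come from the text.

```python
import itertools
import math


def ising_ring_averages(fields, coupling=1.0, kT=1.5):
    """Exact thermal averages for a small Ising ring by enumerating all states.

    Returns (m, corr) with m[l] = <sigma_l> and corr[l][k] = <sigma_l sigma_k>.
    """
    n = len(fields)
    z = 0.0
    m = [0.0] * n
    corr = [[0.0] * n for _ in range(n)]
    for spins in itertools.product((-1, 1), repeat=n):
        # H = -J * (sum over nearest-neighbor bonds) - sum_l h_l sigma_l
        energy = -coupling * sum(spins[l] * spins[(l + 1) % n] for l in range(n))
        energy -= sum(h * s for h, s in zip(fields, spins))
        weight = math.exp(-energy / kT)
        z += weight
        for l in range(n):
            m[l] += weight * spins[l]
            for k in range(n):
                corr[l][k] += weight * spins[l] * spins[k]
    m = [x / z for x in m]
    corr = [[c / z for c in row] for row in corr]
    return m, corr


def susceptibility_fd(fields, l, k, eps=1e-5, **kw):
    """chi(x_l, x_k) = d m_l / d h_k from a central finite difference."""
    plus = list(fields)
    minus = list(fields)
    plus[k] += eps
    minus[k] -= eps
    m_plus = ising_ring_averages(plus, **kw)[0][l]
    m_minus = ising_ring_averages(minus, **kw)[0][l]
    return (m_plus - m_minus) / (2.0 * eps)


def correlation(fields, l, k, **kw):
    """G(x_l, x_k) = <sigma_l sigma_k> - <sigma_l><sigma_k>."""
    m, corr = ising_ring_averages(fields, **kw)
    return corr[l][k] - m[l] * m[k]
```

For any choice of site-dependent fields, `susceptibility_fd(fields, l, k, kT=kT)` should agree with `correlation(fields, l, k, kT=kT) / kT` up to the finite-difference error.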


A periodic field

$$ h_l = h_q\, e^{i q x_l} \tag{6.5.39} $$

therefore gives rise to a magnetization of the form

$$ \Delta m_l = e^{i q x_l} \sum_{l'} \chi(x_l - x_{l'})\, e^{-i q (x_l - x_{l'})}\, h_q = \chi(q)\, e^{i q x_l}\, h_q , \tag{6.5.40} $$

where

$$ \chi(q) = \sum_{l'} \chi(x_l - x_{l'})\, e^{-i q (x_l - x_{l'})} = \frac{1}{kT} \sum_{l} G(x_l)\, e^{-i q x_l} \tag{6.5.41} $$

is the Fourier transform of the susceptibility, and following the second equals sign (6.5.36) has been inserted. In particular for $q = 0$, we find the following relation between the uniform susceptibility and the correlation function:

$$ \chi \equiv \chi(0) = \frac{1}{kT} \sum_{l} G(x_l) . \tag{6.5.42} $$

Since the correlation function (6.5.35) can never be greater than 1 ($|\sigma_l| = 1$), and is in no case divergent, the divergence of the uniform susceptibility, Eq. (6.5.21) (i.e. the susceptibility referred to a spatially uniform field), can only be due to the fact that the correlations at $T_c$ attain an infinitely long range.

6.5.4 The Ornstein–Zernike Correlation Function

We now want to calculate the correlation function introduced in the previous section within the molecular field approximation. As before, we denote the field by $h_l$, so that the mean value $m_l = \langle \sigma_l \rangle$ is also site dependent. In the molecular-field approximation, the density matrix is given by

$$ \rho_{\rm MFT} = Z^{-1} \exp\Big[ \beta \sum_l \sigma_l \Big( h_l + \sum_{l'} J(l - l') \langle \sigma_{l'} \rangle \Big) \Big] . \tag{6.5.43} $$

The Fourier transform of the exchange coupling, which we take to be short-ranged, can be written for small wavenumbers as

$$ \tilde J(k) \equiv \sum_l J(l)\, e^{-i k \cdot x_l} \approx \tilde J - k^2\, \frac{1}{6} \sum_l x_l^2\, J(l) \equiv \tilde J - k^2 J . \tag{6.5.44} $$

Here, we have replaced the exponential function by its Taylor series. Due to the mirror symmetry of a cubic lattice, $J(-l) = J(l)$, and therefore there is no linear term in $k$. Furthermore, we have $\sum_l (k \cdot x_l)^2 J(l) = \frac{1}{3} k^2 \sum_l x_l^2 J(l)$.


The constant $J$ is defined by

$$ J = \frac{1}{6} \sum_l x_l^2\, J(l) . \tag{6.5.45} $$
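The small-wavenumber expansion (6.5.44) and the definition (6.5.45) can be illustrated for one concrete case. The sketch below assumes a simple cubic lattice with lattice constant `a` and a coupling `j1` that acts only between nearest neighbors; both the model and the function names are assumptions made for illustration.

```python
import cmath


def neighbours(a=1.0):
    """The six nearest-neighbor vectors of a simple cubic lattice."""
    return [(a, 0, 0), (-a, 0, 0), (0, a, 0), (0, -a, 0), (0, 0, a), (0, 0, -a)]


def j_tilde(k, j1=1.0, a=1.0):
    """Fourier transform  sum_l J(l) exp(-i k.x_l)  for nearest-neighbor coupling j1."""
    total = sum(
        j1 * cmath.exp(-1j * sum(ki * xi for ki, xi in zip(k, x)))
        for x in neighbours(a)
    )
    return total.real  # the imaginary parts cancel because J(-l) = J(l)


def j_constant(j1=1.0, a=1.0):
    """The constant J = (1/6) sum_l x_l^2 J(l) of Eq. (6.5.45)."""
    return sum(j1 * sum(xi * xi for xi in x) for x in neighbours(a)) / 6.0


def j_tilde_small_k(k, j1=1.0, a=1.0):
    """Right-hand side of Eq. (6.5.44): J~(0) - k^2 J."""
    k2 = sum(ki * ki for ki in k)
    return j_tilde((0.0, 0.0, 0.0), j1, a) - k2 * j_constant(j1, a)
```

For this model the constant is simply `j1 * a**2`, and the difference between `j_tilde(k)` and `j_tilde_small_k(k)` is of fourth order in the components of `k`.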

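Using the density matrix (6.5.43), we obtain for the mean value of $\sigma_l$, in analogy to (6.5.14) in Sect. 6.5.2, the result

$$ \langle \sigma_l \rangle = \tanh\Big[ \beta \Big( h_l + \sum_{l'} J(l - l') \langle \sigma_{l'} \rangle \Big) \Big] . \tag{6.5.46} $$

We now take the derivative $\partial / \partial h_{l'}$ of the last equation (6.5.46), and finally set all the $h_l = 0$, obtaining for the susceptibility:

$$ \chi(x_l - x_{l'}) = \frac{1}{\cosh^2\big[ \beta \sum_{l''} J(l - l'')\, m \big]} \Big( \beta\, \delta_{ll'} + \beta \sum_{l''} J(l - l'')\, \chi(x_{l''} - x_{l'}) \Big) . \tag{6.5.47} $$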

The Fourier-transformed susceptibility (6.5.41) is obtained from (6.5.47), recalling the convolution theorem:

$$ \chi(q) = \frac{1}{\cosh^2 (\beta \tilde J m)} \big( \beta + \beta \tilde J(q)\, \chi(q) \big) . \tag{6.5.48} $$

Furthermore, using $\cosh^2 (\beta \tilde J m) = \dfrac{1}{1 - \tanh^2 (\beta \tilde J m)} = \dfrac{1}{1 - m^2}$, where we have inserted the determining equation for $m$, Eq. (6.5.16), we obtain the general result

$$ \chi(q) = \frac{\beta}{\dfrac{1}{1 - m^2} - \beta \tilde J(q)} . \tag{6.5.49} $$
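The result (6.5.49) can be compared with a direct numerical solution of the site-dependent molecular-field equation (6.5.46). The sketch below is an illustration under stated assumptions: a one-dimensional ring with nearest-neighbor coupling, a temperature above the molecular-field transition temperature (so that m = 0 without a field), and a weak field applied at a single site. None of the names or parameter values are taken from the text.

```python
import cmath
import math


def solve_mean_field(fields, coupling=1.0, kT=3.0, sweeps=2000):
    """Iterate m_l = tanh[beta (h_l + J (m_{l-1} + m_{l+1}))] on a ring to a fixed point."""
    n = len(fields)
    beta = 1.0 / kT
    m = [0.0] * n
    for _ in range(sweeps):
        m = [
            math.tanh(beta * (fields[l] + coupling * (m[l - 1] + m[(l + 1) % n])))
            for l in range(n)
        ]
    return m


def chi_q_numeric(n=16, q_index=3, eps=1e-6, **kw):
    """Fourier transform of the response to a weak field eps applied at site 0."""
    fields = [0.0] * n
    fields[0] = eps
    m = solve_mean_field(fields, **kw)  # above Tc the field-free solution is m = 0
    q = 2.0 * math.pi * q_index / n
    chi = sum((m[l] / eps) * cmath.exp(-1j * q * l) for l in range(n)).real
    return chi, q


def chi_q_formula(q, coupling=1.0, kT=3.0, m=0.0):
    """Eq. (6.5.49) with J~(q) = 2 J cos(q) for the nearest-neighbor ring."""
    beta = 1.0 / kT
    return beta / (1.0 / (1.0 - m * m) - beta * 2.0 * coupling * math.cos(q))
```

With the default parameters the molecular-field transition of this ring lies at kT = 2 (in units of the coupling), so kT = 3 is in the disordered phase and the simple fixed-point iteration converges.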

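From this last equation, together with (6.5.15) and (6.5.44), we find in the neighborhood of $T_c$:

$$ \chi(q) = \frac{\beta}{1 - \dfrac{T_c}{T} + m_0^2 + \dfrac{J q^2}{kT}} \qquad \text{for } T \approx T_c \tag{6.5.50} $$

or also

$$ \chi(q) = \frac{1}{J \left( q^2 + \xi^{-2} \right)} , \tag{6.5.50'} $$

where the correlation length

$$ \xi = \left( \frac{J}{k T_c} \right)^{1/2} \begin{cases} \tau^{-1/2} & T > T_c \\ (-2\tau)^{-1/2} & T < T_c \end{cases} \tag{6.5.51} $$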


has been introduced, with $\tau = (T - T_c)/T_c$. The susceptibility in real space is obtained by inverting the Fourier transform:

$$ \chi(x_l - x_{l'}) = \frac{1}{N} \sum_q \chi(q)\, e^{i q (x_l - x_{l'})} = \frac{V}{N} \int \frac{d^3 q}{(2\pi)^3}\, \chi(q)\, e^{i q (x_l - x_{l'})} . \tag{6.5.52} $$

For the second equals sign it was assumed that the system is macroscopic, so that the sum over $q$ can be replaced by an integral (cf. (4.1.2b) and (4.1.14a) with $p/\hbar \to q$). To compute the susceptibility for large distances it suffices to make use of the result for $\chi(q)$ at small values of $q$ (Eq. (6.5.50')); then with the lattice constant $a$ we find

$$ \chi(x_l - x_{l'}) = \frac{a^3}{(2\pi)^3} \int d^3 q\, \frac{e^{i q (x_l - x_{l'})}}{J \left( q^2 + \xi^{-2} \right)} = \frac{a^3\, e^{-|x_l - x_{l'}|/\xi}}{4\pi J\, |x_l - x_{l'}|} . \tag{6.5.53} $$

From $\chi$ calculated in this way, we find the correlation function via (6.5.37):

$$ G(x) = kT\, \chi(x) = \frac{kT\, a^3}{4\pi J}\, \frac{e^{-|x|/\xi}}{|x|} , \tag{6.5.53'} $$

which in this context is called the Ornstein–Zernike correlation function. The Ornstein–Zernike correlation function and its Fourier transform are shown in Fig. 6.10 and Fig. 6.11 for the temperatures $T = 1.01\, T_c$ and $T = T_c$. In these figures, the correlation length $\xi$ at $T = 1.01\, T_c$ is also indicated. The quantity $\xi_0$ is defined by $\xi_0 = (J/kT_c)^{1/2}$, according to (6.5.48). At large distances $\chi(x)$ decreases exponentially as $\frac{1}{|x|}\, e^{-|x|/\xi}$. The correlation length $\xi$ characterizes

Fig. 6.10. The Ornstein–Zernike correlation function for T = 1.01 Tc and for T = Tc . Distances are measured in units of ξ0 = (J/kTc )1/2 .

Fig. 6.11. The Fourier transform of the Ornstein–Zernike susceptibility for T = 1.01 Tc and for T = Tc . The reciprocal of the correlation length for T = 1.01 Tc is indicated by the arrow.


the typical length over which the spin fluctuations are correlated. For $|x| \gg \xi$, $G(x)$ is practically zero. At $T_c$, $\xi = \infty$, and $G(x)$ obeys the power law

$$ G(x) = \frac{k T_c\, v}{4\pi J\, |x|} \tag{6.5.54} $$

with the volume of the unit cell $v = a^3$. $\chi(q)$ varies as $1/q^2$ for $\xi^{-1} \ll q$, and for $q = 0$ it is identical with the Curie–Weiss susceptibility. On approaching $T_c$, $\chi(0)$ becomes larger and larger. We note further that the continuum theory and thus (6.5.50') and (6.5.52) apply only to the case when $|x| \gg a$.

An important experimental tool for the investigation of magnetic phenomena is neutron scattering. The magnetic moment of the neutron interacts with the field produced by the magnetic moments in the solid and is therefore sensitive to magnetic structure and to static and dynamic fluctuations. The elastic scattering cross-section is proportional to the static susceptibility $\chi(q)$. Here, $q$ is the momentum transfer, $q = k_{\rm in} - k_{\rm out}$, where $k_{\rm in(out)}$ are the wave numbers of the incident and scattered neutrons. The increase of $\chi(q)$ at small $q$ for $T \to T_c$ leads to intense forward scattering. This is termed critical opalescence near the Curie temperature, in analogy to the corresponding phenomenon in light scattering near the critical point of the gas-liquid transition.

The correlation length $\xi$ diverges at the critical point; the correlations become more and more long-ranged as $T_c$ is approached. Therefore, statistical fluctuations of the magnetic moments are correlated with each other over larger and larger regions. Furthermore, a field acting at the position $x$ induces a polarization not only at that position, but also up to a distance $\xi$, as a result of (6.5.37). The increase of the correlations can also be recognized in the spin configurations illustrated in Fig. 6.12. Here, 'snapshots' from a computer simulation of the Ising model are shown. White pixels represent $\sigma = +1$ and black pixels are for $\sigma = -1$. At twice the transition temperature, the spins are correlated only over very short distances (of a few lattice constants). At $T = 1.1\, T_c$, the increase of the correlation length is clearly recognizable.

Fig. 6.12. A ‘snapshot’ of the spin conﬁguration of a two-dimensional Ising model at T = 2 Tc , T = 1.1 Tc and T = Tc . White pixels represent σ = +1, and black pixels refer to σ = −1.
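Snapshots of this kind can be generated with a simple Monte Carlo simulation. The text does not say which algorithm was used for Fig. 6.12; the following is a minimal sketch that assumes single-spin-flip Metropolis updates on a square lattice with periodic boundary conditions and nearest-neighbor coupling. All names and parameter values are illustrative.

```python
import math
import random


def metropolis_snapshot(size=16, kT=2.5, sweeps=50, coupling=1.0, seed=0):
    """Return a size x size list of spins (+1/-1) after `sweeps` Metropolis sweeps."""
    rng = random.Random(seed)
    spins = [[1] * size for _ in range(size)]  # start from the fully ordered state
    for _ in range(sweeps * size * size):
        i = rng.randrange(size)
        j = rng.randrange(size)
        # Sum over the four nearest neighbors with periodic boundaries.
        nb = (
            spins[(i + 1) % size][j]
            + spins[i - 1][j]
            + spins[i][(j + 1) % size]
            + spins[i][j - 1]
        )
        d_e = 2.0 * coupling * spins[i][j] * nb  # energy change if spin (i, j) is flipped
        if d_e <= 0.0 or rng.random() < math.exp(-d_e / kT):
            spins[i][j] = -spins[i][j]
    return spins


def magnetization(spins):
    """Mean spin value of a configuration."""
    size = len(spins)
    return sum(sum(row) for row in spins) / (size * size)
```

For the square lattice with nearest-neighbor coupling J, the exact transition temperature is kTc = 2J / ln(1 + √2) ≈ 2.269 J, so temperatures such as 2 Tc, 1.1 Tc and Tc can be chosen relative to this value. Close to Tc, single-spin-flip dynamics equilibrates slowly, and many more sweeps (and larger lattices) than the defaults above are needed for representative pictures.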


Along with very small clusters, both the black and the white clusters can be made out up to the correlation length $\xi(T = 1.1\, T_c)$. At $T = T_c$, $\xi = \infty$. In the figure, one sees two large white and black clusters. If the area viewed were to be enlarged, it would become clear that these are themselves located within an even larger cluster, which itself is only a member of a still larger cluster. There are thus correlated regions on all length scales. We observe here a scale invariance to which we shall return later. When we enlarge the unit of length, the larger clusters become smaller clusters, but since there are clusters up to infinitely large dimensions, the picture remains the same. The Ornstein–Zernike theory (6.5.51) and (6.5.53) reproduces the correct behavior qualitatively. In reality, however, the correlation length diverges as $\xi = \xi_0 \tau^{-\nu}$, where in general $\nu \neq \frac{1}{2}$, and the shape of $G(x)$ also differs from (6.5.53') (see Chap. 7).

∗6.5.5 Continuum Representation

6.5.5.1 Correlation Functions and Susceptibilities

It is instructive to derive the results obtained in the preceding sections in a continuum representation. The formulas which occur in this derivation will also allow a direct comparison with the Ginzburg–Landau theory, which we will treat later (in Chap. 7). Critical anomalies occur at large wavelengths. In order to describe this region, it is sufficient and expedient to go to a continuum formulation:

$$ h_l \to h(x) , \quad \sigma_l \to \sigma(x) , \quad m_l \to m(x) , \quad \sum_l h_l \sigma_l \to \int \frac{d^3 x}{v}\, h(x)\, \sigma(x) . \tag{6.5.55} $$

Here, $a$ is the lattice constant and $v = a^3$ is the volume of the unit cell. The sum over $l$ becomes an integral over $x$ in the limit $v \to 0$. The partial derivative becomes a functional derivative²⁵ ($v \to 0$)

$$ \frac{\delta m(x)}{\delta h(x')} = \frac{1}{v} \frac{\partial m_l}{\partial h_{l'}} \quad \text{etc., e.g.} \quad \frac{\delta h(x)}{\delta h(x')} = \delta(x - x') . \tag{6.5.56} $$

For the susceptibility and correlation function we thus obtain from (6.5.34)

$$ \chi(x - x') = v\, \frac{\delta m(x)}{\delta h(x')} = \frac{\partial m_l}{\partial h_{l'}} = \frac{1}{kT}\, G(x - x') . \tag{6.5.57} $$

For small $h(x)$, we find

$$ \Delta m(x) = \int \frac{d^3 x'}{v}\, \chi(x - x')\, h(x') . \tag{6.5.58} $$

²⁵ The general definition of the functional derivative is to be found in W. I. Smirnov, A Course of Higher Mathematics, Vol. V, Pergamon Press, Oxford 1964 or in QM I, Sect. 13.3.1


A periodic field

$$ h(x) = h_q\, e^{i q x} \tag{6.5.59} $$

induces a magnetization of the form

$$ \Delta m(x) = e^{i q x} \int \frac{d^3 x'}{v}\, \chi(x - x')\, e^{-i q (x - x')}\, h_q = \chi(q)\, e^{i q x}\, h_q , \tag{6.5.60} $$

where

$$ \chi(q) = \int \frac{d^3 y}{v}\, \chi(y)\, e^{-i q y} = \frac{1}{kT\, v} \int d^3 y\, e^{-i q y}\, G(y) \tag{6.5.61} $$

is the Fourier transform of the susceptibility, and after the second equals sign, we have made use of (6.5.37). In particular, for $q = 0$, we find the following relation between the uniform susceptibility and the correlation function:

$$ \chi \equiv \chi(0) = \frac{1}{kT\, v} \int d^3 y\, G(y) . \tag{6.5.62} $$

6.5.5.2 The Ornstein–Zernike Correlation Function

As before, the field $h(x)$ and with it also the mean value $\langle \sigma(x) \rangle$ are position dependent. The density matrix in the molecular field approximation and in the continuum representation is given by:

$$ \rho_{\rm MFT} = Z^{-1} \exp\left[ \beta \int \frac{d^3 x}{v}\, \sigma(x) \left( h(x) + \int \frac{d^3 x'}{v}\, J(x - x')\, \langle \sigma(x') \rangle \right) \right] . \tag{6.5.63} $$

The Fourier transform of the exchange coupling for small wavenumbers assumes the form

$$ \tilde J(k) = \int \frac{d^3 x}{v}\, J(x)\, e^{-i k \cdot x} \approx \tilde J - \frac{1}{6} k^2 \int \frac{d^3 x}{v}\, x^2 J(x) \equiv \tilde J - k^2 J , \tag{6.5.64} $$

where the exponential function has been replaced by its Taylor expansion. Owing to the spherical symmetry of the exchange interaction $J(x) \equiv J(|x|)$, there is no linear term in $k$ and we find $\int d^3 x\, (k \cdot x)^2 J(x) = \frac{1}{3} k^2 \int d^3 x\, x^2 J(x)$. The constant $J$ is defined by $J = \frac{1}{6v} \int d^3 x\, x^2 J(x)$. The inverse transform of (6.5.64) yields

$$ J(x) = v \left( \tilde J + J \nabla^2 \right) \delta(x) . \tag{6.5.65} $$

For phenomena at small $k$ or large distances, the real position dependence of the exchange interaction can be replaced by (6.5.65). We insert this into (6.5.63) and obtain the mean value of $\sigma(x)$, analogously to (6.5.14) in Sect. 6.5.2:

$$ \langle \sigma(x) \rangle = \tanh\left[ \beta \left( h(x) + \tilde J \langle \sigma(x) \rangle + J \nabla^2 \langle \sigma(x) \rangle \right) \right] . \tag{6.5.66} $$

In the neighborhood of $T_c$, we can carry out an expansion similar to that in (6.5.16), with $m(x) \equiv \langle \sigma(x) \rangle$,

$$ \tau\, m(x) - \frac{J}{k T_c} \nabla^2 m(x) + \frac{1}{3} m(x)^3 = \frac{h(x)}{k T_c} , \tag{6.5.67} $$

6.6 The Dipole Interaction, Shape Dependence, Internal and External Fields


with $\tau = (T - T_c)/T_c$, where the second term on the left-hand side occurs due to the spatial inhomogeneity of the magnetization. The equations of the continuum limit can be obtained from the corresponding equations of the discrete representation at any step, e.g. (6.5.67) follows from (6.5.46) by carrying out the substitutions $\langle \sigma_l \rangle = m_l \to m(x)$, $J(l) \to J(x) = v \left( \tilde J + J \nabla^2 \right) \delta(x)$. Now we take the functional derivative $\delta / \delta h(x')$ of the last equation, (6.5.67),

$$ \left[ \tau - \frac{J}{k T_c} \nabla^2 + m_0^2 \right] \chi(x - x') = v\, \delta(x - x') / k T_c . \tag{6.5.68} $$

Since the susceptibility is calculated in the limit $h \to 0$, the spontaneous magnetization $m_0$, which is given by the molecular-field expressions (6.5.17a,b), appears on the left side. The solution of this differential equation, which also occurs in connection with the Yukawa potential, is given in three dimensions by

$$ \chi(x - x') = \frac{v}{4\pi J}\, \frac{e^{-|x - x'|/\xi}}{|x - x'|} . \tag{6.5.69} $$

The Fourier transform is

$$ \chi(q) = \frac{1}{J \left( q^2 + \xi^{-2} \right)} . \tag{6.5.70} $$

In this expression, we have introduced the correlation length:

$$ \xi = \left( \frac{J}{k T_c} \right)^{1/2} \begin{cases} \tau^{-1/2} & T > T_c \\ (-2\tau)^{-1/2} & T < T_c . \end{cases} \tag{6.5.71} $$

The results thus obtained agree with those of the previous section; for their discussion, we refer to that section.
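The molecular-field expressions for the correlation length, the wavenumber-dependent susceptibility and the Ornstein–Zernike correlation function are easy to evaluate numerically. The sketch below simply codes Eqs. (6.5.71), (6.5.70) and (6.5.53'); the function names and the choice of units (J = kTc = v = 1 by default) are assumptions made for illustration.

```python
import math


def correlation_length(tau, j_const=1.0, k_tc=1.0):
    """Molecular-field correlation length, Eq. (6.5.71); tau = (T - Tc)/Tc must be nonzero."""
    xi0 = math.sqrt(j_const / k_tc)
    if tau > 0:
        return xi0 * tau ** -0.5
    if tau < 0:
        return xi0 * (-2.0 * tau) ** -0.5
    raise ValueError("the correlation length diverges at tau = 0")


def chi_q(q, tau, j_const=1.0, k_tc=1.0):
    """Ornstein-Zernike susceptibility 1 / [J (q^2 + xi^-2)], Eq. (6.5.70)."""
    xi = correlation_length(tau, j_const, k_tc)
    return 1.0 / (j_const * (q * q + xi ** -2))


def g_ornstein_zernike(r, tau, v=1.0, j_const=1.0, k_tc=1.0):
    """Correlation function kT v exp(-r/xi) / (4 pi J r), Eq. (6.5.53'), for r > 0."""
    xi = correlation_length(tau, j_const, k_tc)
    kt = k_tc * (1.0 + tau)
    return kt * v * math.exp(-r / xi) / (4.0 * math.pi * j_const * r)
```

Two quick consistency checks follow from the formulas themselves: above Tc, `chi_q(0, tau)` reduces to the Curie–Weiss form 1/(kTc τ), and at equal distance |τ| from Tc the correlation length above Tc is larger than that below Tc by a factor √2.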

∗6.6 The Dipole Interaction, Shape Dependence, Internal and External Fields

6.6.1 The Hamiltonian

In this section, we investigate the influence of the dipole interaction. The total Hamiltonian for the magnetic moments $\boldsymbol{\mu}_l$ is given by

$$ H \equiv H_0(\{\boldsymbol{\mu}_l\}) + H_d(\{\boldsymbol{\mu}_l\}) - \sum_l \boldsymbol{\mu}_l \mathbf{H}_a . \tag{6.6.1} $$

$H_0$ contains the exchange interaction between the magnetic moments and $H_d$ represents the dipole interaction

$$ H_d = \frac{1}{2} \sum_{l,l'} A^{\alpha\beta}_{ll'}\, \mu_l^\alpha \mu_{l'}^\beta = \frac{1}{2} \sum_{l,l'} \left( \frac{\delta_{\alpha\beta}}{|x_l - x_{l'}|^3} - \frac{3 (x_l - x_{l'})_\alpha (x_l - x_{l'})_\beta}{|x_l - x_{l'}|^5} \right) \mu_l^\alpha \mu_{l'}^\beta , \tag{6.6.2} $$

and $\mathbf{H}_a$ is the externally applied magnetic field. The dipole interaction is long-ranged, in contrast to the exchange interaction; it decreases as the inverse third power of the distance. Although the dipole interaction is in general considerably weaker than the exchange interaction (its interaction energy corresponds to a temperature of about²⁶ 1 K), it plays an important role for some phenomena due to its long range and also due to its anisotropy. The goal of this section is to obtain predictions about the free energy and its derivatives for the Hamiltonian (6.6.1),

$$ F(T, \mathbf{H}_a) = -kT \log \mathrm{Tr}\, e^{-H/kT} \tag{6.6.3} $$

and to analyze the modifications which result from including the dipole interaction. Before we turn to the microscopic theory, we wish to derive some elementary consequences of classical magnetostatics for thermodynamics; their justification within the framework of statistical mechanics will be given at the end of this section.

6.6.2 Thermodynamics and Magnetostatics

6.6.2.1 The Demagnetizing Field

It is well known from electrodynamics²⁷ (magnetostatics) that in a magnetized body, in addition to the externally applied field $\mathbf{H}_a$, there is a demagnetizing field $\mathbf{H}_d$ which results from the dipole fields of the individual magnetic moments, so that the effective field in the interior of the magnet, $\mathbf{H}_i$,

$$ \mathbf{H}_i = \mathbf{H}_a + \mathbf{H}_d , \tag{6.6.4a} $$

is in general different from $\mathbf{H}_a$. The field $\mathbf{H}_d$ is uniform only in ellipsoids and their limiting shapes, and we will thus limit ourselves as usual to bodies of this type. For ellipsoids, the demagnetizing field has the form $\mathbf{H}_d = -D \mathbf{M}$ and thus the (macroscopic) field in the interior of the body is

$$ \mathbf{H}_i = \mathbf{H}_a - D \mathbf{M} . \tag{6.6.4b} $$

Here, $D$ is the demagnetizing tensor and $\mathbf{M}$ is the magnetization (per unit volume). When $\mathbf{H}_a$ is applied along one of the principal axes, $D$ can be interpreted as the appropriate demagnetizing factor in Eq. (6.6.4b). For $\mathbf{H}_a$ and therefore $\mathbf{M}$ parallel to the axis of a long cylindrical body, $D = 0$; for $\mathbf{H}_a$ and $\mathbf{M}$ perpendicular to an infinitely extended thin sheet, $D = 4\pi$; and for a sphere, $D = \frac{4\pi}{3}$. The value of the internal field thus depends on the shape of the sample and the direction of the applied field.

²⁶ See e.g. the estimate in N. W. Ashcroft and N. D. Mermin, Solid State Physics, Holt, Rinehart and Winston, New York, 1976, p. 673.

²⁷ A. Sommerfeld, Electrodynamics, Academic Press, New York 1952; R. Becker and F. Sauter, Theorie der Elektrizität, Vol. 1, 21st Edition, p. 52, Teubner, Stuttgart, 1973; R. Becker, Electromagnetic Fields and Interactions, Blaisdell, 1964; J. D. Jackson, Classical Electrodynamics, 2nd edition, John Wiley, New York, 1975.


6.6.2.2 Magnetic Susceptibilities

We now need to distinguish between the susceptibility relative to the applied field, $\chi_a(H_a) = \frac{\partial M}{\partial H_a}$, and the susceptibility relative to the internal field, $\chi_i(H_i) = \frac{\partial M}{\partial H_i}$. We consider for the moment only fields in the direction of the principal axes, so that we do not need to take the tensor character of the susceptibilities into account. We emphasize that the usual definition in electrodynamics is the second one. This is due to the fact that $\chi_i(H_i)$ is a pure materials property²⁸, and that owing to $\mathrm{curl}\, \mathbf{H}_i = \frac{4\pi}{c} \mathbf{j}$, the field $\mathbf{H}_i$ can be controlled in the core of a coil by varying the current density $\mathbf{j}$. Taking the derivative of Eq. (6.6.4b) with respect to $M$, one obtains the relation between the two susceptibilities:

$$ \frac{1}{\chi_i(H_i)} = \frac{1}{\chi_a(H_a)} - D . \tag{6.6.5a} $$

It is physically clear that the susceptibility $\chi_i(H_i)$ relative to the internal field $H_i$ acting in the interior of the body is a specific materials parameter which is independent of the shape, and that therefore the shape dependence of $\chi_a(H_a)$

$$ \chi_a(H_a) = \frac{\chi_i(H_i)}{1 + D \chi_i(H_i)} \tag{6.6.5b} $$

results from the occurrence of $D$ in (6.6.5b) and (6.6.4b).²⁹ If the field is not applied along one of the principal axes of the ellipsoid, one can derive the tensor relation by taking the derivative of the component $\alpha$ of (6.6.4b) with respect to $M_\beta$:

$$ \left( \chi_i^{-1} \right)_{\alpha\beta} = \left( \chi_a^{-1} \right)_{\alpha\beta} - D_{\alpha\beta} . \tag{6.6.5c} $$

Relations of the type (6.6.5a–c) can be found in the classical thermodynamic literature.³⁰
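The shape dependence expressed by (6.6.5a,b) can be made concrete with the three demagnetizing factors quoted in Sect. 6.6.2.1. The following sketch uses Gaussian units as in the text; the dictionary of shapes and the function names are illustrative choices, not part of the original.

```python
import math

# Demagnetizing factors quoted in the text (field along a principal axis).
DEMAG_FACTORS = {
    "long cylinder, field parallel to axis": 0.0,
    "sphere": 4.0 * math.pi / 3.0,
    "thin sheet, field perpendicular": 4.0 * math.pi,
}


def chi_apparent(chi_i, d):
    """Susceptibility relative to the applied field, Eq. (6.6.5b)."""
    return chi_i / (1.0 + d * chi_i)


def chi_internal(chi_a, d):
    """Inverse relation, obtained by solving Eq. (6.6.5a) for chi_i."""
    return chi_a / (1.0 - d * chi_a)


if __name__ == "__main__":
    chi_i = 0.05  # hypothetical value of the true susceptibility
    for shape, d in DEMAG_FACTORS.items():
        print(f"{shape}: chi_a = {chi_apparent(chi_i, d):.5f}")
```

As an example, for a superconducting sphere with 4πχi = −1 and D = 4π/3, Eq. (6.6.5b) gives χa = −3/(8π), which is larger in magnitude than χi by a factor 3/2.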

²⁸ In the literature on magnetism, $\chi_i(H_i)$ is called the true susceptibility and $\chi_a(H_a)$ the apparent susceptibility. E. Kneller, Ferromagnetismus, Springer, Berlin, 1962, p. 97.

²⁹ When $\chi_i \lesssim 10^{-4}$, as in many practical situations, the demagnetization correction can be neglected. On the other hand, there are also cases in which the shape of the object can become important. In paramagnetic salts, $\chi_i$ increases at low temperatures according to Curie's law, and it can become of the order of 1; in superconductors, $4\pi\chi_i = -1$ (perfect diamagnetism or Meissner effect).

³⁰ R. Becker and W. Döring, Ferromagnetismus, Springer, Berlin, 1939, p. 8; A. B. Pippard, Elements of Classical Thermodynamics, Cambridge at the University Press 1964, p. 66.


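6.6.2.3 Free Energies and Specific Heats

Starting from the free energy $F(T, \mathbf{H}_a)$ with the differential

$$ dF = -S\, dT - V \mathbf{M}\, d\mathbf{H}_a , \tag{6.6.6} $$

we can define a new free energy by means of a Legendre transformation

$$ \hat F(T, \mathbf{H}_i) = F(T, \mathbf{H}_a) + \frac{V}{2} M_\alpha D_{\alpha\beta} M_\beta . \tag{6.6.7a} $$

The differential of this free energy is, using (6.6.4b), given by

$$ d\hat F(T, \mathbf{H}_i) = -S\, dT - V \mathbf{M}\, d\mathbf{H}_i . \tag{6.6.7b} $$

Since the entropy $S(T, \mathbf{H}_i)$ and the magnetization $\mathbf{M}(T, \mathbf{H}_i)$ as functions of the internal field must be independent of the shape of the sample, all the derivatives of $\hat F(T, \mathbf{H}_i)$ are shape independent. Therefore, the free energy $\hat F(T, \mathbf{H}_i)$ is itself shape independent. From (6.6.6) and (6.6.7b), it follows that

$$ S = -\left( \frac{\partial F}{\partial T} \right)_{H_a} = -\left( \frac{\partial \hat F}{\partial T} \right)_{H_i} \tag{6.6.8} $$

and

$$ M = -\frac{1}{V} \left( \frac{\partial F}{\partial H_a} \right)_T = -\frac{1}{V} \left( \frac{\partial \hat F}{\partial H_i} \right)_T . \tag{6.6.9} $$

The specific heat can also be defined for a constant internal field

$$ C_{H_i} = \frac{T}{V} \left( \frac{\partial S}{\partial T} \right)_{H_i} \tag{6.6.10a} $$

and for a constant applied (external) field

$$ C_{H_a} = \frac{T}{V} \left( \frac{\partial S}{\partial T} \right)_{H_a} . \tag{6.6.10b} $$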

Using the Jacobian as in Sect. 3.2.4, one can readily obtain the following relations

$$ C_{H_a} = C_{H_i}\, \frac{1}{1 + D \chi_{iT}} , \tag{6.6.11a} $$

$$ C_{H_i} = C_{H_a} + T\, \frac{\left( \frac{\partial M}{\partial T} \right)_{H_a} D \left( \frac{\partial M}{\partial T} \right)_{H_a}}{1 - D \chi_{aT}} \tag{6.6.11b} $$

and

$$ C_{H_a} = C_{H_i} - T\, \frac{\left( \frac{\partial M}{\partial T} \right)_{H_i} D \left( \frac{\partial M}{\partial T} \right)_{H_i}}{1 + D \chi_{iT}} , \tag{6.6.11c} $$

where the index $T$ indicates the isothermal susceptibility. The shape independence of $\chi_i(H_i)$ and $\hat F(T, H_i)$, which is plausible for the physical reasons given above, has also been derived using perturbation-theoretical methods.³¹ For a vanishingly small field, the shape-independence could be proven without resorting to perturbation theory.³²

³¹ P. M. Levy, Phys. Rev. 170, 595 (1968); H. Horner, Phys. Rev. 172, 535 (1968)

³² R. B. Griffiths, Phys. Rev. 176, 655 (1968)
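The relation (6.6.11c) can be traced with a short chain-rule calculation. The following is only a sketch for a field along a principal axis (scalar $D$); it uses nothing beyond (6.6.4b), (6.6.7b) and the definitions (6.6.10a,b), and it is not necessarily the route via Jacobians referred to in the text.

```latex
\begin{align*}
\left(\frac{\partial S}{\partial T}\right)_{H_a}
  &= \left(\frac{\partial S}{\partial T}\right)_{H_i}
   + \left(\frac{\partial S}{\partial H_i}\right)_{T}
     \left(\frac{\partial H_i}{\partial T}\right)_{H_a} , \\
\left(\frac{\partial S}{\partial H_i}\right)_{T}
  &= V \left(\frac{\partial M}{\partial T}\right)_{H_i}
  \qquad \text{(Maxwell relation from (6.6.7b))} , \\
\left(\frac{\partial H_i}{\partial T}\right)_{H_a}
  &= -D \left(\frac{\partial M}{\partial T}\right)_{H_a}
  \qquad \text{(from (6.6.4b))} , \\
\left(\frac{\partial M}{\partial T}\right)_{H_a}
  &= \left(\frac{\partial M}{\partial T}\right)_{H_i}
   - D \chi_{iT} \left(\frac{\partial M}{\partial T}\right)_{H_a}
  \;\Longrightarrow\;
  \left(\frac{\partial M}{\partial T}\right)_{H_a}
  = \frac{1}{1 + D \chi_{iT}}
    \left(\frac{\partial M}{\partial T}\right)_{H_i} , \\
C_{H_a}
  &= C_{H_i}
   - T\, \frac{D \left(\frac{\partial M}{\partial T}\right)_{H_i}^{2}}{1 + D \chi_{iT}} .
\end{align*}
```

The last line is obtained by multiplying the first line by $T/V$ and inserting the three intermediate results.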


6.6.2.4 The Local Field

Along with the internal field, one occasionally also requires the local field $\mathbf{H}_{\rm loc}$. It is the field present at the position of a magnetic moment. One obtains it by imagining a sphere to be centered on the lattice site under consideration, which is large compared to the unit cell but small compared to the overall ellipsoid (see Fig. 6.13). We obtain for the local field:²⁷

$$ \mathbf{H}_{\rm loc} = \mathbf{H}_a + \phi \mathbf{M} \tag{6.6.12a} $$

$$ \text{with} \quad \phi = \phi_0 + \frac{4\pi}{3} - D . \tag{6.6.12b} $$

Here, $\phi_0$ is the sum of the dipole fields of the average moments within the fictitious sphere. The medium outside the imaginary sphere can be treated as a continuum, and its contribution is that of a solid polarized ellipsoid ($-D$), minus that of a polarized sphere ($-\frac{4\pi}{3}$), in agreement with (6.6.12b). For a cubic lattice, $\phi_0$ vanishes for reasons of symmetry.²⁷ One can also introduce a free energy

$$ \hat{\hat F}(T, \mathbf{H}_{\rm loc}) = F(T, \mathbf{H}_a) - \frac{1}{2} V \mathbf{M} \phi \mathbf{M} \tag{6.6.13a} $$

with the differential

$$ d\hat{\hat F} = -S\, dT - V \mathbf{M}\, d\mathbf{H}_{\rm loc} . \tag{6.6.13b} $$

Owing to (6.6.12a,b), (6.6.7a), and (6.6.13a), it follows that

$$ \hat{\hat F}(T, \mathbf{H}_{\rm loc}) = \hat F(T, \mathbf{H}_i) - \frac{1}{2} V \mathbf{M} \left( \phi_0 + \frac{4\pi}{3} \right) \mathbf{M} , \tag{6.6.14} $$

so that $\hat{\hat F}$ differs from $\hat F$ only by a term which is independent of the external shape and is itself therefore shape-independent. One can naturally also

Fig. 6.13. The deﬁnition of the local ﬁeld. An ellipsoid of volume V and a ﬁctitious sphere of volume V0 (schematic, not to scale)


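define susceptibilities at constant $H_{\rm loc}$ and derive relations corresponding to equations (6.6.13a-c) and (6.6.11a-c), in which essentially $H_i$ is replaced by $H_{\rm loc}$ and $D$ by $\phi$.

6.6.3 Statistical–Mechanical Justification

In this subsection, we will give a microscopic justification of the thermodynamic results obtained in the preceding section and derive Hamiltonians for the calculation of the shape-independent free energies $\hat F(T, H_i)$ and $\hat{\hat F}(T, H_{\rm loc})$ of equations (6.6.7a) and (6.6.13a). The magnetic moments will be represented by their mean values and fluctuations. The dipole interaction will be decomposed into a short-range and a long-range part. For the interactions of the fluctuations, the long-range part can be neglected. The starting point will be the Hamiltonian (6.6.1), in which we introduce the fluctuations around (deviations from) the mean value $\langle \mu_l^\alpha \rangle$,

$$ \delta\mu_l^\alpha \equiv \mu_l^\alpha - \langle \mu_l^\alpha \rangle : \tag{6.6.15} $$

$$ \begin{aligned} H &= H_0(\{\mu_l\}) + \frac{1}{2} \sum_{l,l'} A^{\alpha\beta}_{ll'}\, \delta\mu_l^\alpha \delta\mu_{l'}^\beta + \frac{1}{2} \sum_{l,l'} A^{\alpha\beta}_{ll'}\, \langle \mu_l^\alpha \rangle \langle \mu_{l'}^\beta \rangle + \sum_{l,l'} A^{\alpha\beta}_{ll'}\, \delta\mu_l^\alpha \langle \mu_{l'}^\beta \rangle - \sum_l \mu_l^\alpha H_a^\alpha \\ &= H_0(\{\mu_l\}) + \frac{1}{2} \sum_{l,l'} A^{\alpha\beta}_{ll'}\, \delta\mu_l^\alpha \delta\mu_{l'}^\beta - \frac{1}{2} \sum_{l,l'} A^{\alpha\beta}_{ll'}\, \langle \mu_l^\alpha \rangle \langle \mu_{l'}^\beta \rangle - \sum_l \mu_l^\alpha \left( H_a^\alpha + H_{d,l}^\alpha \right) \end{aligned} \tag{6.6.16} $$

with the thermal average of the field at the lattice point $l$ due to the remaining dipoles:

$$ H_{d,l}^\alpha = -\sum_{l'} A^{\alpha\beta}_{ll'}\, \langle \mu_{l'}^\beta \rangle . \tag{6.6.17} $$

For ellipsoids in an external magnetic field, the magnetization is uniform, $\langle \mu_l^\beta \rangle = \frac{V}{N} M^\beta$; likewise the dipole field (demagnetizing field):

$$ H_{d,l}^\alpha = (\phi_0 + D_0 - D)^{\alpha\beta} M^\beta \equiv H_{\rm loc}^\alpha - H_a^\alpha . \tag{6.6.18} $$

In going from (6.6.17) to (6.6.18), the dipole sum

$$ \phi^{\alpha\beta} = -\frac{V}{N} \sum_{l'} A^{\alpha\beta}_{ll'} = (\phi_0 + D_0 - D)^{\alpha\beta} \tag{6.6.19} $$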


was decomposed into a discrete sum over the subvolume $V_0$ (the Lorentz sphere) and the region $V - V_0$, in which a continuum approximation can be applied:

$$ (D_0 - D)^{\alpha\beta} = -\int_{V - V_0} d^3 x\, \frac{\partial}{\partial x_\alpha} \frac{\partial}{\partial x_\beta} \frac{1}{|x|} = \delta_{\alpha\beta} \left( \int_{S_1} df_\alpha\, \frac{\partial}{\partial x_\beta} \frac{1}{|x|} - \int_{S_2} df_\alpha\, \frac{\partial}{\partial x_\beta} \frac{1}{|x|} \right) . \tag{6.6.20} $$

The first surface integral extends over the surface of the Lorentz sphere and the second over the (external) surface of the ellipsoid (sample). With this, we can write the Hamiltonian in the form

$$ H = H_0(\{\mu_l\}) + \frac{1}{2} \sum_{l,l'} A^{\alpha\beta}_{ll'}\, \delta\mu_l^\alpha \delta\mu_{l'}^\beta - \sum_l \mu_l^\alpha H_{\rm loc}^\alpha + \frac{1}{2} V M_\alpha \phi_{\alpha\beta} M_\beta . \tag{6.6.21} $$

Since the long-range property of the dipole interaction plays no role in the interaction between the fluctuations $\delta\mu_l$, the first two terms in the Hamiltonian are shape-independent. The sample shape enters only in the local field $H_{\rm loc}$ and in the fourth term on the right-hand side. Comparison with (6.6.13a) shows that the free energy $\hat{\hat F}(T, H_{\rm loc})$, which, apart from its dependence on $H_{\rm loc}$, is shape independent, can be determined by computation of the partition function with the first three terms of (6.6.21). If the dipole interaction between the fluctuations is completely neglected,³³ one obtains the approximate effective Hamiltonian

$$ \hat{\hat H} = H_0(\{\mu_l\}) - \sum_l \mu_l H_{\rm loc} , \tag{6.6.22} $$

in which the dipole interaction expresses itself only in the demagnetizing field.

The exact treatment of the second term, $\frac{1}{2} \sum_{l,l'} A^{\alpha\beta}_{ll'}\, \delta\mu_l^\alpha \delta\mu_{l'}^\beta$ in (6.6.21), is carried out as follows: since the expectation value based on approximate application of the Ornstein–Zernike theory decreases as $\langle \delta\mu_l \delta\mu_{l'} \rangle \approx e^{-r_{ll'}/\xi} / r_{ll'}$, and $A_{ll'} \sim 1/r_{ll'}^3$, the interaction of the fluctuations is negligible at large distances. The shape of the sample thus plays no role in this term in the limit $V \to \infty$ with the shape kept unchanged. One can thus replace $A^{\alpha\beta}_{ll'}$ by

$$ A^{\sigma\,\alpha\beta}_{ll'} = \frac{\partial}{\partial x_\alpha} \frac{\partial}{\partial x_\beta} \frac{e^{-\sigma |x|}}{|x|} , \tag{6.6.23} $$

with the cutoff length $\sigma^{-1}$, or more precisely

$$ \frac{1}{2} \sum_{l,l'} A^{\alpha\beta}_{ll'}\, \delta\mu_l^\alpha \delta\mu_{l'}^\beta = \lim_{\sigma \to 0} \lim_{V \to \infty} \frac{1}{2} \sum_{l,l'} A^{\sigma\,\alpha\beta}_{ll'}\, \delta\mu_l^\alpha \delta\mu_{l'}^\beta . \tag{6.6.24} $$

³³ J. H. van Vleck, J. Chem. Phys. 5, 320 (1937), Eq. (36).


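Inserting $\delta\mu_l = \mu_l - \langle \mu_l \rangle$, we obtain for the right-hand side of (6.6.24)

$$ \begin{aligned} &\lim_{\sigma \to 0} \lim_{V \to \infty} \frac{1}{2} \sum_{l,l'} A^{\sigma\,\alpha\beta}_{ll'} \left( \mu_l^\alpha \mu_{l'}^\beta - 2 \mu_l^\alpha \langle \mu_{l'}^\beta \rangle + \langle \mu_l^\alpha \rangle \langle \mu_{l'}^\beta \rangle \right) \\ &\quad = \lim_{\sigma \to 0} \lim_{V \to \infty} \frac{1}{2} \sum_{l,l'} A^{\sigma\,\alpha\beta}_{ll'}\, \mu_l^\alpha \mu_{l'}^\beta + \sum_l (\phi_0 + D_0)^{\alpha\beta} M^\beta \mu_l^\alpha - \frac{V}{2} (\phi_0 + D_0) M^2 . \end{aligned} \tag{6.6.25} $$

In the order: first the thermodynamic limit $V \to \infty$, then $\sigma \to 0$, the first term in (6.6.25) is shape-independent. Since in the second and third terms, the sum over $l'$ is cut off by $e^{-|x_l - x_{l'}| \sigma}$, the contribution $-D$ due to the external boundary of the ellipsoid does not appear here. Inserting (6.6.24) and (6.6.25) into (6.6.21), we find the Hamiltonian in final form³⁴

$$ H = \hat H - \frac{V}{2} M D M \tag{6.6.26a} $$

with

$$ \hat H = H_0(\{\mu_l\}) + \int \frac{d^3 q}{(2\pi)^3}\, v_a\, A_q^{\alpha\beta}\, \mu_q^\alpha \mu_{-q}^\beta - \sum_l \mu_l^\alpha H_i^\alpha . \tag{6.6.26b} $$

Here, the Fourier transforms

$$ \mu_q^\alpha = \frac{1}{\sqrt{N}} \sum_l e^{-i q x_l} \mu_l^\alpha , \tag{6.6.27a} $$

$$ A_q^{\alpha\beta} = \sum_{l \neq 0} A^{\alpha\beta}_{l0}\, e^{-i q (x_l - x_0)} \tag{6.6.27b} $$

and the internal field $\mathbf{H}_i = \mathbf{H}_a - D\mathbf{M}$ have been introduced. The Fourier transform (6.6.27b) can be evaluated using the Ewald method³⁵; for cubic lattices, it yields³⁶

$$ A_q^{\alpha\beta} = \frac{1}{v_a} \left\{ \frac{4\pi}{3} \left( \frac{3 q^\alpha q^\beta}{q^2} - \delta^{\alpha\beta} \right) + \alpha_1 q^\alpha q^\beta + \left( \alpha_2 q^2 - \alpha_3 (q^\alpha)^2 \right) \delta^{\alpha\beta} + O\!\left( q^4 , (q^\alpha)^4 , (q^\alpha)^2 (q^\beta)^2 \right) \right\} , \tag{6.6.27b'} $$

where $v_a$ is the volume of the primitive unit cell and the $\alpha_i$ are constants which depend on the lattice structure. The first two terms in $\hat H$, Eq. (6.6.26b),

³⁴ See also W. Finger, Physica 90 B, 251 (1977).

³⁵ P. P. Ewald, Ann. Phys. 54, 57 (1917); ibid., 54, 519 (1917); ibid., 64, 253 (1921)

³⁶ M. H. Cohen and F. Keffer, Phys. Rev. 99, 1135 (1955); A. Aharony and M. E. Fisher, Phys. Rev. B 8, 3323 (1973)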


are shape-independent. The sample shape enters only into the internal field $H_i$ and in the last term of (6.6.26a). Comparison of Eq. (6.6.26a) with Eq. (6.6.7a) shows that the shape-independent free energy $\hat F(T, H_i)$ can be calculated from the partition function derived from the Hamiltonian $\hat H$, Eq. (6.6.26b). We note in particular the nonanalytic behavior of the term $q^\alpha q^\beta / q^2$ in the limit $q \to 0$; it is caused by the $1/r^3$-dependence of the dipole interaction. Due to this term, the longitudinal and transverse wavenumber-dependent susceptibilities (with respect to the wavevector) are different from each other.³⁷ We recall that the short-ranged exchange interaction can be expanded as a Taylor series in $q$:

$$ H_0 = -\frac{1}{2} \int d^3 q\, \tilde J(q)\, \mu_q \mu_{-q} , \qquad \tilde J(q) = \tilde J - J q^2 + O(q^4) . \tag{6.6.28} $$

In addition to the effects of the demagnetizing field and the resulting shape dependence, which we have treated in detail, the dipole interaction, even though it is in general much weaker than the exchange interaction, has a number of important consequences owing to its long range and its anisotropic character:³⁷

(i) It changes the values of the critical exponents in the neighborhood of ferromagnetic phase transitions;
(ii) it can stabilize magnetic order in systems of low dimensionality, which otherwise would not occur due to the large thermal fluctuations;
(iii) the total magnetic moment $\mu = \sum_l \mu_l$ is no longer conserved. This has important consequences for the dynamics; and
(iv) the dipole interaction is important in nuclear magnetism, where it is larger than or comparable to the indirect exchange interaction.

We can now include the dipole interactions in the results of Sects. 6.1 to 6.5 in the following manner:

(i) If we neglect the dipole interaction between the fluctuations of the magnetic moments $\delta\mu_l = \mu_l - \langle \mu_l \rangle$ as an approximation, we can take the spatially uniform part of the dipole fields into account by replacing the field $H$ by the local field $H_{\rm loc}$.
(ii) If, in addition to the exchange interactions possibly present, we also include the dipole interaction between the fluctuations, then according to (6.6.26), the complete Hamiltonian contains the internal field $H_i$. The field $H$ must therefore be replaced by $H_i$; furthermore, the shape-dependent term $-\frac{V}{2} M D M$ enters into the Hamiltonian $H$, Eq. (6.6.26a), and, via the term $\hat H$, also the shape-independent part of the dipole interaction, i.e. Eq. (6.6.27b').
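The difference between longitudinal and transverse response caused by the nonanalytic term can be illustrated with a few lines of code. The sketch below assumes that the leading small-q part of the dipole tensor has the standard form (4π/3v_a)(3 q^α q^β/q² − δ^{αβ}); the higher-order lattice-dependent terms are ignored, and the function names are illustrative.

```python
import math


def dipole_leading(q, v_a=1.0):
    """Leading small-q dipole tensor (4 pi / 3 v_a) (3 q^a q^b / q^2 - delta^ab) as a 3x3 list."""
    q2 = sum(c * c for c in q)
    pref = 4.0 * math.pi / (3.0 * v_a)
    return [
        [pref * (3.0 * q[a] * q[b] / q2 - (1.0 if a == b else 0.0)) for b in range(3)]
        for a in range(3)
    ]


def apply_tensor(t, vec):
    """Matrix-vector product of a 3x3 nested list with a 3-vector."""
    return [sum(t[a][b] * vec[b] for b in range(3)) for a in range(3)]
```

Under this assumption, a moment parallel to q is multiplied by +8π/(3v_a), while a moment perpendicular to q is multiplied by −4π/(3v_a). The tensor depends only on the direction of q and not on its magnitude, which is why it has no unique limit for q → 0.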

³⁷ E. Frey and F. Schwabl, Advances in Physics 43, 577 (1994)


6.6.4 Domains

The spontaneous magnetization per spin, $m_0(T)$, is shown in Fig. 6.5. The total magnetic moment of a uniformly magnetized sample without an external field would be $N m_0(T)$, and its spontaneous magnetization per unit volume $M_0(T) = N m_0(T)/V$, where $N$ is the overall number of magnetic moments. In fact, as a rule the magnetic moment is smaller or even zero. This results from the fact that a sample in general breaks up into domains with different directions of magnetization. Within each domain, $|\mathbf{M}(x, T)| = M_0(T)$. Only when an external field is applied do the domains which are oriented parallel to the field direction grow at the cost of the others, and reorientation occurs until finally $N m_0(T)$ has been reached. The spontaneous magnetization is therefore also called the saturation magnetization. We want to illustrate domain formation, making use of two examples.

(i) One possible domain structure in a ferromagnetic bar below $T_c$ is shown in Fig. 6.14. One readily sees that for the configuration with 45° walls throughout the sample,

$$ \mathrm{div}\, \mathbf{M} = 0 . \tag{6.6.29} $$

Then it follows from the basic equations of magnetostatics

$$ \mathrm{div}\, \mathbf{H}_i = -4\pi\, \mathrm{div}\, \mathbf{M} \tag{6.6.30a} $$

$$ \mathrm{curl}\, \mathbf{H}_i = 0 \tag{6.6.30b} $$

that, in the interior of the sample,

$$ \mathbf{H}_i = 0 , \tag{6.6.31} $$

and thus also $\mathbf{B} = 4\pi \mathbf{M}$ in the interior. From the continuity conditions it follows that $\mathbf{B} = \mathbf{H} = 0$ outside the sample. The domain configuration is therefore energetically more favorable than a uniformly magnetized sample.

(ii) Domain structures also express themselves in a measurement of the total magnetic moment $\mathcal{M}$ of a sphere. The calculated magnetization $M = \mathcal{M}/V$ as a function of the applied field is indicated by the curves in Fig. 6.15.


Fig. 6.14. The domain structure in a prism-shaped sample


Fig. 6.15. The magnetization within a sphere as a function of the external ﬁeld Ha , T1 < T2 < Tc ; D is the demagnetizing factor.

Let the magnetization within a uniformly magnetized region as a function of the internal field $H_i = H_a - DM$ be given by the function $M = M(H_i)$. As long as the overall magnetization of the sphere is less than the saturation magnetization, the domains have a structure such that $H_i = 0$, and therefore $M = \frac{1}{D} H_a$ must hold.38 For $H_a = D M_{\rm spont}$, the sample is finally uniformly magnetized, corresponding to the saturation magnetization. For $H_a > D M_{\rm spont}$, $M$ can be calculated from $M = M(H_a - DM)$.

6.7 Applications to Related Phenomena

In this section, we discuss consequences of the results of this chapter on magnetism for other areas of physics: polymer physics, negative temperatures and the melting curve of $^3$He.

6.7.1 Polymers and Rubber-like Elasticity

Polymers are long chain molecules which are built up of similar links, the monomers. The number of monomers is typically $N \approx 100{,}000$. Examples of polymers are polyethylene, (CH$_2$)$_N$, polystyrene, (C$_8$H$_8$)$_N$, and rubber, (C$_5$H$_8$)$_N$, where the number of monomers is $N > 100{,}000$ (see Fig. 6.16).

Fig. 6.16. The structures of polyethylene and polystyrene

38

S. Arajs and R. V Calvin, J. Appl. Phys. 35, 2424 (1964).


To find a description of the mechanical and thermal properties, we set up the following simple model (see Fig. 6.17): the starting point in space of monomer 1 is denoted by $\mathbf X_1$, and that of a general monomer $i$ by $\mathbf X_i$. The position (orientation) of the $i$th monomer is then given by the vector $\mathbf S_i \equiv \mathbf X_{i+1} - \mathbf X_i$:

$$\mathbf S_1 = \mathbf X_2 - \mathbf X_1,\ \ldots,\ \mathbf S_i = \mathbf X_{i+1} - \mathbf X_i,\ \ldots,\ \mathbf S_N = \mathbf X_{N+1} - \mathbf X_N. \tag{6.7.1}$$

We now assume that aside from the chain linkage of the monomers there are no interactions at all between them, and that they can freely assume any arbitrary orientation, i.e. $\langle \mathbf S_i \cdot \mathbf S_j \rangle = 0$ for $i \neq j$. The length of a monomer is denoted by $a$, i.e. $\mathbf S_i^2 = a^2$.

Fig. 6.17. A polymer, composed of a chain of monomers

Since the line connecting the two ends of the polymer can be represented in the form

$$\mathbf X_{N+1} - \mathbf X_1 = \sum_{i=1}^{N} \mathbf S_i, \tag{6.7.2}$$

it follows that

$$\langle \mathbf X_{N+1} - \mathbf X_1 \rangle = 0. \tag{6.7.3}$$

Here, we average independently over all possible orientations of the $\mathbf S_i$. The last equation means that the coiled polymer chain is oriented randomly in space, but makes no statement about its typical dimensions. A suitable measure of the mean square length is

$$\left\langle (\mathbf X_{N+1} - \mathbf X_1)^2 \right\rangle = \Big\langle \Big(\sum_{i=1}^{N} \mathbf S_i\Big)^2 \Big\rangle = a^2 N. \tag{6.7.4}$$

We define the so-called radius of gyration

$$R \equiv \sqrt{\left\langle (\mathbf X_{N+1} - \mathbf X_1)^2 \right\rangle} = a N^{1/2}, \tag{6.7.5}$$

which characterizes the size of the polymer coil; it grows as the square root of the number of monomers.
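The scaling in Eqs. (6.7.4) and (6.7.5) can be checked with a small Monte Carlo experiment. The following is only a sketch under the assumptions of the model above (independent, isotropically oriented links of length a); the function names, the number of sampled chains and the use of NumPy are illustrative choices and not part of the text.

```python
import numpy as np

def random_unit_vectors(rng, shape):
    """Draw isotropically distributed unit vectors in three dimensions."""
    v = rng.normal(size=shape + (3,))
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def mean_square_end_to_end(n_monomers, a=1.0, n_chains=5000, seed=0):
    """Estimate <(X_{N+1} - X_1)^2> for chains of freely oriented links."""
    rng = np.random.default_rng(seed)
    links = a * random_unit_vectors(rng, (n_chains, n_monomers))  # the S_i
    end_to_end = links.sum(axis=1)                                # sum_i S_i
    return float(np.mean(np.sum(end_to_end**2, axis=1)))

if __name__ == "__main__":
    for n in (10, 100, 400):
        r2 = mean_square_end_to_end(n)
        # The ratio r2 / n should stay close to a^2 = 1, up to sampling noise.
        print(n, r2, r2 / n)
```

Within the statistical error of the sample, the ratio of the estimated mean square length to N should be independent of N, which is the content of $R = aN^{1/2}$.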


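In order to study the elastic properties, we allow a force to act on the ends of the polymer, i.e. the force $\mathbf F$ acts on $\mathbf X_{N+1}$ and the force $-\mathbf F$ on $\mathbf X_1$ (see Fig. 6.17). Under the influence of this tensile force, the energy depends on the positions of the two ends:

$$\mathcal H = -(\mathbf X_{N+1} - \mathbf X_1)\cdot \mathbf F = -\left[(\mathbf X_{N+1} - \mathbf X_N) + (\mathbf X_N - \mathbf X_{N-1}) + \ldots + (\mathbf X_2 - \mathbf X_1)\right]\cdot \mathbf F = -\mathbf F \cdot \sum_{i=1}^{N} \mathbf S_i. \tag{6.7.6}$$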

Polymers under tension can therefore be mapped onto the problem of a paramagnet in a magnetic field, Sect. 6.3. The force corresponds to the applied magnetic field in the paramagnetic case, and the length of the polymer chain to the magnetization. Thus, the thermal average of the distance vector between the ends of the chain is

$$\mathbf L = \Big\langle \sum_{i=1}^{N} \mathbf S_i \Big\rangle = N a \left( \coth\frac{aF}{kT} - \frac{kT}{aF} \right) \frac{\mathbf F}{F}. \tag{6.7.7}$$

We have used the Langevin function for classical moments in this expression, Eq. (6.3.12b), and multiplied by the unit vector in the direction of the force, $\mathbf F / F$. If $aF$ is small compared to $kT$, we find (corresponding to Curie's law)

$$\mathbf L = \frac{N a^2}{3kT}\, \mathbf F. \tag{6.7.8}$$

For the change in the length, we obtain from the previous equation

$$\frac{\partial L}{\partial F} \sim \frac{1}{T} \tag{6.7.9a}$$

and

$$\frac{\partial L}{\partial T} = -\frac{N a^2}{3kT^2}\, |\mathbf F|. \tag{6.7.9b}$$

The length change per unit force or the elastic constant decreases with increasing temperature according to (6.7.9a). A still more spectacular result is that for the expansion coefficient $\partial L / \partial T$: rubber contracts when its temperature is increased! This is in complete contrast to crystals, which as a rule expand with increasing temperature. The reason for the elastic behavior of rubber is easy to see: the higher the temperature, the more dominant is the entropy term in the free energy, $F = E - TS$, which strives towards a minimum. The entropy increases, i.e. the polymer becomes increasingly disordered or coiled and therefore pulls together. The general dependence of the length on $a|\mathbf F|/kT$ is shown in Fig. 6.18.
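The behavior described by Eqs. (6.7.7) to (6.7.9b) can be made concrete numerically. The sketch below evaluates the Langevin expression for the chain length and estimates $\partial L/\partial T$ by a finite difference; the function names, the default values N = 1000, a = k = 1 and the step size are assumptions made for illustration only.

```python
import math

def langevin(x):
    """Langevin function coth(x) - 1/x for classical moments."""
    if abs(x) < 1e-4:
        # Series x/3 - x^3/45 avoids the cancellation of two large terms.
        return x / 3.0 - x**3 / 45.0
    return 1.0 / math.tanh(x) - 1.0 / x

def chain_length(force, temperature, n_monomers=1000, a=1.0, k=1.0):
    """Mean end-to-end length L = N a [coth(aF/kT) - kT/(aF)], Eq. (6.7.7)."""
    return n_monomers * a * langevin(a * force / (k * temperature))

def expansion_coefficient(force, temperature, dT=1e-4, **params):
    """Central-difference estimate of dL/dT at fixed force."""
    upper = chain_length(force, temperature + dT, **params)
    lower = chain_length(force, temperature - dT, **params)
    return (upper - lower) / (2 * dT)

if __name__ == "__main__":
    for force in (0.01, 1.0, 10.0):
        print(force, chain_length(force, 1.0), expansion_coefficient(force, 1.0))
```

For small $aF/kT$ the computed length should approach $Na^2F/3kT$ and the derivative with respect to $T$ should approach $-Na^2F/3kT^2$; for large forces the length saturates below $Na$, while the sign of $\partial L/\partial T$ stays negative.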


Fig. 6.18. The length of a polymer under the inﬂuence of a tensile force F.

Remark: In the model considered here, we have not taken into account that a monomer has a limited freedom of orientation, since each position can be occupied by at most one monomer. In a theory which takes this effect into account, the dependence $R = aN^{1/2}$ in Eq. (6.7.5) is replaced by $R = aN^{\nu}$. The exponent $\nu$ has a significance analogous to that of the exponent of the correlation length in phase transitions, and the degree of polymerization (chain length) $N$ corresponds to the reciprocal distance from the critical point, $\tau^{-1}$. The properties of polymers, in which the volume already occupied is excluded, correspond to a random motion in which the path cannot lead to a point already passed through (self-avoiding random walk). The properties of both these phenomena follow from the $n$-component $\phi^4$ model (see Sect. 7.4.5) in the limit $n \to 0$.39 An approximate formula for $\nu$ is due to Flory: $\nu_{\rm Flory} = 3/(d + 2)$.

6.7.2 Negative Temperatures

In isolated systems whose energy levels are bounded above and below, thermodynamic states with negative absolute temperatures can be established. Examples of such systems with energy levels that are bounded towards higher energies are two-level systems or paramagnets in an external magnetic field $h$. We consider a paramagnet consisting of $N$ spins of quantum number $S = 1/2$ with an applied field along the $z$ direction. Considering the quantum numbers of the Pauli spin matrices $\sigma_l = \pm 1$, the Hamiltonian has the following diagonal structure

$$\mathcal H = -h \sum_l \sigma_l. \tag{6.7.10}$$

The magnetization per lattice site is defined by $m = \langle \sigma \rangle$ and is independent of the lattice position $l$. The entropy is given by

$$S(m) = -kN \left[ \frac{1+m}{2} \log\frac{1+m}{2} + \frac{1-m}{2} \log\frac{1-m}{2} \right] = -k \left[ N_+ \log\frac{N_+}{N} + N_- \log\frac{N_-}{N} \right], \tag{6.7.11}$$

39

P.-G. de Gennes, Scaling Concepts in Polymer Physics, Cornell University Press, Ithaca, 1979.

and the internal energy $E$ depends on the magnetization via

$$E = -N h m = -h (N_+ - N_-), \tag{6.7.12}$$

with $N_\pm = N(1 \pm m)/2$. These expressions follow immediately from the treatment in the microcanonical ensemble (Sect. 2.5.2.2) and can also be obtained from Sect. 6.3 by elimination of $T$ and $B$. For $m = 1$ (all spins parallel to the field $h$), the energy is $E = -Nh$; for $m = -1$ (all spins antiparallel to $h$), the energy is $E = Nh$. The entropy is given in Fig. 2.9 as a function of the energy. It is maximal for $E = 0$, i.e. in the state of complete disorder. The temperature is obtained by taking the derivative of the entropy with respect to the energy:

$$T = \left( \frac{\partial S}{\partial E} \right)_h^{-1} = \frac{2h}{k} \left[ \log\frac{1+m}{1-m} \right]^{-1}. \tag{6.7.13}$$
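Equations (6.7.11) to (6.7.13) can be checked against each other numerically. The following sketch, with illustrative function names and the assumption of units in which h = k = 1 by default, evaluates the entropy and energy per spin and compares the closed form for T with a finite-difference estimate of $\partial S/\partial E$.

```python
import math

def entropy_per_spin(m, k=1.0):
    """S/N from Eq. (6.7.11) for magnetization per site -1 <= m <= 1."""
    s = 0.0
    for p in ((1 + m) / 2, (1 - m) / 2):  # fractions N+/N and N-/N
        if p > 0:                         # p log p -> 0 as p -> 0
            s -= k * p * math.log(p)
    return s

def energy_per_spin(m, h=1.0):
    """E/N from Eq. (6.7.12)."""
    return -h * m

def temperature(m, h=1.0, k=1.0):
    """T from Eq. (6.7.13), valid for -1 < m < 1; infinite at m = 0."""
    if not -1 < m < 1:
        raise ValueError("m must lie strictly between -1 and 1")
    if m == 0:
        return math.inf
    return 2 * h / (k * math.log((1 + m) / (1 - m)))

def inverse_temperature_numeric(m, h=1.0, k=1.0, dm=1e-6):
    """1/T = dS/dE by central differences along the curve parametrized by m."""
    dS = entropy_per_spin(m + dm, k) - entropy_per_spin(m - dm, k)
    dE = energy_per_spin(m + dm, h) - energy_per_spin(m - dm, h)
    return dS / dE

if __name__ == "__main__":
    for m in (0.9, 0.5, 0.1, -0.1, -0.5, -0.9):  # ordered by increasing energy
        print(m, energy_per_spin(m), temperature(m),
              1 / inverse_temperature_numeric(m))
```

Running through the values of m in order of increasing energy, the printed temperature is positive for m > 0 and negative for m < 0, and the two estimates of T should agree to within the finite-difference error.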

The temperature is shown as a function of the energy in Fig. 2.10. In the interval $0 < m \le 1$, i.e. $-1 \le E/Nh < 0$, the temperature is positive, as usual. For $m < 0$, that is when the magnetization is oriented antiparallel to the magnetic field, the absolute temperature becomes negative, i.e. $T < 0$! With increasing energy, the temperature $T$ goes from $0$ to $\infty$, then through $-\infty$, and finally to $-0$. Negative temperatures thus belong to higher energies, and are therefore “hotter” than positive temperatures. In a state with a negative temperature, more spins are in the excited state than in the ground state. One can also see that negative temperatures are in fact hotter than positive ones by bringing two such systems into thermal contact. Take system 1 to have the positive temperature $T_1 > 0$ and system 2 the negative temperature $T_2 < 0$. We assume that the exchange of energy takes place quasistatically; then the total entropy is $S = S_1(E_1) + S_2(E_2)$ and the (constant) total energy is $E = E_1 + E_2$. From the increase of entropy, it follows with $\frac{dE_2}{dt} = -\frac{dE_1}{dt}$ that

$$0 < \frac{dS}{dt} = \frac{\partial S_1}{\partial E_1}\frac{dE_1}{dt} + \frac{\partial S_2}{\partial E_2}\frac{dE_2}{dt} = \left( \frac{1}{T_1} - \frac{1}{T_2} \right) \frac{dE_1}{dt}. \tag{6.7.14}$$

Since the factor in brackets, $\frac{1}{T_1} + \frac{1}{|T_2|}$, is positive, $\frac{dE_1}{dt} > 0$ must also hold; this means that energy flows from subsystem 2 at a negative temperature into subsystem 1. We emphasize that the energy dependence of $S(E)$ represented in Fig. 2.9 and the negative temperatures which result from it are a direct consequence of the boundedness of the energy levels. If the energy levels were not bounded


from above, then a finite energy input could not lead to an infinite temperature or even beyond it. We also note that the specific heat per lattice site of this spin system is given by

$$C = Nk \left( \frac{2h}{kT} \right)^2 \frac{e^{2h/kT}}{\left( 1 + e^{2h/kT} \right)^2} \tag{6.7.15}$$

and vanishes both at $T = \pm 0$ as well as at $T = \pm\infty$. We now discuss two examples of negative temperatures:

(i) Nuclear spins in a magnetic field: The first experiment of this kind was carried out by Purcell and Pound40 in a nuclear magnetic resonance experiment using the nuclear spins of $^7$Li in LiF. The spins were first oriented at the temperature $T$ by the field $H$. Then the direction of $H$ was so quickly reversed that the nuclear spins could not follow it, that is faster than a period of the nuclear spin precession. The spins are then in a state with the negative temperature $-T$. The mutual interaction of the spins is characterized by their spin-spin relaxation time of $10^{-5}$ to $10^{-6}$ sec. This interaction is important, since it allows the spin system to reach internal equilibrium; it is however negligible for the energy levels in comparison to the Zeeman energy. For nuclear spins, the interaction with the lattice in this material is so slow (the spin-lattice relaxation time is 1 to 10 min) that the spin system can be regarded as completely isolated for times in the range of seconds. The state of negative temperature is maintained for some minutes, until the magnetization reverses through interactions with the lattice and the temperature returns to its initial value of $T$. In dilute gases, a state of spin inversion with a lifetime of days can be established.

(ii) Lasers (pulsed lasers, ruby lasers): By means of irradiation with light, the atoms of the laser medium are excited (Fig. 6.19). The excited electron drops into a metastable state. When more

Fig. 6.19. Examples of negative temperatures: (a) nuclear spins in a magnetic ﬁeld H, which is rotated by 180◦ (b) a ruby laser. The “pump” raises electrons into an excited state. The electron can fall into a metastable state by emission of a photon. When a population inversion is established, the temperature is negative 40

E. M. Purcell and R. V. Pound, Phys. Rev. 81, 279 (1951)


electrons are in this excited state than in the ground state, i.e. when a population inversion has been established, the state is described by a negative temperature.

∗6.7.3 The Melting Curve of $^3$He

The anomalous behavior of the melting curve of $^3$He (Fig. 6.20) is related to the magnetic properties of solid $^3$He.41 As we already discussed in connection

Fig. 6.20. The melting curve of $^3$He at low temperatures

with the Clausius–Clapeyron equation, $\frac{dP}{dT} = \frac{S_S - S_L}{V_S - V_L}$, … This leads to the Pomeranchuk effect, already mentioned in Sect. 3.8.2. The above estimate of $T_{\min}$ yields a value which is a factor of 2 smaller than the experimental result, $T_{\min}^{\rm exp} = 0.3$ K. This results from the value of $S_L$, which is too large. Compared to an ideal gas, there are correlations in an interacting Fermi liquid which, as can be understood intuitively, lead to a lowering of its entropy and to a larger value of $T_{\min}$.

Before the discovery of the two superfluid phases of $^3$He, the existence of a maximum in the melting curve below $10^{-3}$ K was theoretically discussed.41 It was expected due to the $T^3$-dependence of the specific heat in the antiferromagnetically ordered phase and the linear specific heat of the Fermi liquid. This picture however changed with the discovery of the superfluid phases of $^3$He (see Fig. 4.10). The specific heat of the liquid behaves at low temperatures like $e^{-\Delta/kT}$, with a constant $\Delta$ (energy gap), and therefore the melting curve rises for $T \to 0$ and has the slope 0 at $T = 0$.


Problems for Chapter 6

6.1 Derive (6.1.24c) for the Hamiltonian of (6.1.25), by taking the second derivative of $A(T, M) = -kT \log \operatorname{Tr} e^{-\beta \mathcal H} + HM$ with respect to $T$ for fixed $M$.

6.2 The classical paramagnet: Consider a system of $N$ non-interacting, classical magnetic moments $\boldsymbol\mu_i$ ($\sqrt{\boldsymbol\mu_i^2} = m$) in a magnetic field $\mathbf H$, with the Hamiltonian $\mathcal H = -\sum_{i=1}^{N} \boldsymbol\mu_i \mathbf H$. Calculate the classical partition function, the free energy, the entropy, the magnetization, and the isothermal susceptibility. Refer to the suggestions following Eq. (6.3.12b).

6.3 The quantum-mechanical paramagnet, in analogy to the main text:
(a) Calculate the entropy and the internal energy of an ideal paramagnet as a function of $T$. Show that for $T \to \infty$, $S = Nk \ln(2J + 1)$, and discuss the temperature dependence in the vicinity of $T = 0$.
(b) Compute the heat capacities $C_H$ and $C_M$ for a non-interacting spin-1/2 system.

6.4 The susceptibility and mean square deviation of harmonic oscillators: Consider a quantum-mechanical harmonic oscillator with a charge $e$ in an electric field $E$

$$\mathcal H = \frac{p^2}{2m} + \frac{m\omega^2}{2} x^2 - eEx.$$

Show that the dielectric susceptibility is given by

$$\chi = \frac{\partial \langle ex \rangle}{\partial E} = \frac{e^2}{m\omega^2}$$

and that the mean square deviation takes the form

$$\langle x^2 \rangle = \frac{\hbar}{2\omega m} \coth\frac{\beta\hbar\omega}{2},$$

from which it follows that

$$\chi = \frac{2 \tanh\frac{\beta\hbar\omega}{2}}{\hbar\omega}\, e^2 \langle x^2 \rangle.$$

Compare these results with the paramagnetism of non-coupled magnetic moments! Take account of the difference between rigid moments and induced moments, and the resulting different temperature dependences of the susceptibility. Take the classical limit $\beta\hbar\omega \ll 1$.


6.5 Consider a solid with $N$ degrees of freedom, which are each characterized by two energy levels at $\Delta$ and $-\Delta$. Show that

$$E = -N\Delta \tanh\frac{\Delta}{kT}, \qquad C = \frac{dE}{dT} = Nk \left( \frac{\Delta}{kT} \right)^2 \frac{1}{\cosh^2\frac{\Delta}{kT}}$$

holds. How does the specific heat behave for $T \gg \Delta/k$ and for $T \ll \Delta/k$?

6.6 When the system described in 6.5 is disordered, so that all values of $\Delta$ within the interval $0 \le \Delta \le \Delta_0$ occur with equal probabilities, show that then the specific heat for $kT \ll \Delta_0$ is proportional to $T$. Hint: The internal energy of this system can be found from problem 6.5 by averaging over all values of $\Delta$. This serves as a model for the linear specific heat of glasses at low temperatures.

6.7 Demonstrate the validity of the fluctuation-response theorem, Eq. (6.5.35).

6.8 Two defects are introduced into a ferromagnet at the sites $x_1$ and $x_2$, and produce there the magnetic fields $h_1$ and $h_2$. Calculate the interaction energy of these defects for $|x_1 - x_2| > \xi$. For which signs of the $h_i$ is there an attractive interaction of the defects? Suggestions: The energy in the molecular field approximation is $\bar E = \sum_{l,l'} \langle S_l \rangle \langle S_{l'} \rangle J(l - l')$. For each individual defect, $\langle S_l \rangle_{1,2} = G(x_l - x_{1,2})\, h_{1,2}$, where $G$ is the Ornstein–Zernike correlation function. For two defects which are at a considerable distance apart, $\langle S_l \rangle$ can be approximated as a linear superposition of the single-defect averages. The interaction energy can be obtained by calculating $\bar E$ for this linear superposition and subtracting the energies of the single defects.

6.9 The one-dimensional Ising model: Calculate the partition function $Z_N$ for a one-dimensional Ising model with $N$ spins obeying the Hamiltonian

$$\mathcal H = -\sum_{i=1}^{N-1} J_i S_i S_{i+1}.$$

Hint: Prove the recursion relation $Z_{N+1} = 2 Z_N \cosh(J_N / kT)$.

6.10 (a) Calculate the two-spin correlation function $G_{i,n} := \langle S_i S_{i+n} \rangle$ for the one-dimensional Ising model in problem 6.9. Hint: The correlation function can be found by taking the appropriate derivatives of the partition function with respect to the interactions. Observe that $S_i^2 = 1$. Result: $G_{i,n} = \tanh^n(J/kT)$ for $J_i = J$.
(b) Determine the behavior of the correlation length defined by $G_{i,n} = e^{-n/\xi}$ for $T \to 0$.
(c) Calculate the susceptibility from the fluctuation-response theorem:

$$\chi = \frac{(g\mu_B)^2}{kT} \sum_{i=1}^{N} \sum_{j=1}^{N} \langle S_i S_j \rangle.$$

Hint: Consider how many terms with $|i - j| = 0$, $|i - j| = 1$, $|i - j| = 2$ etc. occur in the double sum. Compute the geometric series which appear.
Result:

$$\chi = \frac{(g\mu_B)^2}{kT} \left\{ N\, \frac{1+\alpha}{1-\alpha} - \frac{2\alpha \left( 1 - \alpha^N \right)}{(1-\alpha)^2} \right\}; \qquad \alpha = \tanh\frac{J}{kT}.$$

(d) Show that in the thermodynamic limit ($N \to \infty$), $\chi \propto \xi$ for $T \to 0$, and thus $\gamma/\nu = 1$.
(e) Plot $\chi^{-1}$ in the thermodynamic limit as a function of temperature.
(f) How can one obtain from this the susceptibility of an antiferromagnetically coupled linear chain? Plot and discuss $\chi$ as a function of temperature.

6.11 Show that in the molecular-field approximation for the Ising model, the internal energy $E$ is given by

$$\frac{1}{N} E = -\frac{1}{2} kT_c m^2 - hm$$

and the entropy $S$ by

$$S = kN \left[ -\frac{T_c}{T} m^2 - \frac{1}{kT} hm + \log\left( 2\cosh\frac{kT_c m + h}{kT} \right) \right].$$

Inserting the equation of state, show also that

$$S = -kN \left( \frac{1+m}{2} \log\frac{1+m}{2} + \frac{1-m}{2} \log\frac{1-m}{2} \right).$$

Finally, expand $a(T, m) = e - Ts + mh$ up to the 4th power in $m$.

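6.12 An improvement of the molecular field theory for an Ising spin system can be obtained as follows (Bethe–Peierls approximation): the interaction of a spin $\sigma_0$ with its $z$ neighbors is treated exactly. The remaining interactions are taken into account by means of a molecular field $h'$, which acts only on the $z$ neighbors. The Hamiltonian is then given by:

$$\mathcal H = -h' \sum_{j=1}^{z} \sigma_j - J \sum_{j=1}^{z} \sigma_0 \sigma_j - h\sigma_0.$$

The applied field $h$ acts directly on the central spin and is likewise included in $h'$. $h'$ is determined self-consistently from the condition $\langle \sigma_0 \rangle = \langle \sigma_j \rangle$.
(a) Show that the partition function $Z(h', T)$ has the form

$$Z = \left[ 2\cosh\left( \frac{h'}{kT} + \frac{J}{kT} \right) \right]^z e^{h/kT} + \left[ 2\cosh\left( \frac{h'}{kT} - \frac{J}{kT} \right) \right]^z e^{-h/kT} = Z_+ + Z_-.$$

(b) Calculate the average values $\langle \sigma_0 \rangle$ and $\langle \sigma_j \rangle$, for simplicity with $h = 0$.
Result:

$$\langle \sigma_0 \rangle = (Z_+ - Z_-)/Z,$$

$$\langle \sigma_j \rangle = \frac{1}{z} \sum_{j=1}^{z} \langle \sigma_j \rangle = \frac{1}{z} \frac{\partial}{\partial\left( \frac{h'}{kT} \right)} \log Z = \frac{1}{Z} \left[ Z_+ \tanh\left( \frac{h'}{kT} + \frac{J}{kT} \right) + Z_- \tanh\left( \frac{h'}{kT} - \frac{J}{kT} \right) \right].$$

(c) The equation $\langle \sigma_0 \rangle = \langle \sigma_j \rangle$ has a nonzero solution below $T_c$:

$$\frac{h'}{kT(z-1)} = \frac{1}{2} \log\frac{\cosh\left( \frac{J}{kT} + \frac{h'}{kT} \right)}{\cosh\left( \frac{J}{kT} - \frac{h'}{kT} \right)}.$$

Determine $T_c$ and $h'$ by expanding the equation in terms of $\frac{h'}{kT}$.
Result:

$$\tanh\frac{J}{kT_c} = 1/(z-1),$$

$$\left( \frac{h'}{kT} \right)^2 = 3\, \frac{\cosh^3(J/kT)}{\sinh(J/kT)} \left\{ \tanh\frac{J}{kT} - \frac{1}{z-1} \right\} + \ldots.$$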

6.13 In the so-called Weiss model, each of the $N$ spins interacts equally strongly with every other spin:

$$\mathcal H = -\frac{1}{2} J \sum_{l,l'} \sigma_l \sigma_{l'} - h \sum_l \sigma_l.$$

Here, $J = \hat J / N$. This model can be solved exactly; show that it yields the result of molecular field theory.

6.14 Magnons (= spin waves) in ferromagnets. The Heisenberg Hamiltonian, which gives a satisfactory description of certain ferromagnets, is given by

$$\mathcal H = -\frac{1}{2} \sum_{l,l'} J(|\mathbf x_l - \mathbf x_{l'}|)\, \mathbf S_l \mathbf S_{l'},$$

where $l$ and $l'$ are nearest neighbors on a cubic lattice. By applying the Holstein–Primakoff transformation,

$$S_l^z = S - n_l, \qquad S_l^+ = \sqrt{2S}\, \varphi(n_l)\, a_l, \qquad S_l^- = \sqrt{2S}\, a_l^\dagger\, \varphi(n_l),$$

($S_l^\pm = S_l^x \pm i S_l^y$) with $\varphi(n_l) = \sqrt{1 - n_l/2S}$, $n_l = a_l^\dagger a_l$ and $[a_l, a_{l'}^\dagger] = \delta_{ll'}$, as well as $[a_l, a_{l'}] = 0$, the spin operators are transformed into Bose operators.
(a) Show that the commutation relations for the spin operators are fulfilled.
(b) Represent the Heisenberg Hamiltonian up to second order (harmonic approximation) in the Bose operators $a_l$ by expanding the square roots in the above transformation in a Taylor series.
(c) Diagonalize $\mathcal H$ (by means of a Fourier transformation) and determine the magnon dispersion relations.

6.15 (a) Show that a magnon lowers the z-component of the total spin operator $S^z \equiv \sum_l S_l^z$ by $\hbar$.
(b) Calculate the temperature dependence of the magnetization.
(c) Show that in a one- and a two-dimensional spin lattice, there can be no ferromagnetic order at finite temperatures!


6.16 Assume a Heisenberg model in an external field $\mathbf H$,

$$\mathcal H = -\frac{1}{2} \sum_{l,l'} J(l - l')\, \mathbf S_l \mathbf S_{l'} - \boldsymbol\mu \cdot \mathbf H, \qquad \boldsymbol\mu = -\frac{g\mu_B}{\hbar} \sum_l \mathbf S_l.$$

Show that the isothermal susceptibilities $\chi_{||}$ (parallel to $\mathbf H$) and $\chi_\perp$ (perpendicular to $\mathbf H$) are not negative. Suggestions: Include an additional field $\Delta\mathbf H$ in the Hamiltonian and take the derivative with respect to this field. For $\chi_{||}$, i.e. $\Delta\mathbf H\, ||\, \mathbf H$, the assertion follows as in Sect. 3.3 for the compressibility. For an arbitrarily oriented $\Delta\mathbf H$, it is expedient to use the expansion given in Appendix C.

6.17 Denote the specific heat at constant magnetization by $c_M$, and at constant field by $c_H$. Show that relation (6.1.22c) holds for the isothermal and the adiabatic susceptibility. Volume changes of the magnetic material are to be neglected here.

6.18 A paramagnetic material obeys the Curie law

$$M = c\, \frac{H}{T},$$

where $c$ is a constant. Show, keeping in mind $T\,dS = dE - H\,dM$, that

$$dT_{\rm ad} = \frac{H c}{c_H T}\, dH$$

for an adiabatic change (keeping the volume constant). $c_H$ is the specific heat at constant magnetic field.

6.19 A paramagnetic substance obeys the Curie law $M = \frac{c}{T} H$ ($c$ const.) and its internal energy $E$ is given by $E = aT^4$ ($a > 0$, const.).
(a) What quantity of heat $\delta Q$ is released on isothermal magnetization if the magnetic field is increased from 0 to $H_1$?
(b) How does the temperature change if the field is now reduced adiabatically from $H_1$ to 0?

6.20 Prove the relationships between the shape-dependent and the shape-independent specific heat (6.6.11a), (6.6.11b) and (6.6.11c).

6.21 Polymers in a restricted geometry: Consider a polymer which is in a cone-shaped box (as shown). Why does the polymer move towards the larger opening? (No calculation necessary!)

7. Phase Transitions, Scale Invariance, Renormalization Group Theory, and Percolation

This chapter builds upon the results of the two preceding chapters dealing with the ferromagnetic phase transition and the gas-liquid transition. We start with some general considerations on symmetry breaking and phase transitions. Then a variety of phase transitions and critical points are discussed, and analogous behavior is pointed out. Subsequently, we deal in detail with critical behavior and give its phenomenological description in terms of static scaling theory. In the section that follows, we discuss the essential ideas of renormalization group theory on the basis of a simple model, and use it to derive the scaling laws. Finally, we introduce the Ginzburg–Landau theory; it provides an important cornerstone for the various approximation methods in the theory of critical phenomena. The ﬁrst, introductory section of this chapter exhibits the richness and variety of phase-transition phenomena and tries to convey the fascination of this ﬁeld to the reader. It represents a departure from the main thrust of this book, since it oﬀers only phenomenological descriptions without statistical, theoretical treatment. All of these manifold phenomena connected with phase transitions can be described by a single uniﬁed theory, the renormalization group theory, whose theoretical eﬃcacy is so great that it is also fundamental to the quantum ﬁeld theory of elementary particles.

7.1 Phase Transitions and Critical Phenomena

7.1.1 Symmetry Breaking, the Ehrenfest Classification

The fundamental laws of Nature governing the properties of matter (Maxwell’s electrodynamics, the Schrödinger equation of a many-body system) exhibit a number of distinct symmetry properties. They are invariant with respect to spatial and temporal translations, with respect to rotations and inversions. The states which exist in Nature do not, in general, display the full symmetry of the underlying natural principles. A solid is invariant only with respect to the discrete translations and rotations of its point group. Matter can furthermore exist in different states of aggregation or phases, which differ in their symmetry and as a result in their thermal, mechanical,


and electromagnetic properties. The external conditions (pressure $P$, temperature $T$, magnetic field $H$, electric field $E$, ...) determine in which of the possible phases a chemical substance with particular internal interactions will present itself. If the external forces or the temperature are changed, at particular values of these quantities the system can undergo a transition from one phase to another: a phase transition takes place.

The Ehrenfest Classification: as is clear from the examples of phase transitions already treated, the free energy (or some other suitable thermodynamic potential) is a non-analytic function of a control parameter at the phase transition. The following classification of phase transitions, due to Ehrenfest, is commonly used: a phase transition of n-th order is defined by the property that at least one of the n-th derivatives of its thermodynamic potential is discontinuous, while all the lower derivatives are continuous at the transition. When one of the first derivatives shows a discontinuity, we speak of a first-order phase transition; when the first derivatives vary continuously but the second derivatives exhibit discontinuities or singularities, we speak of a second-order phase transition (or critical point), or of a continuous phase transition.

The understanding of the question as to which phases will be adopted by a particular material under particular conditions certainly belongs among the most interesting topics of the physics of condensed matter. Due to the differing properties of different phases, this question is also of importance for materials applications. Furthermore, the behavior of matter in the vicinity of phase transitions is also of fundamental interest. Here, we wish to indicate two aspects in particular: why is it that despite the short range of the interactions, one observes long-range correlations of the fluctuations in the vicinity of a critical point $T_c$ and long-range order below $T_c$? And secondly, what is the influence of the internal symmetry of the order parameter? Fundamental questions of this type are of importance far beyond the field of condensed-matter physics. Renormalization group theory was originally developed in the framework of quantum field theory. In connection with critical phenomena, it was formulated by Wilson1 in such a way that the underlying structure of nonlinear field theories became apparent, and that also allowed systematic and detailed calculations. This decisive breakthrough led not only to an enormous increase in the knowledge and deeper understanding of condensed matter, but also had important repercussions for the quantum-field theoretical applications of renormalization group theory in elementary particle physics.

∗7.1.2 Examples of Phase Transitions and Analogies

We begin by describing the essential features of phase transitions, referring to Chaps. 5 and 6, where the analogy and the common features between 1

K. G. Wilson, Phys. Rev. B 4, 3174, 3184 (1971)


Fig. 7.1a,b. Phase diagrams of (a) a liquid (P -T ) and (b) a ferromagnet (H-T ). (Triple point = T.P., critical point = C.P.)

the liquid-gas transition and the ferromagnetic phase transition were already mentioned, and here we take up their analysis. In Fig. 7.1a,b, the phase diagrams of a liquid and a ferromagnet are shown. The two ferromagnetic ordering possibilities for an Ising ferromagnet (spin “up” and spin “down”) correspond to the liquid and the gaseous phases. The critical point corresponds to the Curie temperature. As a result of the symmetry of the Hamiltonian for H = 0 with respect to the operation σl → −σl for all l, the phase boundary is situated symmetrically in the H-T plane. Ferromagnetic order is characterized by the order parameter m at H = 0. It is zero above Tc and ± m0 below Tc , as shown in the M -T diagram in Fig. 7.1d. The corresponding quantity for the liquid can be seen in the V -T diagram of Fig. 7.1c. Here, the order parameter is (ρL − ρc ) or (ρG − ρc ). In everyday life, we usually observe the liquid-gas transition at constant pressure far below Pc . On heating, the density changes discontinuously as a function of the temperature. Therefore, the vaporization transition is usually considered to be a ﬁrst-order phase transition and the critical point is the end point of the vaporization curve, at which the diﬀerence between the gas and the liquid ceases to exist. The analogy between the gas-liquid and the ferromagnetic transitions becomes clearer if one investigates the liquid in a so called Natterer tube2 . This is a sealed tube in which the substance thus has a ﬁxed, given density. If one chooses the amount of material so that the density is equal to the critical density ρc , then above Tc there is a ﬂuid phase, while on cooling, this phase splits up into a denser liquid phase separated from the less dense gas phase by a meniscus. This corresponds to cooling a ferromagnet 2

See the reference in Sect. 3.8.


Fig. 7.1c,d. The order parameter for (c) the gas-liquid transition (below, two Natterer tubes are illustrated), and for (d) the ferromagnetic transition

at $H = 0$. Above $T_c$, the disordered paramagnetic state is present, while below it, the sample splits up into (at least two) negatively and positively oriented ferromagnetic phases.3 Fig. 7.1e,f shows the isotherms in the $P$-$V$ and $M$-$H$ diagrams. The similarity of the isotherms becomes clear if the second picture is rotated by 90°. In ferromagnets, the particular symmetry again expresses itself. Since the phase boundary curve in the $P$-$T$ diagram of the liquid is slanted, the horizontal sections of the isotherms in the $P$-$V$ diagram are not congruent. Finally, Fig. 7.1g,h illustrates the surface of the equation of state. The behavior in the immediate vicinity of a critical point is characterized by power laws with critical exponents which are summarized for ferromagnets and liquids in Table 7.1. As in Chaps. 5 and 6, $\tau = \frac{T - T_c}{T_c}$. The critical exponents $\beta$, $\gamma$, $\delta$, $\alpha$ for the order parameter, the susceptibility, the critical isotherm, and the specific heat are the goal of theory and experiment. Additional analogies will be seen later in connection with the correlation functions and the scattering phenomena which follow from them.

3

³ In Ising systems there are two magnetization directions; in Heisenberg systems without an applied field, the magnetization can be oriented in any arbitrary direction, since the Hamiltonian (6.5.2) is rotationally invariant.

7.1 Phase Transitions and Critical Phenomena

Fig. 7.1e,f. The isotherms (e) in the P -V and (f ) in the M -H diagram

Fig. 7.1g,h. The surface of the equation of state for a liquid (g) and for a ferromagnet (h)

The general definition of the value of a critical exponent of a function $f(T - T_c)$ which is not a priori a pure power law is given by
\[ \text{exponent} = \lim_{T \to T_c} \frac{d \log f(T - T_c)}{d \log (T - T_c)} . \tag{7.1.1} \]

When $f$ has the form $f = a + (T - T_c)$, one finds:
\[ \frac{d \log (a + T - T_c)}{d \log (T - T_c)} = \frac{1}{a + (T - T_c)} \cdot \frac{1}{\dfrac{d \log (T - T_c)}{d (T - T_c)}} = \frac{T - T_c}{a + (T - T_c)} \longrightarrow 0 . \]

When $f$ is logarithmically divergent, the following expression holds:
\[ \frac{d \log \log (T - T_c)}{d \log (T - T_c)} = \frac{1}{\log (T - T_c)} \longrightarrow 0 . \]
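The definition (7.1.1) can be checked numerically. The following is a minimal sketch, not part of the original text: the helper `log_derivative`, the step size `eps`, and the three test functions (with the arbitrary values $a = 2$ and exponent $-1.25$) are illustrative assumptions. It estimates the logarithmic derivative by a central difference in $\log(T - T_c)$ and evaluates it closer and closer to the critical point.

```python
import math

def log_derivative(f, x, eps=1e-4):
    """Central-difference estimate of d log|f(x)| / d log x for x = T - Tc > 0."""
    up = math.log(abs(f(x * (1.0 + eps))))
    down = math.log(abs(f(x * (1.0 - eps))))
    return (up - down) / (math.log(1.0 + eps) - math.log(1.0 - eps))

# Hypothetical test functions of x = T - Tc (values chosen only for illustration)
power_law = lambda x: 3.0 * x ** -1.25     # pure power law with exponent -1.25
with_background = lambda x: 2.0 + x        # f = a + (T - Tc) with a = 2
logarithmic = lambda x: math.log(x)        # logarithmic divergence

for x in (1e-2, 1e-4, 1e-8):
    print(x,
          log_derivative(power_law, x),
          log_derivative(with_background, x),
          log_derivative(logarithmic, x))
```

For the pure power law the estimate stays at the exponent itself, while for the other two functions it tends to zero as $x \to 0$; in the logarithmic case the approach is slow, since the estimate behaves as $1/\log x$.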

In these two cases, the value of the critical exponent is zero. The first case occurs for the specific heat in the molecular field approximation, the second for the specific heat of the two-dimensional Ising model. The reason for introducing critical exponents even for such cases can be seen from the scaling laws which will be treated in the next section. To distinguish between the different meanings of the exponent zero (discontinuity and logarithm), one can write $0_d$ and $0_{\log}$.

Table 7.1. Ferromagnets and Liquids: Critical Exponents

| Quantity | Ferromagnet | Liquid | Critical behavior | Range |
|---|---|---|---|---|
| Order parameter | $M$ | $(V_{G,L} - V_c)$ or $(\rho_{G,L} - \rho_c)$ | $\propto (-\tau)^{\beta}$ | $T < T_c$ |
| Isothermal susceptibility | Magnetic susceptibility $\chi_T = \left(\frac{\partial M}{\partial H}\right)_T$ | Isothermal compressibility $\kappa_T = -\frac{1}{V}\left(\frac{\partial V}{\partial P}\right)_T$ | $\propto \lvert\tau\rvert^{-\gamma}$ | $T \gtrless T_c$ |
| Critical isotherm ($T = T_c$) | $H = H(M)$ | $P = P(V - V_c)$ | $\sim M^{\delta}$, $\sim (V - V_c)^{\delta}$ | $T = T_c$ |
| Specific heat | $C_{M=0} = C_{H=0} = T\left(\frac{\partial S}{\partial T}\right)_H$ | $C_V = T\left(\frac{\partial S}{\partial T}\right)_V$ | $\propto \lvert\tau\rvert^{-\alpha}$ | $T \gtrless T_c$ |

We want to list just a few examples from among the multitude of phase transitions⁴. In the area of magnetic substances, one finds antiferromagnets (e.g. with two sublattices having opposite directions of magnetization $M_1$ and $M_2$), ferrimagnets, and helical phases. In an antiferromagnet with two sublattices, the order parameter is $N = M_1 - M_2$, the so-called staggered magnetization. In binary liquid mixtures, there are separation transitions, where the order parameter characterizes the concentration. In the case of structural phase transitions, the lattice structure changes at the transition, and the order parameter is given by the displacement field or the strain tensor. Examples are ferroelectrics⁵ and distortive transitions, where the order parameter is given by e.g. the electric polarization $P$ or the rotation angle $\varphi$ of a molecular group. Finally, there are transitions into macroscopic quantum states, i.e. superfluidity and superconductivity. Here, the order parameter is a complex field $\psi$, the macroscopic wavefunction, and the broken symmetry is the gauge invariance with respect to the phase of $\psi$. In the liquid-solid transition, the translational symmetry is broken and the order parameter is

⁴ We mention two review articles in which the literature up to 1966 is summarized: M. E. Fisher, The Theory of Equilibrium Critical Phenomena, p. 615; and P. Heller, Experimental Investigations of Critical Phenomena, p. 731, both in Reports on Progress in Physics XXX (1967).

⁵ In a number of structural phase transitions, the order parameter jumps discontinuously to a finite value at the transition temperature. In this case, according to Ehrenfest’s classification, we are dealing with a first-order phase transition.

a component of the Fourier-transformed density. This transition line does not end in a critical point. Table 7.2 lists the order parameter and an example of a typical substance for some of these phase transitions.

Table 7.2. Phase transitions (critical points), order parameters, and substances

| Phase transition | Order parameter | Substance |
|---|---|---|
| Paramagnet–ferromagnet (Curie temperature) | Magnetization $M$ | Fe |
| Paramagnet–antiferromagnet (Néel temperature) | Staggered magnetization $N = M_1 - M_2$ | RbMnF$_3$ |
| Gas-liquid (critical point) | Density $\rho - \rho_c$ | CO$_2$ |
| Separation of binary liquid mixtures | Concentration $c - c_c$ | Methanol–n-Hexane |
| Order–disorder transitions | Sublattice occupation $N_A - N_B$ | Cu-Zn |
| Paraelectric–ferroelectric | Polarization $P$ | BaTiO$_3$ |
| Distortive structural transitions | Rotation angle $\varphi$ | SrTiO$_3$ |
| Elastic phase transitions | Strain | KCN |
| He I–He II (lambda point) | Bose condensate $\Psi$ | $^4$He |
| Normal conductor–superconductor | Cooper-pair amplitude $\Delta$ | Nb$_3$Sn |

In general, the order parameter is understood to be a quantity which is zero above the critical point and finite below it, and which characterizes the structural or other changes which occur in the transition, such as the expectation value of lattice displacements or a component of the total magnetic moment. To clarify some concepts, we discuss at this point a generalized anisotropic, ferromagnetic Heisenberg model:
\[ H = -\frac{1}{2} \sum_{l,l'} \Big\{ J_{\parallel}(l - l')\, \sigma_l^z \sigma_{l'}^z + J_{\perp}(l - l') \big( \sigma_l^x \sigma_{l'}^x + \sigma_l^y \sigma_{l'}^y \big) \Big\} - h \sum_l \sigma_l^z , \tag{7.1.2} \]
where $\boldsymbol{\sigma}_l = (\sigma_l^x, \sigma_l^y, \sigma_l^z)$ is the three-dimensional Pauli spin operator at lattice site $x_l$ and $N$ is the number of lattice sites. This Hamiltonian contains the uniaxial ferromagnet for $J_{\parallel}(l - l') > J_{\perp}(l - l') \ge 0$, and for $J_{\perp}(l - l') = J_{\parallel}(l - l')$, it describes the isotropic Heisenberg model (6.5.2). In the former case, the

order parameter referred to the number of lattice sites ($h = 0$) is the single-component quantity $\frac{1}{N} \sum_l \langle \sigma_l^z \rangle$, i.e. the number of components $n$ is $n = 1$. In the latter case, the order parameter is $\frac{1}{N} \sum_l \langle \boldsymbol{\sigma}_l \rangle$, which can point in any arbitrary direction ($h = 0$!); here, the number of components is $n = 3$. For $J_{\perp}(l - l') > J_{\parallel}(l - l') \ge 0$, we find the so-called planar ferromagnet, in which the order parameter $\frac{1}{N} \sum_l \langle (\sigma_l^x, \sigma_l^y, 0) \rangle$ has two components, $n = 2$. A special case of the uniaxial ferromagnet is the Ising model (6.5.4), with $J_{\perp}(l - l') = 0$. The uniaxial ferromagnet has the following symmetry elements: all rotations around the $z$-axis, the discrete symmetry $(\sigma_l^x, \sigma_l^y, \sigma_l^z) \to (\sigma_l^x, \sigma_l^y, -\sigma_l^z)$, and products thereof. Below $T_c$, the invariance with respect to this discrete symmetry is broken. In the planar ferromagnet, the (continuous) rotational symmetry around the $z$-axis is broken, and in the case of the isotropic Heisenberg model, the O(3) symmetry (i.e. the rotational invariance around an arbitrary axis) is broken. One could ask why, e.g. for the Ising Hamiltonian without an external field, $\frac{1}{N} \sum_l \langle \sigma_l \rangle$ can ever be nonzero, since from the invariance operation $\{\sigma_l^z\} \to \{-\sigma_l^z\}$, it follows that $\frac{1}{N} \sum_l \langle \sigma_l^z \rangle = -\frac{1}{N} \sum_l \langle \sigma_l^z \rangle$. In a finite system, $\frac{1}{N} \sum_l \langle \sigma_l^z \rangle_h$ is analytic in $h$ for finite $h$, and
\[ \lim_{h \to 0} \frac{1}{N} \sum_l \langle \sigma_l^z \rangle_h = 0 . \tag{7.1.3} \]

For finite $N$, configurations with spins oriented opposite to the field also contribute to the partition function, and their weight increases with decreasing values of $h$. The mathematically precise definition of the order parameter is:
\[ \langle \sigma \rangle = \lim_{h \to 0} \lim_{N \to \infty} \frac{1}{N} \sum_l \langle \sigma_l^z \rangle_h ; \tag{7.1.4} \]

first, the thermodynamic limit $N \to \infty$ is taken, and then $h \to 0$. This quantity can be nonzero below $T_c$. For $N \to \infty$, states with the ‘wrong’ orientation have vanishing weights in the partition function for arbitrarily small but finite fields.

7.1.3 Universality

In the vicinity of critical points, the topology of the phase diagrams of such diverse systems as a gas-liquid mixture and a ferromagnetic material is astonishingly similar; see Fig. 7.1. Furthermore, experiments and computer simulations show that the critical exponents for the corresponding phase transitions for broad classes of physical systems are the same and depend only on the number of components and the symmetry of the order parameter, the spatial dimension, and the character of the interactions, i.e. whether short-ranged or long-ranged (e.g. Coulomb, dipolar forces). This remarkable feature is termed universality. The microscopic details of these strongly interacting many-body

systems express themselves only in the prefactors (amplitudes) of the power laws, and even the ratios of these amplitudes are universal numbers. The reason for this remarkable result lies in the divergence of the correlation length $\xi = \xi_0 \left( \frac{T - T_c}{T_c} \right)^{-\nu}$. On approaching $T_c$, $\xi$ becomes the only relevant length scale of the system, which at long distances dominates all of the microscopic scales. Although the phase transition is caused as a rule by short-range interactions of the microscopic constituents, due to the long-range fluctuations (see 6.12), the dependence on the microscopic details such as the lattice structure, the lattice constant, or the range of the interactions (as long as they are short-ranged) is secondary. In the critical region, the system behaves collectively, and only global features such as its spatial dimension and its symmetry play a role; this makes the universal behavior understandable. The universality of critical phenomena is not limited to materials classes, but instead it extends beyond them. For example, the static critical behavior of the gas-liquid transition is the same as that of Ising ferromagnets. Planar ferromagnets behave just like $^4$He at the lambda point. Even without making use of renormalization group theory, these relationships can be understood with the aid of the following transformations⁶: the grand partition function of a gas can be approximately mapped onto that of a lattice gas which is equivalent to a magnetic Ising model (occupied/unoccupied cells $\hat{=}$ spin up/down). The Hamiltonian of a Bose liquid can be mapped onto that of a planar ferromagnet. The gauge invariance of the Bose Hamiltonian corresponds to the two-dimensional rotational invariance of the planar ferromagnet.

7.2 The Static Scaling Hypothesis⁷

7.2.1 Thermodynamic Quantities and Critical Exponents

In this section, we discuss the analytic structure of the thermodynamic quantities in the vicinity of the critical point and draw from it typical conclusions about the critical exponents. This generally applicable procedure will be demonstrated using the terminology of ferromagnetism. In the neighborhood of $T_c$, the equation of state according to Eq. (6.5.16) takes on the form

⁶ See e.g. M. E. Fisher, op. cit., and problem 7.16.

⁷ Although the so-called scaling theory of critical phenomena can be derived microscopically through renormalization group theory (see Sect. 7.3.4), it is expedient for the following reasons to first introduce it on a phenomenological basis: (i) as a motivation for the procedures of renormalization group theory; (ii) as an illustration of the structure of scaling considerations for physical situations where field-theoretical treatments based on renormalization group theory are not yet available (e.g. for many nonequilibrium phenomena). Scaling treatments, starting from critical phenomena and high-energy scaling in elementary particle physics, have acquired a great influence in the most diverse fields.

\[ \frac{h}{kT_c} = \tau m + \frac{1}{3} m^3 , \tag{7.2.1} \]
which can be rearranged as follows: $\frac{1}{kT_c} \frac{h}{|\tau|^{3/2}} = \mathrm{sgn}(\tau) \frac{m}{|\tau|^{1/2}} + \frac{1}{3} \left( \frac{m}{|\tau|^{1/2}} \right)^3$. Solving for $m$, we obtain the following dependence of $m$ on $\tau$ and $h$:
\[ m(\tau, h) = |\tau|^{1/2}\, m_{\pm}\!\left( \frac{h}{|\tau|^{3/2}} \right) \quad \text{for } T \gtrless T_c . \tag{7.2.2} \]
The functions $m_{\pm}$ for $T \gtrless T_c$ are determined by (7.2.1). In the vicinity of the critical point, the magnetization depends on $\tau$ and $h$ in a very special way: apart from the factor $|\tau|^{1/2}$, it depends only on the ratio $h/|\tau|^{3/2}$. The magnetization is a generalized homogeneous function of $\tau$ and $h$. This implies that (7.2.2) is invariant with respect to the scale transformation
\[ h \to h b^3 , \quad \tau \to \tau b^2 , \quad \text{and} \quad m \to m b . \]
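The scaling form (7.2.2) can be illustrated numerically. The sketch below is not from the original text: it sets $kT_c = 1$, restricts itself to $h > 0$, and uses a hypothetical bisection helper `magnetization` to find the stable (largest) root of (7.2.1). If (7.2.2) holds, the ratio $m/|\tau|^{1/2}$ evaluated at fixed $y = h/|\tau|^{3/2}$ should not depend on $|\tau|$, separately for each sign of $\tau$.

```python
import math

def magnetization(tau, h):
    """Largest root m > 0 of h = tau*m + m**3/3 for h > 0 (units with k*Tc = 1)."""
    g = lambda m: tau * m + m ** 3 / 3.0 - h
    lo = math.sqrt(max(-tau, 0.0))  # g(lo) < 0 for h > 0, so the stable root lies above lo
    hi = lo + 1.0
    while g(hi) <= 0.0:             # enlarge the bracket until the sign changes
        hi *= 2.0
    for _ in range(200):            # plain bisection down to machine precision
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def scaled_magnetization(tau, y):
    """m / |tau|^(1/2) evaluated at the field h = y * |tau|^(3/2)."""
    h = y * abs(tau) ** 1.5
    return magnetization(tau, h) / abs(tau) ** 0.5

for sign in (+1, -1):
    for y in (0.1, 1.0, 10.0):
        values = [scaled_magnetization(sign * t, y) for t in (1e-1, 1e-2, 1e-3)]
        print("T > Tc" if sign > 0 else "T < Tc", y, values)
```

Each printed triple should agree up to rounding, which is the data collapse onto the two functions $m_{+}$ and $m_{-}$.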

This scaling invariance of the physical properties expresses itself for example in the specific heat of $^4$He at the lambda point (Fig. 7.2). We know from Chap. 6 and Table 7.1 that the real critical exponents differ from their molecular-field values in (7.2.2). It is therefore reasonable to extend the equation of state (7.2.2) to arbitrary critical exponents⁸:
\[ m(\tau, h) = |\tau|^{\beta}\, m_{\pm}\!\left( \frac{h}{|\tau|^{\delta\beta}} \right) ; \tag{7.2.3} \]
in this expression, $\beta$ and $\delta$ are critical exponents and the $m_{\pm}$ are called scaling functions. At the present stage, (7.2.3) remains a hypothesis; it is, however, possible to prove this hypothesis using renormalization group theory, as we shall demonstrate later in Sect. 7.3, for example in Eq. (7.3.40). For the present, we take (7.2.3) as given and ask what its general consequences are. The two scaling functions $m_{\pm}(y)$ must fulfill certain boundary conditions which follow from the critical properties listed in Eq. (6.5.31) and Table 7.1. The magnetization is always oriented parallel to $h$ when the applied field is nonzero and remains finite in the limit $h \to 0$ below $T_c$, while above $T_c$, it goes to zero:

⁸ In addition to being a natural generalization of molecular field theory, one can understand the scaling hypothesis (7.2.3) by starting with the fact that singularities are present only for $\tau = 0$ and $h = 0$. How strong the effects of the singularities will be depends on the distance from the critical point, $\tau$, and on $h/|\tau|^{\beta\delta}$, i.e. the ratio between the applied field and the field-equivalent of $\tau$; that is $h_{\tau} = m^{\delta} = |\tau|^{\beta\delta}$. As long as $h \ll h_{\tau}$, the system is effectively in the low-field limit and $m \approx |\tau|^{\beta} m_{\pm}(0)$. On the other hand, if $\tau$ becomes so small that $|\tau| \le h^{1/\beta\delta}$, then the influence of the applied field predominates. Any additional reduction of $\tau$ produces no further change: $m$ remains at the value which it had for $|\tau| = h^{1/\beta\delta}$, i.e. $h^{1/\delta} m_{\pm}(1)$. In the limit $\tau \to 0$, $m_{\pm}(y) \to y^{1/\delta}$ must hold, so that the singular dependence on $\tau$ in $m(\tau, h)$ cancels out.

Fig. 7.2. The specific heat at constant pressure, $c_P$, at the lambda transition of $^4$He. The shape of the specific heat stays the same on changing the temperature scale (1 K to $10^{-6}$ K)

\[ \lim_{y \to 0} m_{-}(y) = \mathrm{sgn}\, y , \qquad m_{+}(0) = 0 . \tag{7.2.4a} \]

The thermodynamic functions are non-analytic precisely at $\tau = 0$, $h = 0$. For nonzero $h$, the magnetization is finite over the whole range of temperatures and remains an analytic function of $\tau$ even for $\tau = 0$; the $|\tau|^{\beta}$ dependence of (7.2.3) must be compensated by the function $m_{\pm}(h/|\tau|^{\delta\beta})$. Therefore, the two functions $m_{\pm}$ must behave as
\[ \lim_{y \to \infty} m_{\pm}(y) \propto y^{1/\delta} \tag{7.2.4b} \]

for large arguments. It follows from this that for $\tau = 0$, i.e. at the critical point, $m \sim h^{1/\delta}$. The scaling functions $m_{\pm}(y)$ are plotted in Fig. 7.3. Eq. (7.2.3), like the molecular-field version of the scaling law given above, requires that the magnetization must be a generalized homogeneous function of $\tau$ and $h$ and is therefore invariant with respect to scale transformations:
\[ h \to h b^{\beta\delta/\nu} , \quad \tau \to \tau b^{1/\nu} , \quad \text{and} \quad m \to m b^{\beta/\nu} . \]

The name scaling law is derived from this scale invariance. Equation (7.2.3) contains additional information about the thermodynamics; by integration,

Fig. 7.3. The qualitative behavior of the scaling functions m±

we can determine the free energy and by taking suitable derivatives we can find the magnetic susceptibility and the specific heat. From these we obtain relations between the critical exponents. For the susceptibility, we find the scaling law from Eq. (7.2.3):
\[ \chi \equiv \left( \frac{\partial m}{\partial h} \right)_T = |\tau|^{\beta - \delta\beta}\, m'_{\pm}\!\left( \frac{h}{|\tau|^{\delta\beta}} \right) , \tag{7.2.5} \]
and in the limit $h \to 0$, we thus have $\chi \propto |\tau|^{\beta - \delta\beta}$. It then follows that the critical exponent of the susceptibility, $\gamma$ (Eq. (6.5.31c)), is given by
\[ \gamma = -\beta (1 - \delta) . \tag{7.2.6} \]

The specific free energy is found through integration of (7.2.3):
\[ f - f_0 = -\int_{h_0}^{h} dh'\, m(\tau, h') = -|\tau|^{\beta + \delta\beta} \int_{h_0/|\tau|^{\delta\beta}}^{h/|\tau|^{\delta\beta}} dx\, m_{\pm}(x) . \]

Here, $h_0$ must be sufficiently large so that the starting point for the integration lies outside the critical region. The free energy then takes on the following form:
\[ f(\tau, h) = |\tau|^{\beta + \delta\beta}\, \hat{f}_{\pm}\!\left( \frac{h}{|\tau|^{\beta\delta}} \right) + f_{\text{reg}} . \tag{7.2.7} \]
In this expression, $\hat{f}_{\pm}$ is defined by the value resulting from the upper limit of the integral and $f_{\text{reg}}$ is the non-singular part of the free energy. The specific heat at constant magnetic field is obtained by taking the second derivative of (7.2.7),
\[ c_h = -\frac{\partial^2 f}{\partial \tau^2} \sim A_{\pm} |\tau|^{\beta(1 + \delta) - 2} + B_{\pm} . \tag{7.2.8} \]

The $A_{\pm}$ in this expression are amplitudes and the $B_{\pm}$ come from the regular part. Comparison with the behavior of the specific heat as characterized by the critical exponent $\alpha$ (Eq. (6.5.31d)) yields
\[ \alpha = 2 - \beta (1 + \delta) . \tag{7.2.9} \]

The relations between the critical exponents are termed scaling relations, since they follow from the scaling laws for the thermodynamic quantities. If we add (7.2.6) and (7.2.9), we obtain
\[ \gamma + 2\beta = 2 - \alpha . \tag{7.2.10} \]

From (7.2.6) and (7.2.9), one can see that the remaining thermodynamic critical exponents are determined by $\beta$ and $\delta$.

7.2.2 The Scaling Hypothesis for the Correlation Function

In the molecular field approximation, we obtained the Ornstein–Zernike behavior in Eqns. (6.5.50) and (6.5.53′) for the wavevector-dependent susceptibility $\chi(q)$ and the correlation function $G(x)$:
\[ \chi(q) = \frac{1}{\tilde{J} q^2} \frac{(q\xi)^2}{1 + (q\xi)^2} , \qquad G(x) = \frac{kT_c\, v}{4\pi \tilde{J}} \frac{e^{-|x|/\xi}}{|x|} \qquad \text{with } \xi = \xi_0 \tau^{-1/2} . \tag{7.2.11} \]

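The generalization of this law is ($q \ll a^{-1}$, $|x| \gg a$, $\xi \gg a$ with the lattice constant $a$):
\[ \chi(q) = \frac{1}{q^{2 - \eta}}\, \hat{\chi}(q\xi) , \qquad G(x) = \frac{1}{|x|^{1 + \eta}}\, \hat{G}(|x|/\xi) , \qquad \xi = \xi_0 \tau^{-\nu} , \tag{7.2.12a,b,c} \]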

where the functions $\hat{\chi}(q\xi)$ and $\hat{G}(|x|/\xi)$ are still to be determined. In (7.2.12c), we assumed that the correlation length $\xi$ diverges at the critical point. This divergence is characterized by the critical exponent $\nu$. Just at $T_c$, $\xi = \infty$ and therefore there is no longer any finite characteristic length; the correlation function $G(x)$ can thus only fall off according to a power law, $G(x) \sim \frac{1}{|x|^{1 + \eta}} \hat{G}(0)$. The possibility of deviations from the $1/|x|$-behavior of the Ornstein–Zernike theory was taken into account by introducing the additional critical exponent $\eta$. In the immediate vicinity of $T_c$, $\xi$ is the only relevant length and therefore the correlation function also contains the factor $\hat{G}(|x|/\xi)$. Fourier transformation of $G(x)$ yields (7.2.12a) for the wavevector-dependent susceptibility, which for its part represents an evident generalization of the Ornstein–Zernike expression. We recall (from Sects. 5.4.4 and 6.5.5.2) that the increase of $\chi(q)$ for small $q$ on approaching $T_c$ leads to critical opalescence. In (7.2.11) and (7.2.12b), a three-dimensional system was assumed. Phase transitions are of course highly interesting also in two dimensions, and furthermore it has proved fruitful in the theory of phase transitions to consider arbitrary dimensions (even non-integral dimensions). We therefore generalize the relations to arbitrary dimensions $d$:

\[ G(x) = \frac{1}{|x|^{d - 2 + \eta}}\, \hat{G}(|x|/\xi) . \tag{7.2.12b$'$} \]

Equations (7.2.12a) and (7.2.12c) remain valid also in $d$ dimensions, whereby of course the exponents $\nu$ and $\eta$ and the form of the functions $\hat{G}$ and $\hat{\chi}$ depend on the spatial dimension. From (7.2.12a) and (7.2.12b′) at the critical point we obtain
\[ G(x) \propto \frac{1}{|x|^{d - 2 + \eta}} \quad \text{and} \quad \chi \propto \frac{1}{q^{2 - \eta}} \quad \text{for} \quad T = T_c . \tag{7.2.13} \]

Here, we have assumed that $\hat{G}(0)$ and $\hat{\chi}(\infty)$ are finite, which follows from the finite values of $G(x)$ at finite distances and of $\chi(q)$ at finite wavenumbers (and $\xi = \infty$). We now consider the limiting case $q \to 0$ for temperatures $T \neq T_c$. Then we find from (7.2.12a)
\[ \chi = \lim_{q \to 0} \chi(q) \propto \frac{(q\xi)^{2 - \eta}}{q^{2 - \eta}} = \xi^{2 - \eta} . \tag{7.2.14} \]

This dependence is obtained on the basis of the following arguments: for finite $\xi$, the susceptibility remains finite even in the limit $q \to 0$. Therefore, the factor $1/q^{2 - \eta}$ in (7.2.12a) must be compensated by a corresponding dependence of $\hat{\chi}(q\xi)$, from which the relation (7.2.14) follows for the homogeneous susceptibility. Since its divergence is characterized by the critical exponent $\gamma$ according to (6.5.31c), it follows from (7.2.14) together with (7.2.12c) that there is an additional scaling relation
\[ \gamma = \nu (2 - \eta) . \tag{7.2.15} \]

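Relations of the type (7.2.3), (7.2.7), and (7.2.12b′) are called scaling laws, since they are invariant under the following scale transformations:
\[ x \to x/b , \quad \xi \to \xi/b , \quad \tau \to \tau b^{1/\nu} , \quad h \to h b^{\beta\delta/\nu} , \quad m \to m b^{\beta/\nu} , \quad f_s \to f_s b^{(2 - \alpha)/\nu} , \quad G \to G b^{d - 2 + \eta} , \tag{7.2.16} \]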

where $f_s$ stands for the singular part of the (specific) free energy. If we in addition assume that these scale transformations are based on a microscopic elimination procedure by means of which the original system with lattice constant $a$ and $N$ lattice sites is mapped onto a new system with the same lattice constant $a$ but a reduced number $N b^{-d}$ of degrees of freedom, then we find
\[ \frac{F_s(\tau, h)}{N} = b^{-d}\, \frac{F_s(\tau b^{1/\nu}, h b^{\beta\delta/\nu})}{N b^{-d}} , \tag{7.2.17} \]
which implies the hyperscaling relation
\[ 2 - \alpha = d\nu , \tag{7.2.18} \]

which also contains the spatial dimension $d$. According to equations (7.2.6), (7.2.9), (7.2.15), and (7.2.18), all the critical exponents are determined by two independent ones. For the two-dimensional Ising model one finds the exponents of the correlation function, $\nu = 1$ and $\eta = 1/4$, from the exponents quoted following Eq. (6.5.31d) and the scaling relations (7.2.15) and (7.2.18).
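The bookkeeping behind the scaling relations can be written out as a short calculation. The sketch below is illustrative only: the function name `derived_exponents` is hypothetical, and the input values $\beta = 1/8$, $\delta = 15$ for the two-dimensional Ising model and $\beta = 1/2$, $\delta = 3$ for molecular field theory are supplied here as assumptions taken from the known exact and mean-field results. Exact fractions are used so that no rounding enters.

```python
from fractions import Fraction as F

def derived_exponents(beta, delta, d=None):
    """Exponents that follow from beta and delta via the scaling relations."""
    gamma = beta * (delta - 1)        # (7.2.6): gamma = -beta (1 - delta)
    alpha = 2 - beta * (1 + delta)    # (7.2.9)
    result = {"gamma": gamma, "alpha": alpha}
    if d is not None:
        nu = (2 - alpha) / F(d)       # hyperscaling (7.2.18): 2 - alpha = d nu
        eta = 2 - gamma / nu          # (7.2.15): gamma = nu (2 - eta)
        result.update(nu=nu, eta=eta)
    return result

ising_2d = derived_exponents(F(1, 8), F(15), d=2)   # assumed input exponents
mean_field = derived_exponents(F(1, 2), F(3))       # molecular-field values

print(ising_2d)
print(mean_field)
# (7.2.10) holds identically once (7.2.6) and (7.2.9) are used:
print(ising_2d["gamma"] + 2 * F(1, 8) == 2 - ising_2d["alpha"])
```

With the two-dimensional Ising input this reproduces $\nu = 1$ and $\eta = 1/4$ as stated in the text. Hyperscaling is deliberately not applied to the molecular-field exponents in this sketch.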

7.3 The Renormalization Group

7.3.1 Introductory Remarks

The term ‘renormalization’ of a theory refers to a certain reparametrization with the goal of making the renormalized theory more easily dealt with than the original version. Historically, renormalization was developed by Stückelberg and Feynman in order to remove the divergences from quantum field theories such as quantum electrodynamics. Instead of the bare parameters (masses, coupling constants), the Lagrange function is expressed in terms of physical masses and coupling coefficients, so that ultraviolet divergences due to virtual transitions occur only within the connection between the bare and the physical quantities, leaving the renormalized theory finite. The renormalization procedure is not unique; the renormalized quantities can for example depend upon a cutoff length scale, up to which certain virtual processes are taken into account. Renormalization group theory studies the dependence on this length scale, which is also called the “flow parameter”. The name “renormalization group” comes from the fact that two consecutive renormalization group transformations lead to a third such transformation. In the field of critical phenomena, where one must explain the observed behavior at large distances (or in Fourier space at small wavenumbers), it is reasonable to carry out the renormalization procedure by a suitable elimination of the short-wavelength fluctuations. A partial evaluation of the partition function in this manner is easier to carry out than the calculation of the complete partition function, and can be done using approximation methods. As a result of the elimination step, the remaining degrees of freedom are subject to modified, effective interactions. Quite generally, one can expect the following advantages from such a renormalization group transformation:
(i) The new coupling constants could be smaller. By repeated applications of the renormalization procedure, one could thus finally obtain a practically free theory, without interactions.
(ii) The successively iterated coupling coefficients, also called “parameter flow”, could have a fixed point, at which the system no longer changes

under additional renormalization group transformations. Since the elimination of degrees of freedom is accompanied by a change of the underlying lattice spacing, or length scale, one can anticipate that the fixed points are under certain circumstances related to critical points. Furthermore, it can be hoped that the flow in the vicinity of these fixed points can yield information about the universal physical quantities in the neighborhood of the critical points. The scenario described under (i) will in fact be found for the one-dimensional Ising model, and that described under (ii) for the two-dimensional Ising model. The renormalization group method brings to bear the scale invariance in the neighborhood of a critical point. In the case of so-called real-space transformations (in contrast to transformations in Fourier space), one eliminates certain degrees of freedom which are defined on a lattice, and thus carries out a partial trace operation on the partition function. The lattice constant of the resulting system is then readjusted and the internal variables are renormalized in such a manner that the new Hamiltonian corresponds to the original one in its form. By comparison, one defines effective, scale-independent coupling constants, whose flow behavior is then investigated. We first study the one-dimensional Ising model and then the two-dimensional. Finally, the general structure of such transformations will be discussed with the derivation of scaling laws. A brief schematic treatment of continuous field-theoretical formulations will be undertaken following the Ginzburg–Landau theory.

7.3.2 The One-Dimensional Ising Model, Decimation Transformation

We will first illustrate the renormalization group method using the one-dimensional Ising model, with the ferromagnetic exchange constant $J$ in zero applied field, as an example. The Hamiltonian is
\[ H = -J \sum_l \sigma_l \sigma_{l+1} , \tag{7.3.1} \]

where $l$ runs over all the sites in the one-dimensional chain; see Fig. 7.4. We introduce the abbreviation $K = J/kT$ into the partition function for $N$ spins with periodic boundary conditions $\sigma_{N+1} = \sigma_1$,
\[ Z_N = \mathrm{Tr}\, e^{-H/kT} = \sum_{\{\sigma_l = \pm 1\}} e^{K \sum_l \sigma_l \sigma_{l+1}} . \tag{7.3.2} \]

The decimation procedure consists in partially evaluating the partition function, by carrying out the sum over every second spin in the first step. In Fig. 7.4, the lattice sites for which the trace is taken are marked with a cross.

Fig. 7.4. An Ising chain; the trace is carried out over all the lattice points which are marked with a cross. The result is a lattice with its lattice constant doubled

A typical term in the partition function is then
\[ \sum_{\sigma_l = \pm 1} e^{K \sigma_l (\sigma_{l-1} + \sigma_{l+1})} = 2 \cosh K(\sigma_{l-1} + \sigma_{l+1}) = e^{2g + K' \sigma_{l-1} \sigma_{l+1}} , \tag{7.3.3} \]
with coefficients $g$ and $K'$ which are still to be determined. Here, we have taken the sum over $\sigma_l = \pm 1$ after the first equals sign. Since $\cosh K(\sigma_{l-1} + \sigma_{l+1})$ depends only on whether $\sigma_{l-1}$ and $\sigma_{l+1}$ are parallel or antiparallel, the result can in any case be brought into the form given after the second equals sign. The coefficients $g$ and $K'$ can be determined either by expansion of the exponential function or, still more simply, by comparing the two expressions for the possible orientations. If $\sigma_{l-1} = -\sigma_{l+1}$, we find
\[ 2 = e^{2g - K'} , \tag{7.3.4a} \]
and if $\sigma_{l-1} = \sigma_{l+1}$, the result is
\[ 2 \cosh 2K = e^{2g + K'} . \tag{7.3.4b} \]
From the product of (7.3.4a) and (7.3.4b) we obtain $4 \cosh 2K = e^{4g}$, and from the quotient, $\cosh 2K = e^{2K'}$; thus the recursion relations are:
\[ K' = \frac{1}{2} \log \cosh 2K \tag{7.3.5a} \]
\[ g = \frac{1}{2} \left( \log 2 + K' \right) . \tag{7.3.5b} \]
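The identity (7.3.3) with the coefficients (7.3.5a,b) can be verified by brute force over the four orientations of the two neighboring spins. This is a minimal sketch; the function name `decimate` and the test coupling $K = 0.8$ are arbitrary choices made here for illustration.

```python
import math
from itertools import product

def decimate(K):
    """One decimation step: returns (K', g) according to (7.3.5a,b)."""
    K_new = 0.5 * math.log(math.cosh(2.0 * K))
    g = 0.5 * (math.log(2.0) + K_new)
    return K_new, g

K = 0.8  # arbitrary test coupling (assumed value)
K_new, g = decimate(K)

for s_left, s_right in product((-1, 1), repeat=2):
    # left-hand side of (7.3.3): explicit sum over the eliminated spin
    lhs = sum(math.exp(K * s * (s_left + s_right)) for s in (-1, 1))
    # right-hand side of (7.3.3): renormalized nearest-neighbor form
    rhs = math.exp(2.0 * g + K_new * s_left * s_right)
    print(s_left, s_right, lhs, rhs)
```

The two printed columns should coincide for every orientation, which confirms that no couplings other than $K'$ and the constant $g$ are generated in one dimension.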

Repeating this decimation procedure a total of $k$ times, we obtain from (7.3.5a,b) for the $k$th step the following recursion relation:
\[ K^{(k)} = \frac{1}{2} \log \cosh 2K^{(k-1)} \tag{7.3.6a} \]
\[ g(K^{(k)}) = \frac{1}{2} \log 2 + \frac{1}{2} K^{(k)} . \tag{7.3.6b} \]
The decimation produces another Ising model with an interaction between nearest neighbors having a coupling constant $K^{(k)}$. Furthermore, a spin-independent contribution $g(K^{(k)})$ to the energy is generated; in the $k$th step, it is given by (7.3.6b). In a transformation of this type, it is expedient to determine the fixed points which in the present context will prove to be physically relevant. Fixed

points are those points $K^*$ which are invariant with respect to the transformation, i.e. here $K^* = \frac{1}{2} \log (\cosh 2K^*)$. This equation has two solutions,
\[ K^* = 0 \quad (T = \infty) \quad \text{and} \quad K^* = \infty \quad (T = 0) . \tag{7.3.7} \]

The recursion relation (7.3.6a) is plotted in Fig. 7.5. Starting with the initial value $K_0$, one obtains $K'(K_0)$, and by a reflection in the line $K' = K$, $K'(K'(K_0))$, and so forth. One can see that the coupling constant decreases continually; the system moves towards the fixed point $K^* = 0$, i.e. a noninteracting system. Therefore, for a finite $K_0$, we never arrive at an ordered state: there is no phase transition. Only for $K = \infty$, i.e. for a finite exchange interaction $J$ and $T = 0$, do the spins order.
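The flow towards $K^* = 0$ can be followed by iterating (7.3.6a) directly. The sketch below is illustrative; the function names and the starting values are assumptions chosen here, and the starting couplings are kept moderate so that `math.cosh` does not overflow.

```python
import math

def rg_step(K):
    """Recursion relation (7.3.6a) for the one-dimensional Ising chain."""
    return 0.5 * math.log(math.cosh(2.0 * K))

def flow(K0, steps):
    """Sequence K0, K', K'', ... obtained by repeated decimation."""
    couplings = [K0]
    for _ in range(steps):
        couplings.append(rg_step(couplings[-1]))
    return couplings

for K0 in (0.5, 2.0, 5.0):   # arbitrary finite starting couplings
    print(K0, flow(K0, 12))
```

For large $K$ the coupling decreases by roughly $\frac{1}{2}\log 2$ per step, and once it is small it behaves as $K' \approx K^2$, so that every finite starting value ends up at the noninteracting fixed point.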

Fig. 7.5. The recursion relation for the one-dimensional Ising model with interactions between nearest neighbors (heavy solid curve), the line $K' = K$ (dashed), and the iteration steps (thin lines with arrows)

Making use of this renormalization group (RG) transformation, we can calculate the partition function and the free energy. The partition function for altogether $N$ spins with the coupling constant $K$, using (7.3.3), is
\[ Z_N(K) = e^{N g(K')} Z_{N/2}(K') = e^{N g(K') + \frac{N}{2} g(K'')} Z_{N/2^2}(K'') , \tag{7.3.8} \]
and, after the $n$th step,
\[ Z_N(K) = \exp \left[ N \sum_{k=1}^{n} \frac{1}{2^{k-1}}\, g\big(K^{(k)}\big) + \log Z_{N/2^n}\big(K^{(n)}\big) \right] . \tag{7.3.9} \]

The reduced free energy per lattice site and $kT$ is defined by
\[ \tilde{f} = -\frac{1}{N} \log Z_N(K) . \tag{7.3.10} \]

As we have seen, the interactions become weaker as a result of the renormalization group transformation, which gives rise to the following possible application: after several steps the interactions have become so weak that perturbation-theory methods can be used, or the interaction can be altogether neglected. Setting $K^{(n)} \approx 0$, from (7.3.9) we obtain the approximation:
\[ \tilde{f}^{(n)}(K) = -\sum_{k=1}^{n} \frac{1}{2^{k-1}}\, g\big(K^{(k)}\big) - \frac{1}{2^n} \log 2 , \tag{7.3.11} \]

since the free energy per spin of a field-free spin-1/2 system without interactions is $-\log 2$. Fig. 7.6 shows $\tilde{f}^{(n)}(K)$ for $n = 1$ to 5. We can see how quickly this approximate solution approaches the exact reduced free energy $\tilde{f}(K) = -\log(2 \cosh K)$. The one-dimensional Ising model can be exactly solved by elementary methods (see problem 6.9), as well as by using the transfer matrix method, cf. Appendix F.
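The convergence of the approximations (7.3.11) towards the exact result can be reproduced with a few lines. This is a sketch under the definitions (7.3.6a,b) and (7.3.11); the function names and the chosen coupling $K = 1$ are illustrative assumptions.

```python
import math

def g_of(K_k):
    """Spin-independent contribution (7.3.6b) generated in one decimation step."""
    return 0.5 * math.log(2.0) + 0.5 * K_k

def f_approx(K, n):
    """Approximation (7.3.11): n decimation steps, then the coupling is set to zero."""
    total = 0.0
    K_k = K
    for k in range(1, n + 1):
        K_k = 0.5 * math.log(math.cosh(2.0 * K_k))   # (7.3.6a)
        total += g_of(K_k) / 2 ** (k - 1)
    return -total - math.log(2.0) / 2 ** n

def f_exact(K):
    """Exact reduced free energy per site of the one-dimensional Ising chain."""
    return -math.log(2.0 * math.cosh(K))

K = 1.0  # arbitrary test coupling (assumed value)
for n in range(1, 6):
    print(n, f_approx(K, n), f_approx(K, n) - f_exact(K))
```

The remaining error after $n$ steps is $2^{-n} \log \cosh K^{(n)}$, which shrinks both through the prefactor and through the rapid decrease of $K^{(n)}$.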

Fig. 7.6. The reduced free energy of the one-dimensional Ising model. $\tilde{f}$ is the exact free energy, $\tilde{f}^{(1)}$, $\tilde{f}^{(2)}$, . . . are the approximations (7.3.11)

7.3.3 The Two-Dimensional Ising Model

The application of the decimation procedure to the two-dimensional Ising model is still more interesting, since this model exhibits a phase transition at a finite temperature $T_c > 0$. We consider the square lattice rotated by 45° which is illustrated in Fig. 7.7, with a lattice constant of one. The Hamiltonian multiplied by $\beta$, $\mathcal{H} = \beta H$, is
\[ \mathcal{H} = -\sum_{\text{n.n.}} K \sigma_i \sigma_j , \tag{7.3.12} \]

Fig. 7.7. A square spin lattice, rotated by 45°. The lattice sites are indicated by points. In the decimation transformation, the spins at the sites which are also marked by a cross are eliminated. K is the interaction between nearest neighbors and L is the interaction between next-nearest neighbors

where the sum runs over all pairs of nearest neighbors (n.n.) and $K = J/kT$. When in the partial evaluation of the partition function the trace is taken over the spins marked by crosses, we obtain a new square lattice of lattice constant $\sqrt{2}$. How do the coupling constants transform? We pick out one of the spins with a cross, $\sigma$, denote its neighbors as $\sigma_1$, $\sigma_2$, $\sigma_3$, and $\sigma_4$, and evaluate their contribution to the partition function:

$$ \sum_{\sigma=\pm1} e^{K(\sigma_1+\sigma_2+\sigma_3+\sigma_4)\sigma} = e^{\log(2\cosh K(\sigma_1+\sigma_2+\sigma_3+\sigma_4))} = e^{A' + \frac12 K'(\sigma_1\sigma_2 + \ldots + \sigma_3\sigma_4) + L'(\sigma_1\sigma_3+\sigma_2\sigma_4) + M'\sigma_1\sigma_2\sigma_3\sigma_4} . \tag{7.3.13} $$

This transformation (taking a partial trace) yields a modified interaction between nearest neighbors, $K'$ (here, the elimination of two crossed spins contributes); in addition, new interactions between the next-nearest neighbors (such as $\sigma_1$ and $\sigma_3$) and a four-spin interaction are generated:

$$ \mathcal{H}' = A' + K'\sum_{\text{n.n.}}\sigma_i\sigma_j + L'\sum_{\text{n.n.n.}}\sigma_i\sigma_j + \ldots . \tag{7.3.12′} $$

The coefficients $A'$, $K'$, $L'$ and $M'$ can readily be found from (7.3.13) as functions of $K$, by using $\sigma_i^2 = 1$, $i = 1, \ldots, 4$ (see problem 7.2):

$$ A'(K) = \log 2 + \frac18\{\log\cosh 4K + 4\log\cosh 2K\} , \tag{7.3.14} $$

$$ K'(K) = \frac14\log\cosh 4K , \qquad L'(K) = \frac12 K'(K) , \tag{7.3.13′} $$

$$ M'(K) = \frac18\{\log\cosh 4K - 4\log\cosh 2K\} . $$
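The coefficients above can be checked numerically. The following is a minimal sketch, not part of the original text: it evaluates $A'$, $K'$, $L'$, $M'$ from the closed-form expressions and compares both sides of (7.3.13) for all 16 configurations of the four neighbors. The function names are illustrative, and the cyclic choice of the four nearest-neighbor pairs is an assumption (since $\frac12 K' = L'$, the check does not depend on it).

```python
import itertools
import math

def decimation_coefficients(K):
    """Coefficients A', K', L', M' as functions of the coupling K."""
    lc4 = math.log(math.cosh(4 * K))
    lc2 = math.log(math.cosh(2 * K))
    A = math.log(2) + (lc4 + 4 * lc2) / 8
    Kp = lc4 / 4
    Lp = Kp / 2
    M = (lc4 - 4 * lc2) / 8
    return A, Kp, Lp, M

def max_identity_error(K):
    """Largest deviation between both sides of (7.3.13) over all neighbor configurations."""
    A, Kp, Lp, M = decimation_coefficients(K)
    worst = 0.0
    for s1, s2, s3, s4 in itertools.product((-1, 1), repeat=4):
        lhs = math.log(2 * math.cosh(K * (s1 + s2 + s3 + s4)))
        nn = s1 * s2 + s2 * s3 + s3 * s4 + s4 * s1   # assumed nearest-neighbor pairs
        nnn = s1 * s3 + s2 * s4                      # remaining two pairs
        rhs = A + 0.5 * Kp * nn + Lp * nnn + M * s1 * s2 * s3 * s4
        worst = max(worst, abs(lhs - rhs))
    return worst

K_c = 0.4406  # Onsager's exact critical coupling, used as the initial value
A, Kp, Lp, M = decimation_coefficients(K_c)
print(f"K' = {Kp:.4f}, L' = {Lp:.4f}, M' = {M:.4f}")
print("max error in (7.3.13):", max_identity_error(K_c))
```

At this coupling the four-spin term comes out small in magnitude compared with $L'$ and $K'$, which is the observation used below to justify dropping it.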

Putting the critical value $K_c = J/kT_c = 0.4406$ (exact result⁹) into this relation as an estimate for the initial value $K$, we find $M' \ll L' \le K'$. In

⁹ The partition function of the Ising model on a square lattice without an external field was evaluated exactly by L. Onsager, Phys. Rev. 65, 117 (1944), using the transfer matrix method (see Appendix F).

7.3 The Renormalization Group


the first elimination step, the original Ising model is transformed into one with three interactions; in the next step we must take these into account and obtain still more interactions, and so on. In a quantitatively usable calculation it will thus be necessary to determine the recursion relations for an extended number of coupling constants. Here, we wish only to determine the essential structure of such recursion relations and to simplify them sufficiently so that an analytic solution can be found. Therefore, we neglect the coupling constant $M'$ and all the others which are generated by the elimination procedure, and restrict ourselves to $K'$ and $L'$ as well as their initial values $K$ and $L$. This is suggested by the smallness of $M'$ which we mentioned above.

We now require the recursion relation including the coupling constant $L$, which acts between $\sigma_1$ and $\sigma_4$, etc. Thus, expanding (7.3.13′) up to second order in $K$ and taking note of the fact that an interaction $L$ between next-nearest neighbors in the original Hamiltonian appears as a contribution to the interactions of the nearest neighbors in the primed Hamiltonian, we find the following recursion relations on elimination of the crossed spins (Fig. 7.7):

$$ K' = 2K^2 + L \tag{7.3.15a} $$

$$ L' = K^2 . \tag{7.3.15b} $$

These relations can be arrived at intuitively as follows: the spin $\sigma$ mediates an interaction of the order of $K$ times $K$, i.e. $K^2$, between $\sigma_1$ and $\sigma_3$, likewise the crossed spin just to the left of $\sigma$. This leads to $2K^2$ in $K'$. The interaction $L$ between next-nearest neighbors in the original model makes a direct contribution to $K'$. Spin $\sigma$ also mediates a diagonal interaction between $\sigma_1$ and $\sigma_4$, leading thus to the relation $L' = K^2$ in (7.3.15b). However, it should be clear that in contrast to the one-dimensional case, new coupling constants are generated in every elimination step. One cannot expect that these recursion relations, which have been restricted as an approximation to a reduced parameter space $(K, L)$, will yield quantitatively accurate results. They do contain all the typical features of this type of recursion relations.

In Fig. 7.8, we have shown the recursion relations (7.3.15a,b)¹⁰. Starting from values $(K, 0)$, the recursion relation is repeatedly applied, likewise for initial values $(0, L)$. The following picture emerges: for small initial values, the flow lines converge to $K = L = 0$, and for large initial values they converge to $K = L = \infty$. These two regions are separated by two lines, which meet at $K_c^* = \frac13$ and $L_c^* = \frac19$. Further on it will become clear that this fixed point is connected to the critical point.

We now want to investigate analytically the more important properties of the flow diagram which follows from the recursion relations (7.3.15a,b). As a

¹⁰ For clarity we have drawn in only every other iteration step in Fig. 7.8. We will return to this point at the end of this section, after investigating the analytic behavior of the recursion relation.


Fig. 7.8. A flow diagram of Eq. (7.3.15a,b) (only every other point is indicated). Three fixed points can be recognized: $K^* = L^* = 0$, $K^* = L^* = \infty$, and $K_c^* = \frac13$, $L_c^* = \frac19$

first step, the fixed points must be determined from (7.3.15a,b), i.e. $K^*$ and $L^*$, which obey $K^* = 2K^{*2} + L^*$ and $L^* = K^{*2}$. These conditions give three fixed points

$$ \text{(i)}\ \ K^* = L^* = 0 , \qquad \text{(ii)}\ \ K^* = L^* = \infty , \qquad \text{and (iii)}\ \ K_c^* = \frac13 ,\ L_c^* = \frac19 . \tag{7.3.16} $$
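The flow generated by (7.3.15a,b) is easy to explore numerically. The sketch below is illustrative only (the function names and the chosen starting values are assumptions): it applies the recursion repeatedly, confirms that $(\frac13, \frac19)$ is mapped onto itself, and shows one trajectory decaying towards $K = L = 0$ and one running off to large couplings.

```python
def rg_step(K, L):
    """One decimation step, Eqs. (7.3.15a,b)."""
    return 2 * K**2 + L, K**2

def flow(K, L, steps):
    """Trajectory of (K, L) under repeated application of rg_step."""
    points = [(K, L)]
    for _ in range(steps):
        K, L = rg_step(K, L)
        points.append((K, L))
    return points

# The nontrivial fixed point is reproduced by one step (up to rounding).
K_star, L_star = 1 / 3, 1 / 9
print(rg_step(K_star, L_star))

# A small initial coupling decays, a large one grows without bound.
print(flow(0.30, 0.0, 8)[-1])
print(flow(0.50, 0.0, 5)[-1])
```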

The high-temperature fixed point (i) corresponds to a temperature $T = \infty$ (disordered phase), while the low-temperature fixed point (ii) corresponds to $T = 0$ (ordered low-temperature phase). The critical behavior can be related only to the non-trivial fixed point (iii), $(K_c^*, L_c^*) = (\frac13, \frac19)$.

That the initial values of $K$ and $L$ which lead to the fixed point $(K_c^*, L_c^*)$ represent critical points can be seen in the following manner: the RG transformation leads to a lattice with its lattice constant increased by a factor of $\sqrt2$. The correlation length of the transformed system $\xi'$ is thus smaller by a factor of $\sqrt2$:

$$ \xi' = \xi/\sqrt2 . \tag{7.3.17} $$

However, at the fixed point, the coupling constants $K_c^*$, $L_c^*$ are invariant, so that for $\xi$ of the fixed point, we have $\xi' = \xi$, i.e. at the fixed point, it follows that $\xi = \xi/\sqrt2$, thus

$$ \xi = \infty \quad\text{or}\quad \xi = 0 . \tag{7.3.18} $$

The value 0 corresponds to the high-temperature and to the low-temperature fixed points. At finite $K^*$, $L^*$, $\xi$ cannot be zero, but only $\infty$. Calculating


back through the transformation shows that the correlation length at each point along the critical trajectory which leads to the fixed point is infinite. Therefore, all the points of the "critical trajectory", i.e. the trajectory leading to the fixed point, are critical points of Ising models with nearest-neighbor and next-nearest-neighbor interactions.

In order to determine the critical behavior, we examine the behavior of the coupling constants in the vicinity of the "non-trivial" fixed point; to this end, we linearize the transformation equations (7.3.15a,b) around $(K_c^*, L_c^*)$ in the $l$th step:

$$ \delta K_l = K_l - K_c^* , \qquad \delta L_l = L_l - L_c^* . \tag{7.3.19} $$

We thereby obtain the following linear recursion relation:

$$ \begin{pmatrix}\delta K_l\\ \delta L_l\end{pmatrix} = \begin{pmatrix}4K_c^* & 1\\ 2K_c^* & 0\end{pmatrix}\begin{pmatrix}\delta K_{l-1}\\ \delta L_{l-1}\end{pmatrix} = \begin{pmatrix}\frac43 & 1\\ \frac23 & 0\end{pmatrix}\begin{pmatrix}\delta K_{l-1}\\ \delta L_{l-1}\end{pmatrix} . \tag{7.3.20} $$

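The eigenvalues of the transformation matrix can be determined from $\lambda^2 - \frac43\lambda - \frac23 = 0$, i.e.

$$ \lambda_{1,2} = \frac13\big(2 \pm \sqrt{10}\big) = \begin{cases} 1.7208 \\ -0.3874 . \end{cases} \tag{7.3.21a} $$

The associated eigenvectors can be obtained from $\big(4 - (2 \pm \sqrt{10})\big)\,\delta K + 3\,\delta L = 0$, i.e.

$$ \delta L = \frac{\pm\sqrt{10} - 2}{3}\,\delta K \quad\text{and thus}\quad e_1 = \Big(1, \frac{\sqrt{10}-2}{3}\Big) \quad\text{and}\quad e_2 = \Big(1, -\frac{\sqrt{10}+2}{3}\Big) \tag{7.3.21b} $$

with the scalar product $e_1\cdot e_2 = \frac13$.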

We now start from an Ising model with coupling constants $K_0$ and $L_0$ (including the division by $kT$). We first expand the deviations of the initial coupling constants $K_0$ and $L_0$ from the fixed point in the basis of the eigenvectors (7.3.21):

$$ \begin{pmatrix}K_0\\ L_0\end{pmatrix} = \begin{pmatrix}K_c^*\\ L_c^*\end{pmatrix} + c_1e_1 + c_2e_2 , \tag{7.3.22} $$

with expansion coefficients $c_1$ and $c_2$. The decimation procedure is repeated several times; after $l$ transformation steps, we obtain the coupling constants $K_l$ and $L_l$:

$$ \begin{pmatrix}K_l\\ L_l\end{pmatrix} = \begin{pmatrix}K_c^*\\ L_c^*\end{pmatrix} + \lambda_1^l c_1e_1 + \lambda_2^l c_2e_2 . \tag{7.3.23} $$

If the Hamiltonian $\mathcal{H}$ differs from $\mathcal{H}^*$ only by an increment in the direction $e_2$, the successive application of the renormalization group transformation leads to the fixed point, since $|\lambda_2| < 1$ (see Fig. 7.9).


Fig. 7.9. Flow diagram based on the recursion relation (7.3.22), which is linearized around the nontrivial ﬁxed point (FP)

Let us now consider the original nearest-neighbor Ising model with the coupling constant $K_0 \equiv \frac{J}{kT}$ and with $L_0 = 0$, and first determine the critical value $K_c$; this is the value of $K_0$ which leads to the fixed point. The condition for $K_c$, from the above considerations, is given by

$$ \begin{pmatrix}K_c\\ 0\end{pmatrix} = \begin{pmatrix}\frac13\\ \frac19\end{pmatrix} + 0\cdot e_1 + c_2\begin{pmatrix}1\\ -\frac{\sqrt{10}+2}{3}\end{pmatrix} . \tag{7.3.24} $$

These two linear equations have the solution

$$ c_2 = \frac{1}{3(\sqrt{10}+2)} , \quad\text{and therefore}\quad K_c = \frac13 + \frac{1}{3(\sqrt{10}+2)} = 0.3979 . \tag{7.3.25} $$
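The linearized analysis can be reproduced in a few lines. This is a sketch under the assumption that the $2\times2$ matrix is the Jacobian of (7.3.15a,b) at the fixed point; the variable names are illustrative. It recomputes the eigenvalues, the eigenvectors normalized to first component 1, their scalar product, and the critical coupling obtained from the condition $c_1 = 0$ with $L_0 = 0$.

```python
import math

K_star, L_star = 1 / 3, 1 / 9

# Jacobian of (K, L) -> (2K^2 + L, K^2) at the fixed point, cf. (7.3.20).
a11, a12 = 4 * K_star, 1.0
a21, a22 = 2 * K_star, 0.0

# Eigenvalues of a 2x2 matrix from its trace and determinant.
trace = a11 + a22
det = a11 * a22 - a12 * a21
disc = math.sqrt(trace**2 - 4 * det)
lam1 = (trace + disc) / 2
lam2 = (trace - disc) / 2

# Eigenvectors (1, y) from the first row: (a11 - lam) + a12 * y = 0.
e1 = (1.0, (lam1 - a11) / a12)
e2 = (1.0, (lam2 - a11) / a12)

# Critical K0 for L0 = 0: (K_c, 0) = (K*, L*) + c2 * e2, with c1 = 0.
c2 = -L_star / e2[1]
K_c_linear = K_star + c2 * e2[0]

print(f"lambda_1 = {lam1:.4f}, lambda_2 = {lam2:.4f}")
print(f"e1 . e2 = {e1[0] * e2[0] + e1[1] * e2[1]:.4f}")
print(f"K_c (linearized) = {K_c_linear:.4f}")
```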

For $K_0 = K_c$, the linearized RG transformation leads to the fixed point, i.e. this is the critical point of the nearest-neighbor Ising model, $K_c = \frac{J}{kT_c}$. From the nonlinear recursion relation (7.3.15a,b), we find for the critical point the slightly smaller value $K_c^{\text{n.l.}} = 0.3921$. Both values differ from Onsager's exact solution, which gives $K_c = 0.4406$, but they are much closer than the value from molecular field theory, $K_c = 0.25$.

For $K_0 = K_c$, only $c_2 \ne 0$, and the transformation leads to the fixed point. For $K_0 \ne K_c$, we also have $c_1 \propto (K_0 - K_c) = -\frac{J}{kT_c^2}(T - T_c) + \cdots \ne 0$. This increases with each application of the RG transformation, and thus leads away from the fixed point $(K_c^*, L_c^*)$ (Fig. 7.9), so that the flow runs either to the low-temperature fixed point (for $T < T_c$) or to the high-temperature fixed point (for $T > T_c$).

Now we may determine the critical exponent $\nu$ for the correlation length, beginning with the recursion relation

$$ (K - K_c)' = \lambda_1(K - K_c) \tag{7.3.26} $$

and writing $\lambda_1$ as a power of the new length scale

$$ \lambda_1 = (\sqrt2)^{y_1} . \tag{7.3.27} $$

For the exponent $y_1$ defined here, we find the value

$$ y_1 = 2\,\frac{\log\lambda_1}{\log 2} = 1.566 . \tag{7.3.28} $$

From $\xi' = \xi/\sqrt2$ (Eq. (7.3.17)), it follows that $\big((K - K_c)'\big)^{-\nu} = (K - K_c)^{-\nu}/\sqrt2$, i.e.

$$ (K - K_c)' = (\sqrt2)^{\frac1\nu}(K - K_c) . \tag{7.3.29} $$

Comparing this with the first relation (7.3.26), we obtain

$$ \nu = \frac{1}{y_1} = 0.638 . \tag{7.3.30} $$
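The critical coupling of the nonlinear recursion and the exponent $\nu$ can be obtained numerically as follows. This is a minimal sketch, not the author's procedure: it locates by bisection the initial value $K_0$ (with $L_0 = 0$) that separates trajectories of (7.3.15a,b) running to strong coupling from those decaying to zero, and then evaluates $y_1$ and $\nu$ from $\lambda_1$. The thresholds, the bracket and the step limit are assumptions chosen for illustration.

```python
import math

def flows_to_order(K0, max_steps=500):
    """True if (K0, 0) flows to strong coupling under Eqs. (7.3.15a,b)."""
    K, L = K0, 0.0
    for _ in range(max_steps):
        K, L = 2 * K**2 + L, K**2
        if K > 10.0:      # runaway towards K = L = infinity
            return True
        if K < 1e-3:      # decay towards K = L = 0
            return False
    return True           # undecided after max_steps; treated as ordered (arbitrary choice)

lo, hi = 0.30, 0.50       # bracket: 0.30 decays, 0.50 runs away
for _ in range(50):
    mid = 0.5 * (lo + hi)
    if flows_to_order(mid):
        hi = mid
    else:
        lo = mid
K_c_nonlinear = 0.5 * (lo + hi)

lam1 = (2 + math.sqrt(10)) / 3
y1 = math.log(lam1) / math.log(math.sqrt(2))   # lambda_1 = (sqrt 2)^{y_1}
nu = 1 / y1

print(f"K_c (nonlinear recursion) = {K_c_nonlinear:.4f}")
print(f"y_1 = {y1:.3f}, nu = {nu:.3f}")
```

Bisection is applicable here because both recursion relations are monotonically increasing in $K$ and $L$ for positive couplings, so a larger $K_0$ can never end up in the basin of the high-temperature fixed point if a smaller one does not.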

This is, to be sure, quite far from 1, the known exact value of the two-dimensional Ising model, but nevertheless it is larger than 0.5, the value from the molecular-field approximation. A considerable improvement can be obtained by extending the recursion relation to several coupling coefficients.

Let us now consider the effect of a finite magnetic field $h$ (including the factor $\beta$). The recursion relation can again be established intuitively. The field $h$ acts directly on the remaining spins, as well as a (somewhat underestimated) additional field $Kh$ which is due to the orienting action of the field on the eliminated neighboring spins, so that all together we have

$$ h' = h + Kh . \tag{7.3.31} $$

The fixed point value of this recursion relation is $h^* = 0$. Linearization around the fixed point yields

$$ h' = (1 + K^*)h = \frac43 h ; \tag{7.3.32} $$

thus the associated eigenvalue is

$$ \lambda_h = \frac43 . \tag{7.3.33} $$

$K_0 - K_c$ (or $T - T_c$) and $h$ are called the relevant "fields", since the eigenvalues $\lambda_1$ and $\lambda_h$ are larger than 1, and they therefore increase as a result of the renormalization group transformation and lead away from the fixed point. In contrast, $c_2$ is an "irrelevant field", since $|\lambda_2| < 1$, and therefore $c_2$ becomes increasingly smaller with repeated RG transformations. Here, "fields" refers to fields in the usual sense, but also to coupling constants in the Hamiltonian. The structure found here is typical of models which describe critical points, and remains the same even when one takes arbitrarily many coupling constants into account in the transformation: there are two relevant fields ($T - T_c$ and $h$, the conjugate field to the order parameter), and all the other fields are irrelevant.

We add a remark concerning the flow diagram 7.9. There, owing to the negative sign of $\lambda_2$, only every other point is shown. This corresponds to a twofold application of the transformation and an increase of the lattice constant by a factor of 2, as well as $\lambda_1 \to \lambda_1^2$, $\lambda_2 \to \lambda_2^2$. Then the second eigenvalue $\lambda_2^2$ is also positive; otherwise the trajectory would move along an oscillatory path towards the fixed point.


7.3.4 Scaling Laws

Although the decimation procedure described in Sect. 7.3.3 with only a few parameters does not give quantitatively satisfactory results and is also unsuitable for the calculation of correlation functions, it does demonstrate the general structure of RG transformations, which we shall now use as a starting point for deriving the scaling laws. A general RG transformation $R$ maps the original Hamiltonian $\mathcal{H}$ onto a new one,

$$ \mathcal{H}' = R\mathcal{H} . \tag{7.3.34} $$

This transformation also implies the rescaling of all the lengths in the problem, and that $N' = Nb^{-d}$ holds for the number of degrees of freedom $N$ in $d$ dimensions (here, $b = \sqrt2$ for the decimation transformation of 7.3.1). The fixed-point Hamiltonian is determined by

$$ R(\mathcal{H}^*) = \mathcal{H}^* . \tag{7.3.35} $$

For small deviations from the fixed-point Hamiltonian,

$$ R(\mathcal{H}^* + \delta\mathcal{H}) = \mathcal{H}^* + \mathcal{L}\,\delta\mathcal{H} , $$

we can expand in terms of the deviation $\delta\mathcal{H}$. From the expansion, we obtain the linearized recursion relation

$$ \mathcal{L}\,\delta\mathcal{H} = \delta\mathcal{H}' . \tag{7.3.36a} $$

The eigenoperators $\delta\mathcal{H}_1, \delta\mathcal{H}_2, \ldots$ of this linear transformation are determined by the eigenvalue equation

$$ \mathcal{L}\,\delta\mathcal{H}_i = \lambda_i\,\delta\mathcal{H}_i . \tag{7.3.36b} $$

A given Hamiltonian $\mathcal{H}$, which differs only slightly from $\mathcal{H}^*$, can be represented by $\mathcal{H}^*$ and the deviations from it:

$$ \mathcal{H} = \mathcal{H}^* + \tau\,\delta\mathcal{H}_\tau + h\,\delta\mathcal{H}_h + \sum_{i\ge3}c_i\,\delta\mathcal{H}_i , \tag{7.3.37} $$

where $\delta\mathcal{H}_\tau$ and $\delta\mathcal{H}_h$ denote the two relevant perturbations with

$$ |\lambda_\tau| = b^{y_\tau} > 1 , \qquad |\lambda_h| = b^{y_h} > 1 ; \tag{7.3.38} $$

they are related to the temperature variable $\tau = \frac{T - T_c}{T_c}$ and the external field $h$, while $|\lambda_j| = b^{y_j} < 1$ and thus $y_j < 0$ for $j \ge 3$ are connected with the irrelevant perturbations.¹¹ The coefficients $\tau$, $h$, and $c_j$ are called scaling fields. For the Ising model, $\delta\mathcal{H}_h = \sum_l\sigma_l$. Denoting the initial values of the fields by $c_i$, we find that the free energy transforms after $l$ steps to

$$ F_N(c_i) = F_{N/b^{dl}}(c_i\lambda_i^l) . \tag{7.3.39a} $$

For the free energy per spin,

$$ f(c_i) = \frac1N F_N(c_i) , \tag{7.3.39b} $$

we then find in the linear approximation

$$ f(\tau, h, c_3, \ldots) = b^{-dl}f\big(\tau b^{y_\tau l}, hb^{y_hl}, c_3b^{y_3l}, \ldots\big) . \tag{7.3.40} $$

¹¹ Compare the discussion following Eq. (7.3.33). The (only) irrelevant field there is denoted by $c_2$. In the following, we assume that $\lambda_i \ge 0$.

Here, we have left off an additive term which has no influence on the following derivation of the scaling law; it is, however, important for the calculation of the free energy. The scaling parameter $l$ can now be chosen in such a way that $|\tau|b^{y_\tau l} = 1$, which makes the first argument of $f$ equal to $\pm1$. Then we find

$$ f(\tau, h, c_3, \ldots) = |\tau|^{d/y_\tau}\,\hat f_\pm\big(h|\tau|^{-y_h/y_\tau}, c_3|\tau|^{|y_3|/y_\tau}, \ldots\big) , \tag{7.3.40′} $$

where $\hat f_\pm(x, y, \ldots) = f(\pm1, x, y, \ldots)$ and $y_\tau, y_h > 0$, $y_3, \ldots < 0$. Close to $T_c$, the dependence on the irrelevant fields $c_3, \ldots$ can be neglected, and Eq. (7.3.40) then takes on precisely the scaling form (Eq. 7.2.7), with the conventional exponents

$$ \beta\delta = y_h/y_\tau \tag{7.3.41a} $$

and

$$ 2 - \alpha = \frac{d}{y_\tau} . \tag{7.3.41b} $$

Taking the derivative with respect to $h$ yields

$$ \beta = \frac{d - y_h}{y_\tau} \quad\text{and}\quad \gamma = \frac{2y_h - d}{y_\tau} . \tag{7.3.41c,d} $$
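As a plausibility check of these relations, the exponents can be evaluated for given $y_\tau$ and $y_h$. In the sketch below, $\gamma$ is taken as $(2y_h - d)/y_\tau$, which follows from differentiating the scaling form twice with respect to $h$ (the susceptibility then scales as $|\tau|^{(d-2y_h)/y_\tau} = |\tau|^{-\gamma}$). The input values $y_\tau = 1$, $y_h = 15/8$ are the exact two-dimensional Ising values and are an assumption brought in from outside this section; the function name is illustrative.

```python
from fractions import Fraction

def exponents(d, y_tau, y_h):
    """Critical exponents alpha, beta, gamma, delta from the RG eigenvalue exponents."""
    alpha = 2 - Fraction(d) / y_tau
    beta = (d - y_h) / y_tau
    gamma = (2 * y_h - d) / y_tau
    delta = (y_h / y_tau) / beta          # from beta * delta = y_h / y_tau
    return {"alpha": alpha, "beta": beta, "gamma": gamma, "delta": delta}

# Exact 2D Ising input (assumed here, not derived in this section).
ising_2d = exponents(2, Fraction(1), Fraction(15, 8))
print(ising_2d)

# The scaling law alpha + 2 beta + gamma = 2 holds identically.
print(ising_2d["alpha"] + 2 * ising_2d["beta"] + ising_2d["gamma"])
```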

We have thus derived the scaling law, Eq. (7.2.7), within the RG theory for fixed points with just one relevant field, along with the applied magnetic field and the irrelevant operators. Furthermore, the dependence on the irrelevant fields $c_3, \ldots$ gives rise to corrections to the scaling laws, which must be taken into account for temperatures outside the asymptotic region.

In order to make the connection between $y_\tau$ and the exponent $\nu$, we recall that $l$ iterations reduce the correlation length to $\xi' = b^{-l}\xi$, which implies that $(\tau b^{y_\tau l})^{-\nu} = b^{-l}\tau^{-\nu}$ and, as a result,

$$ \nu = \frac{1}{y_\tau} \tag{7.3.41e} $$


Fig. 7.10. The critical hypersurface. A trajectory within the critical hypersurface is shown as a dashed curve. The full curve is a trajectory near the critical hypersurface. The coupling coeﬃcients of a particular physical system as a function of the temperature are indicated by the long-dashed curve

(cf. Eq. (7.3.30) for the two-dimensional Ising model). From the existence of a fixed-point Hamiltonian with two relevant operators, the scaling form of the free energy can be derived, and it is also possible to calculate the critical exponents. Even the form of the scaling functions $\hat f$ and $\hat m$ can be computed with perturbation-theoretical methods, since the arguments are finite. A similar procedure can be applied to the correlation function, Eq. (7.2.12b′). At this point it is important to renormalize the spin variable, $\sigma' = b^\zeta\sigma$, whereby it is found that setting the value

$$ \zeta = (d - 2 + \eta)/2 \tag{7.3.41f} $$

guarantees the validity of (7.2.13) at the critical point.

We add a few remarks about the generic structure of the flow diagram in the vicinity of a critical fixed point (Fig. 7.10). In the multidimensional space of the coupling coefficients, there is a direction (the relevant direction) which leads away from the fixed point (we assume that $h = 0$). The other eigenvectors of the linearized RG transformation span the critical hypersurface. Further away from the fixed point, this hypersurface is no longer a plane, but instead is curved. The trajectories from each point on the critical hypersurface lead to the critical fixed point. When the initial point is close to but not precisely on the critical hypersurface, the trajectory at first runs parallel to the hypersurface until the relevant portion has become sufficiently large so that finally the trajectory leaves the neighborhood of the critical hypersurface and heads off to either the high-temperature or the low-temperature fixed point. For a given physical system (ferromagnet, liquid, . . .), the parameters $\tau, c_3, \ldots$ depend on the temperature (the long-dashed curve in Fig. 7.10). The temperature at which this curve intersects the critical hypersurface is the transition temperature $T_c$.

From this discussion, the universality properties should be apparent. All systems which belong to a particular part of the parameter space, i.e. to the region of attraction of a given fixed point, are described by the same power laws in the vicinity of the critical hypersurface of the fixed point.

∗7.3.5 General RG Transformations in Real Space

A general RG transformation in real space maps a particular spin system $\{\sigma\}$ with the Hamiltonian $\mathcal{H}\{\sigma\}$, defined on a lattice, onto a new spin system with fewer degrees of freedom (by $N'/N = b^{-d}$) and a new Hamiltonian $\mathcal{H}'\{\sigma'\}$. It can be represented by a transformation $T\{\sigma', \sigma\}$, such that

$$ e^{-G - \mathcal{H}'\{\sigma'\}} = \sum_{\{\sigma\}}T\{\sigma', \sigma\}\,e^{-\mathcal{H}\{\sigma\}} \tag{7.3.42} $$

with the conditions

$$ \sum_{\{\sigma'\}}\mathcal{H}'\{\sigma'\} = 0 \tag{7.3.43a} $$

and

$$ \sum_{\{\sigma'\}}T\{\sigma', \sigma\} = 1 , \tag{7.3.43b} $$

which guarantee that

$$ e^{-G}\,\mathrm{Tr}_{\{\sigma'\}}\,e^{-\mathcal{H}'\{\sigma'\}} = \mathrm{Tr}_{\{\sigma\}}\,e^{-\mathcal{H}\{\sigma\}} \tag{7.3.44a} $$

is fulfilled ($\mathrm{Tr}_{\{\sigma\}} \equiv \sum_{\{\sigma\}}$). This yields a relation between the free energy $F$ of the original lattice and the free energy $F'$ of the primed lattice:

$$ F' + G = F . \tag{7.3.44b} $$

The constant $G$ is independent of the configuration of the $\{\sigma'\}$ and is determined by equation (7.3.43a).

Important examples of such transformations are decimation transformations, as well as linear and nonlinear block-spin transformations. The simplest realization consists of

$$ T\{\sigma', \sigma\} = \prod_{i'\in\Omega'}\frac12\big(1 + \sigma'_{i'}t_{i'}(\sigma)\big) , \tag{7.3.45} $$

where $\Omega$ denotes the lattice sites of the initial lattice and $\Omega'$ those of the new lattice, and the function $t_{i'}(\sigma)$ determines the nature of the transformation.

α) Decimation Transformation (Fig. 7.11)

$$ t_{i'}\{\sigma\} = \zeta\sigma_{i'} , \qquad b = \sqrt2 , \qquad \zeta = b^{(d-2+\eta)/2} , \tag{7.3.46a} $$

where $\zeta$ rescales the amplitude of the remaining spins. Then, $\langle\sigma'_{x'}\sigma'_{0}\rangle = \zeta^2\langle\sigma_x\sigma_0\rangle$.


Fig. 7.11. A decimation transformation

β) Linear Block-Spin Transformation (on a triangular lattice, Fig. 7.12)

$$ t_{i'}\{\sigma\} = p\,(\sigma_{i'}^1 + \sigma_{i'}^2 + \sigma_{i'}^3) , \qquad b = \sqrt3 , \qquad p = \frac13(\sqrt3)^{\eta/2} = 3^{-1+\eta/4} . \tag{7.3.46b} $$

Fig. 7.12. A block-spin transformation

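γ) Nonlinear Block-Spin Transformation

$$ t_{i'}\{\sigma\} = p\,(\sigma_{i'}^1 + \sigma_{i'}^2 + \sigma_{i'}^3) + q\,\sigma_{i'}^1\sigma_{i'}^2\sigma_{i'}^3 . \tag{7.3.46c} $$

An important special case:

$$ p = -q = \frac12 , \qquad \sigma'_{i'} = \mathrm{sgn}(\sigma_{i'}^1 + \sigma_{i'}^2 + \sigma_{i'}^3) . $$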
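That the choice $p = -q = \frac12$ amounts to a majority rule can be verified by enumerating the eight configurations of one block of three spins. The sketch below is illustrative (function names are assumptions): it evaluates the block function of (7.3.46c), compares it with the sign of the block sum, and checks that the single-block factor of (7.3.45) sums to one over $\sigma' = \pm1$, as required by (7.3.43b).

```python
import itertools

def t_nonlinear(s1, s2, s3, p=0.5, q=-0.5):
    """Block function of Eq. (7.3.46c) for one block of three spins."""
    return p * (s1 + s2 + s3) + q * s1 * s2 * s3

def weight(sigma_prime, s1, s2, s3):
    """Single-block factor (1 + sigma' t(sigma)) / 2 of Eq. (7.3.45)."""
    return 0.5 * (1 + sigma_prime * t_nonlinear(s1, s2, s3))

for spins in itertools.product((-1, 1), repeat=3):
    majority = 1 if sum(spins) > 0 else -1
    total = weight(+1, *spins) + weight(-1, *spins)
    print(spins, t_nonlinear(*spins), majority, total)
```

For this special case the weight is 1 when $\sigma'$ equals the majority sign and 0 otherwise, so each block configuration is mapped onto exactly one block spin.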

These so-called real-space renormalization procedures were introduced by Niemeijer and van Leeuwen¹². The simplified variant given in Sect. 7.3.3 is from¹³. The block-spin transformation for a square Ising lattice is described in¹⁴. For a detailed discussion with additional references, we refer to the article by Niemeijer and van Leeuwen¹⁵.

¹² Th. Niemeijer and J. M. J. van Leeuwen, Phys. Rev. Lett. 31, 1411 (1973).
¹³ K. G. Wilson, Rev. Mod. Phys. 47, 773 (1975).
¹⁴ M. Nauenberg and B. Nienhuis, Phys. Rev. Lett. 33, 344 (1974).
¹⁵ Th. Niemeijer and J. M. J. van Leeuwen, in Phase Transitions and Critical Phenomena Vol. 6, Eds. C. Domb and M. S. Green, p. 425, Academic Press, London 1976.

∗7.4 The Ginzburg–Landau Theory

7.4.1 Ginzburg–Landau Functionals

The Ginzburg–Landau theory is a continuum description of phase transitions. Experience and the preceding theoretical considerations in this chapter show that the microscopic details such as the lattice structure, the precise form of the interactions, etc. are unimportant for the critical behavior, which manifests itself at distances which are much greater than the lattice constant. Since we are interested only in the behavior at small wavenumbers, we can go to a macroscopic continuum description, roughly analogous to the transition from microscopic electrodynamics to continuum electrodynamics. In setting up the Ginzburg–Landau functional, we will make use of an intuitive justification; a microscopic derivation is given in Appendix E (see also problem 7.15).

We start with a ferromagnetic system consisting of Ising spins ($n = 1$) on a $d$-dimensional lattice. The generalization to arbitrary dimensions is interesting for several reasons. First, it contains the physically relevant dimensions, three and two. Second, it may be seen that certain approximation methods are exact above four dimensions; this gives us the possibility of carrying out perturbation expansions around the dimension four (Sect. 7.4.5).

Instead of the spins $S_l$ on the lattice, we introduce a continuum magnetization

$$ m(x) = \frac{1}{\tilde Na_0^d}\sum_l g(x - x_l)S_l . \tag{7.4.1} $$

Here, $g(x - x_l)$ is a weighting function, which is equal to one within a cell with $\tilde N$ spins and is zero outside it. The linear dimension of this cell, $a_c$, is supposed to be much larger than the lattice constant $a_0$ but much smaller than the length $L$ of the crystal, i.e. $a_0 \ll a_c \ll L$. The function $g(x - x_l)$ is assumed to vary continuously from the value 1 to 0, so that $m(x)$ is a continuous function of $x$; see Fig. 7.13.
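The coarse-graining in (7.4.1) can be illustrated on a one-dimensional chain. The following sketch makes several simplifying assumptions that are not in the text: $d = 1$, $a_0 = 1$, a random spin configuration, and a sharp (top-hat) weighting function instead of the smoothly varying $g$, so that $m$ reduces to the plain average over each cell of $\tilde N$ spins. The function and variable names are illustrative.

```python
import random

def coarse_grain(spins, cell_size):
    """Cell averages m = (1 / N_cell) * sum of the spins in each cell (top-hat g, a0 = 1, d = 1)."""
    cells = [spins[i:i + cell_size] for i in range(0, len(spins), cell_size)]
    return [sum(cell) / len(cell) for cell in cells]

random.seed(0)
spins = [random.choice((-1, 1)) for _ in range(1200)]   # hypothetical spin configuration
cell_size = 100                                         # N-tilde spins per cell
m = coarse_grain(spins, cell_size)

# Integrating m(x) over the chain (each cell has length cell_size * a0)
# recovers the total spin of the lattice.
integral_m = sum(value * cell_size for value in m)
print(m)
print(integral_m, sum(spins))
```

The agreement of the two printed totals reflects the normalization $\int d^dx\, g(x - x_l) = \tilde N a_0^d$ of the weighting function.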

Fig. 7.13. The weighting function g(y) along one of the d cartesian coordinates

Making use of

$$ \int d^dx\,g(x - x_l) = \tilde Na_0^d $$

and of the definition (7.4.1), we can rewrite the Zeeman term as follows:

$$ \sum_l hS_l = h\sum_l\frac{1}{\tilde Na_0^d}\int d^dx\,g(x - x_l)S_l = \int d^dx\,h\,m(x) . \tag{7.4.2} $$

From the canonical density matrix for the spins, we obtain the probability density for the configurations $m(x)$. Generally, we have

$$ \mathcal{P}[m(x)] = \Big\langle\delta\Big(m(x) - \frac{1}{\tilde Na_0^d}\sum_l g(x - x_l)S_l\Big)\Big\rangle . \tag{7.4.3} $$

For $\mathcal{P}[m(x)]$, we write

$$ \mathcal{P}[m(x)] \propto e^{-\mathcal{F}[m(x)]/kT} , \tag{7.4.4} $$

in which the Ginzburg–Landau functional $\mathcal{F}[m(x)]$ enters; it is a kind of Hamiltonian for the magnetization $m(x)$. The tendency towards ferromagnetic ordering due to the exchange interaction must express itself in the form of the functional $\mathcal{F}[m(x)]$

$$ \mathcal{F}[m(x)] = \int d^dx\Big[am^2(x) + \frac b2m^4(x) + c\big(\nabla m(x)\big)^2 - hm(x)\Big] . \tag{7.4.5} $$

In the vicinity of $T_c$, only configurations of $m(x)$ with small absolute values should be important, and therefore the Taylor expansion (7.4.5) should be allowed.

Before we turn to the coefficients in (7.4.5), we make a few remarks about the significance of this functional. Due to the averaging (7.4.1), short-wavelength variations of $S_l$ do not contribute to $m(x)$. The long-wavelength variations, however, with wavelengths larger than the cell size $a_c$, are reflected fully in $m(x)$. The partition function of the magnetic system therefore has the form

$$ Z = Z_0(T)\int\mathcal{D}[m(x)]\,e^{-\mathcal{F}[m(x)]/kT} . \tag{7.4.6} $$

Here, the functional integral $\int\mathcal{D}[m(x)]\ldots$ refers to a sum over all the possible configurations of $m(x)$ with the probability density $e^{-\mathcal{F}[m(x)]/kT}$. One can represent $m(x)$ by means of a Fourier series, obtaining the sum over all configurations by integration over all the Fourier components. The factor $Z_0(T)$ is due to the (short-wavelength) configurations of the spin system, which do not contribute to $m(x)$. The evaluation of the functional integral which occurs in the partition function (7.4.6) is of course a highly nontrivial problem and will be carried out in the following Sections 7.4.2 and 7.4.5 using approximation methods. The free energy is

$$ F = -kT\log Z . \tag{7.4.7} $$


We now come to the coefficients in the expansion (7.4.5). First of all, this expansion took into account the fact that $\mathcal{F}[m(x)]$ has the same symmetry as the microscopic spin Hamiltonian, i.e. aside from the Zeeman term, $\mathcal{F}[m(x)]$ is an even function of $m(x)$. Owing to (7.4.2), the field $h$ expresses itself only in the Zeeman term, $-\int d^dx\,h\,m(x)$, and the coefficients $a$, $b$, $c$ are independent of $h$. For reasons of stability, large values of $m(x)$ must have a small statistical weight, which requires that $b > 0$. If for some system $b \le 0$, the expansion must be extended to higher orders in $m(x)$. These circumstances occur in first-order phase transitions and at tricritical points. The ferromagnetic exchange interaction has a tendency to orient the spins uniformly. This leads to the term $c\,\nabla m\,\nabla m$ with $c > 0$, which suppresses inhomogeneities in the magnetization. Finally, we come to the values of $a$. For $h = 0$ and a uniform $m(x) = m$, the probability weight $e^{-\beta\mathcal{F}}$ is shown in Fig. 7.14.

Fig. 7.14. The probability density $e^{-\beta\mathcal{F}}$ as a function of a uniform magnetization. (a) For $a > 0$ ($T > T_c^0$) and (b) for $a < 0$ ($T < T_c^0$)

When $a > 0$, then the most probable configuration is $m = 0$; when $a < 0$, then the most probable configuration is $m \ne 0$. Thus, $a$ must change its sign,

$$ a = a'(T - T_c^0) , \tag{7.4.8} $$

in order for the phase transition to occur. Due to the nonlinear terms and to fluctuations, the real $T_c$ will differ from $T_c^0$. The coefficients $b$ and $c$ are finite at $T_c^0$.

If one starts from a Heisenberg model instead of from an Ising model, the replacements

$$ S_l \to \mathbf{S}_l \quad\text{and}\quad m(x) \to \mathbf{m}(x) , \qquad m^4(x) \to \big(\mathbf{m}(x)^2\big)^2 , \qquad (\nabla m)^2 \to \nabla_\alpha\mathbf{m}\,\nabla_\alpha\mathbf{m} \tag{7.4.9} $$

must be made, leading to Eq. (7.4.10). Ginzburg–Landau functionals can be introduced for every type of phase transition. It is also not necessary to


attempt a microscopic derivation: the form is determined in most cases from knowledge of the symmetry of the order parameter. Thus, the Ginzburg–Landau theory was first applied to the case of superconductivity long before the advent of the microscopic BCS theory. The Ginzburg–Landau theory was also particularly successful in treating superconductivity, because here simple approximations (see Sect. 7.4.2) are valid even close to the transition (see also Sect. 7.4.4).

7.4.2 The Ginzburg–Landau Approximation

We start with the Ginzburg–Landau functional for an order parameter with $n$ components, $\mathbf{m}(x)$, $n = 1, 2, \ldots$:

$$ \mathcal{F}[\mathbf{m}(x)] = \int d^dx\Big[a\mathbf{m}^2(x) + \frac12b\big(\mathbf{m}(x)^2\big)^2 + c(\nabla\mathbf{m})^2 - \mathbf{h}(x)\mathbf{m}(x)\Big] . \tag{7.4.10} $$

The integration extends over a volume $L^d$. The most probable configuration of $\mathbf{m}(x)$ is given by the stationary state which is determined by

$$ \frac{\delta\mathcal{F}}{\delta\mathbf{m}(x)} = 2\big(a + b\mathbf{m}(x)^2 - c\nabla^2\big)\mathbf{m}(x) - \mathbf{h}(x) = 0 . \tag{7.4.11} $$

Let $\mathbf{h}$ be independent of position and let us take $\mathbf{h}$ to lie in the $x_1$-direction without loss of generality, $\mathbf{h} = h\mathbf{e}_1$ ($h \gtrless 0$); then the uniform solution is found from

$$ 2\big(a + b\mathbf{m}^2\big)\mathbf{m} - h\mathbf{e}_1 = 0 . \tag{7.4.12} $$

We discuss special cases:

(i) $h \to 0$: spontaneous magnetization and specific heat

When there is no applied field, (7.4.12) has the following solutions:

$$ \mathbf{m} = 0 \quad\text{for}\quad a > 0 ; \qquad (\mathbf{m} = 0)\ \text{and}\ \mathbf{m} = \pm\mathbf{e}_1m_0 ,\ \ m_0 = \sqrt{\frac{-a}{b}} \quad\text{for}\quad a < 0 . \tag{7.4.13} $$

The (Gibbs) free energy for the configurations (7.4.13) is¹⁶

$$ F(T, h = 0) = \mathcal{F}[0] = 0 \quad\text{for}\quad T > T_c^0 \tag{7.4.14a} $$

$$ F(T, h = 0) = \mathcal{F}[m_0] = -\frac12\frac{a^2}{b}L^d \quad\text{for}\quad T < T_c^0 . \tag{7.4.14b} $$

¹⁶ Instead of really computing the functional integral $\int\mathcal{D}[\mathbf{m}(x)]\,e^{-\mathcal{F}[\mathbf{m}(x)]/kT}$ as is required by (7.4.6) and (7.4.7) for the determination of the free energy, $\mathbf{m}(x)$ was replaced everywhere by its most probable value.
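The uniform solutions and the corresponding free energy can be checked by a direct minimization of the free-energy density $am^2 + \frac b2m^4$ at $h = 0$. The sketch below uses a brute-force grid search, which is an illustrative choice rather than an efficient method; the parameter values for $a$ and $b$ are hypothetical.

```python
import math

def f_uniform(m, a, b):
    """Free-energy density of a uniform magnetization at h = 0: a m^2 + (b/2) m^4."""
    return a * m**2 + 0.5 * b * m**4

def most_probable_m(a, b, m_max=3.0, n=60001):
    """Brute-force minimum of f_uniform on a grid of m >= 0 (illustrative, not efficient)."""
    grid = [m_max * i / (n - 1) for i in range(n)]
    return min(grid, key=lambda m: f_uniform(m, a, b))

b = 1.0
for a in (0.5, -0.5):      # a > 0 corresponds to T > Tc0, a < 0 to T < Tc0
    m_num = most_probable_m(a, b)
    m_exact = math.sqrt(-a / b) if a < 0 else 0.0
    f_exact = -0.5 * a**2 / b if a < 0 else 0.0
    print(a, m_num, m_exact, f_uniform(m_num, a, b), f_exact)
```

Multiplying the minimal density by the volume $L^d$ gives the free energies (7.4.14a,b).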


We will always leave off the regular term $F_{\text{reg}} = -kT\log Z_0$. The state $\mathbf{m} = 0$ would have a higher free energy for $T < T_c^0$ than the state $m_0$; therefore, $\mathbf{m} = 0$ was already put in parentheses in (7.4.13). For $T < T_c^0$, we thus find a finite spontaneous magnetization. The onset of this magnetization is characterized by the critical exponent $\beta$, which here takes on the value $\beta = \frac12$ (Fig. 7.15).

Fig. 7.15. The spontaneous magnetization in the Ginzburg–Landau approximation

Specific Heat

From (7.4.14a,b), we immediately find the specific heat

$$ c_{h=0} = T\Big(\frac{\partial S}{\partial T}\Big)_{h=0} = -T\Big(\frac{\partial^2F}{\partial T^2}\Big)_{h=0} = \begin{cases} 0 & T > T_c^0 \\ T\,\dfrac{a'^2}{b}L^d & T < T_c^0 \end{cases} , \tag{7.4.15} $$

with $a'$ from (7.4.8). The specific heat exhibits a jump

$$ \Delta c_{h=0} = T_c^0\,\frac{a'^2}{b}L^d , \tag{7.4.16} $$

and the critical exponent $\alpha$ is therefore zero (see Eq. (7.1.1)), $\alpha = 0$.

(ii) The equation of state for $h > 0$ and the susceptibility

We decompose $\mathbf{m}$ into a longitudinal part, $\mathbf{e}_1m_1$, and a transverse part, $\mathbf{m}_\perp = (0, m_2, \ldots, m_n)$. Evidently, Eq. (7.4.12) gives

$$ \mathbf{m}_\perp = 0 \tag{7.4.17} $$

and the magnetic equation of state

$$ h = 2(a + bm_1^2)m_1 . \tag{7.4.18} $$

We can simplify this in limiting cases:

α) $T = T_c^0$

$$ h = 2bm_1^3 , \quad\text{i.e.}\quad \delta = 3 . \tag{7.4.19} $$

β) $T > T_c^0$

$$ m_1 = \frac{h}{2a} + O(h^3) . \tag{7.4.20} $$


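γ) $T < T_c^0$: $m_1 = m_0\,\mathrm{sgn}(h) + \Delta m$ yields

$$ m_1 = m_0\,\mathrm{sgn}(h) + \frac{h}{4bm_0^2} + O\big(h^2\,\mathrm{sgn}(h)\big) = m_0\,\mathrm{sgn}(h) + \frac{h}{-4a} + O\big(h^2\,\mathrm{sgn}(h)\big) . \tag{7.4.21} $$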
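The three limiting forms can be compared with a numerical solution of the equation of state (7.4.18). The following sketch solves the cubic by bisection for a small positive field; the parameter values and the function name are assumptions made for illustration only.

```python
import math

def solve_m1(h, a, b, tol=1e-14):
    """Solve h = 2 (a + b m^2) m for the root with the sign of h (h > 0 here) by bisection."""
    f = lambda m: 2 * (a + b * m**2) * m - h
    lo = math.sqrt(-a / b) if a < 0 else 0.0   # f(lo) = -h < 0
    hi = lo + 1.0
    while f(hi) < 0:                           # expand until the root is bracketed
        hi *= 2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) < 0 else (lo, mid)
    return 0.5 * (lo + hi)

b, h = 1.0, 1e-4                               # hypothetical parameter values
m_crit = solve_m1(h, 0.0, b)                   # T = Tc0
m_above = solve_m1(h, 0.5, b)                  # T > Tc0
m_below = solve_m1(h, -0.5, b)                 # T < Tc0
m0 = math.sqrt(0.5 / b)

print(m_crit, (h / (2 * b)) ** (1 / 3))        # compare with (7.4.19)
print(m_above, h / (2 * 0.5))                  # compare with (7.4.20)
print(m_below, m0 + h / (4 * b * m0**2))       # compare with (7.4.21)
```

The remaining differences are of the order of the neglected terms, $O(h^3)$ above and $O(h^2)$ below $T_c^0$.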

We can now also calculate the magnetic susceptibility for $h = 0$, either by differentiating the equation of state (7.4.18),

$$ 2\big(a + 3bm_1^2\big)\frac{\partial m_1}{\partial h} = 1 , $$

or directly, by inspection of (7.4.20) and (7.4.21). It follows that the isothermal susceptibility is given by

$$ \chi_T = \Big(\frac{\partial m_1}{\partial h}\Big)_T = \begin{cases} \dfrac{1}{2a} & T > T_c^0 \\ \dfrac{1}{4|a|} & T < T_c^0 . \end{cases} \tag{7.4.22} $$

The critical exponent $\gamma$ has, as in molecular field theory, a value of $\gamma = 1$.

7.4.3 Fluctuations in the Gaussian Approximation

7.4.3.1 Gaussian Approximation

Next we want to investigate the influence of fluctuations of the magnetization. To this end, we first expand the Ginzburg–Landau functional in terms of the deviations from the most probable state up to second order

$$ \mathbf{m}(x) = m_1\mathbf{e}_1 + \mathbf{m}'(x) , \tag{7.4.23} $$

where

$$ \mathbf{m}'(x) = L^{-d/2}\sum_{k\in B}\mathbf{m}_k\,e^{ikx} \tag{7.4.24} $$

characterizes the deviation from the most probable value. Because of the underlying cell structure, the summation over $k$ is restricted to the Brillouin zone $B$: $-\frac{\pi}{a_c} < k_i < \frac{\pi}{a_c}$. The condition that $\mathbf{m}(x)$ be real yields

$$ \mathbf{m}_k^* = \mathbf{m}_{-k} . \tag{7.4.25} $$

A) $T > T_c^0$ and $h = 0$:

In this region, $m_1 = 0$, and the Fourier series (7.4.24) diagonalizes the harmonic part $\mathcal{F}_h$ of the Ginzburg–Landau functional

$$ \mathcal{F}_h = \int d^dx\big(a\mathbf{m}'^2 + c(\nabla\mathbf{m}')^2\big) = \sum_k(a + ck^2)\,\mathbf{m}_k\mathbf{m}_{-k} . \tag{7.4.26} $$

We can now readily calculate the partition function (7.4.6) in the Gaussian approximation above $T_c^0$:

$$ Z_G = Z_0\int\prod_kd\mathbf{m}_k\,e^{-\beta\mathcal{F}_h} . \tag{7.4.27} $$

We decompose $\mathbf{m}_k$ into real and imaginary parts, finding for each $k$ and each of the $n$ components of $\mathbf{m}_k$ a Gaussian integral, so that

$$ Z_G = Z_0\prod_k\bigg(\sqrt{\frac{\pi}{\beta(a + ck^2)}}\bigg)^{n} \tag{7.4.28} $$

results, and thus the free energy (the stationary solution $m_1 = 0$ makes no contribution) is

$$ F(T, 0) = F_0 - kT\,\frac n2\sum_k\log\frac{\pi}{\beta(a + ck^2)} . \tag{7.4.29} $$

The specific heat, using $\sum_k\cdots = \frac{V}{(2\pi)^d}\int d^dk\ldots$ and Eq. (7.4.8), is then

$$ c_{h=0} = -T\,\frac{\partial^2F/L^d}{\partial T^2} = k\,\frac n2\,(Ta')^2\int\frac{d^dk}{(2\pi)^d}\frac{1}{(a + ck^2)^2} + \ldots . \tag{7.4.30} $$

The dots stand for less singular terms. We define the quantity

$$ \xi = \Big(\frac ca\Big)^{1/2} = \Big(\frac{c}{a'}\Big)^{1/2}(T - T_c^0)^{-1/2} , \tag{7.4.31} $$

which diverges in the limit $T \to T_c^0$ and will be found to represent the correlation length in the calculation of the correlation function (7.4.47). By introducing $q = \xi k$ into (7.4.30) as a new integration variable, we find the singular part of the specific heat

$$ c^{\text{sing.}}_{h=0} = \tilde A_+\,\xi^{4-d} \tag{7.4.32} $$

with the amplitude

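$$ \tilde A_+ = k\,\frac n2\Big(\frac{Ta'}{c}\Big)^2\int_{q<\Lambda\xi}\frac{d^dq}{(2\pi)^d}\frac{1}{(1 + q^2)^2} . $$

Here, the cases of dimension $d$ below and above 4 must be distinguished. For $d > 4$, the radial integral is rewritten as

$$ \int_0^{\Lambda\xi}dq\,\frac{q^{d-1}}{(1 + q^2)^2} = \int_0^{\Lambda\xi}dq\,\Big(\frac{q^{d-1}}{(1 + q^2)^2} - q^{d-5}\Big) + \int_0^{\Lambda\xi}dq\,q^{d-5} = -\int_0^{\Lambda\xi}dq\,\frac{q^{d-5} + 2q^{d-3}}{(1 + q^2)^2} + \frac{1}{d-4}(\Lambda\xi)^{d-4} . $$

The overall result is summarized in (7.4.35):

$$ c^{\text{sing}}_{h=0} = \begin{cases} A_+\,(T - T_c^0)^{-\frac{4-d}{2}} & d < 4 \\ \sim\log(T - T_c^0) & d = 4 \\ A - B\,(T - T_c^0)^{\frac{d-4}{2}} & d > 4 . \end{cases} \tag{7.4.35} $$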
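The behavior of the radial integral $\int_0^{\Lambda\xi}dq\,q^{d-1}/(1+q^2)^2$ for different $d$ can be made concrete numerically. The sketch below is illustrative only (a composite Simpson rule with an assumed number of subintervals and assumed cutoff values): for $d = 3$ the integral converges as the cutoff grows, and for $d = 5$ the subtraction of $q^{d-5}$ described above is checked at a finite cutoff.

```python
import math

def simpson(f, a, b, n=200000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def radial_integral(d, cutoff):
    """Integral of q^(d-1) / (1 + q^2)^2 from 0 to cutoff (= Lambda * xi)."""
    return simpson(lambda q: q ** (d - 1) / (1 + q * q) ** 2, 0.0, cutoff)

# d = 3: the integral converges as the cutoff grows (the limit is pi/4).
print(radial_integral(3, 1000.0), math.pi / 4)

# d = 5: check the subtraction used for d > 4 at a finite cutoff.
d, cutoff = 5, 50.0
lhs = radial_integral(d, cutoff)
rest = simpson(lambda q: (q ** (d - 5) + 2 * q ** (d - 3)) / (1 + q * q) ** 2, 0.0, cutoff)
rhs = -rest + cutoff ** (d - 4) / (d - 4)
print(lhs, rhs)
```

For $d > 4$ the cutoff-dependent term $(\Lambda\xi)^{d-4}/(d-4)$ combines with the prefactor $\xi^{4-d}$ to a temperature-independent constant, which is why only a cusp remains in that case.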

For $d \le 4$, the specific heat diverges at $T_c$; for $d > 4$, it exhibits a cusp. The amplitude $A_+$ for $d < 4$ is given by

$$ A_+ = k\,\frac n2\,T^2K_d\Big(\frac{a'}{c}\Big)^{d/2}\int_0^\infty dq\,\frac{q^{d-1}}{(1 + q^2)^2} . \tag{7.4.36} $$

Below $d = 4$, the critical exponent of the specific heat is ($c_{h=0} \sim (T - T_c)^{-\alpha}$)

$$ \alpha = \frac12(4 - d) ; \tag{7.4.37} $$


in particular, for $d = 3$ in the Gaussian approximation, $\alpha = \frac12$. Comparison with exact results and experiments shows that the Gaussian approximation overestimates the fluctuations.

B) $T < T_c^0$

Now we turn to the region $T < T_c^0$ and distinguish between the longitudinal ($m_1$) and the transverse components ($m_i$):

$$m_1(x) = m_1 + m_1'(x) , \qquad m_i(x) = m_i'(x) \quad \text{for} \quad i \ge 2 \qquad (7.4.38)$$

with the Fourier components $m'_{1k}$ and $m'_{ik}$, where the latter are present only for $n \ge 2$. In the present context, including non-integer values of $d$, vectors will be denoted by just $x$, etc. From (7.4.10), we find for the Ginzburg–Landau functional in second order in the fluctuations:

$$F_h[m] = F[m_1] + \sum_k \left\{ \left( -2a + \frac{3h}{2m_1} + O(h^2) + ck^2 \right) |m'_{1k}|^2 + \sum_{i \ge 2} \left( \frac{h}{2m_1} + ck^2 \right) |m'_{ik}|^2 \right\} . \qquad (7.4.39)$$

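To arrive at this expression, the following ancillary calculation was used:

$$a \left( m_1^2 + 2 m_1 m_1' + m_1'^2 + m_\perp'^2 \right) + \frac{b}{2} \left( m_1^4 + 4 m_1^3 m_1' + 6 m_1^2 m_1'^2 + 2 m_1^2 m_\perp'^2 \right) - h (m_1 + m_1')$$
$$= a m_1^2 + \frac{b}{2} m_1^4 - h m_1 + \left( a + 3 b m_1^2 \right) m_1'^2 + \underbrace{\left( a + b m_1^2 \right)}_{h/(2m_1)} m_\perp'^2 .$$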

Analogously to the computation leading from (7.4.26) to (7.4.29), we find for the free energy of the low-temperature phase at $h = 0$

$$F(T, 0) = F_0(T, h) + F_{\rm G.L.}(T, 0) - kT \, \frac{1}{2} \sum_k \left\{ \log \frac{\pi}{\beta(2|a| + ck^2)} + (n - 1) \log \frac{\pi}{\beta c k^2} \right\} . \qquad (7.4.40)$$

The first term results from $Z_0$; the second from $F[m_1]$, the stationary solution considered in the Ginzburg–Landau approximation; the third term from the longitudinal fluctuations; and the fourth from the transverse fluctuations. The latter do not contribute to the specific heat, since their energy is temperature independent for $h = 0$:

$$c_{h=0} = T \frac{a'^2}{b} + \tilde A_- \, \xi^{4-d} = T \frac{a'^2}{b} + A_- (T_c - T)^{-\frac{4-d}{2}} , \qquad (7.4.41)$$

where the low-temperature correlation length


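$$\xi = \left( \sqrt{\frac{2|a|}{c}} \right)^{-1} = \left(\frac{c}{2a'}\right)^{1/2} (T_c^0 - T)^{-1/2} , \qquad T < T_c^0 \qquad (7.4.42)$$

is to be inserted. The amplitudes in (7.4.23) and (7.4.41) obey the relations

$$\tilde A_- = \frac{4}{n} \tilde A_+ , \qquad A_- = \frac{2^{d/2}}{n} A_+ . \qquad (7.4.43)$$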

The ratio of the amplitudes of the singular contribution to the specific heat depends only on the number of components $n$ and the spatial dimension $d$, and is in this sense universal. The transverse fluctuations do not contribute to the specific heat below $T_c$; therefore, the factor $\frac1n$ enters the amplitude ratio.

7.4.3.2 Correlation Functions

We now calculate the correlation functions in the Gaussian approximation. We start by considering $T > T_c^0$. In order to calculate this type of quantity, which we shall meet repeatedly later, we introduce the generating functional

$$Z[h] = \frac{1}{Z_G} \int \prod_k dm_k \, e^{-\beta F_h + \sum_k h_k m_{-k}} = \frac{1}{Z_G} \int \prod_k dm_k \, e^{-\beta \sum_k (a + ck^2) |m_k|^2 + \sum_k h_k m_{-k}} . \qquad (7.4.44)$$

To evaluate the Gaussian integrals in (7.4.44), we introduce the substitution

$$\tilde m_k = m_k - \frac{1}{2\beta} (a + ck^2)^{-1} h_k , \qquad (7.4.45)$$

obtaining

$$Z[h] = \exp \left\{ \frac{1}{4\beta} \sum_k \frac{1}{a + ck^2} h_k h_{-k} \right\} . \qquad (7.4.46)$$

Evidently,

$$\langle m_k m_{-k'} \rangle = \left. \frac{\partial^2}{\partial h_{-k} \, \partial h_{k'}} Z[h] \right|_{h=0} ,$$

from which we find the correlation function by making use of (7.4.46):

$$\langle m_k m_{-k'} \rangle = \delta_{kk'} \frac{1}{2\beta(a + ck^2)} \equiv \delta_{k,k'} \, G(k) . \qquad (7.4.47)$$
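The result (7.4.47) can be illustrated by direct sampling of a single Fourier mode. The sketch below is an illustration under stated assumptions, not the book's procedure: the function names are hypothetical, and the mode $m_k = x + iy$ is sampled with the weight $e^{-2\beta(a + ck^2)(x^2 + y^2)}$, where the factor 2 arises because the terms $k$ and $-k$ in the exponent of (7.4.44) are equal.

```python
import random


def sampled_mode_variance(beta, a, c, k, n_samples=200_000, seed=0):
    """Monte Carlo estimate of <|m_k|^2> for one Gaussian mode m_k = x + i*y."""
    rng = random.Random(seed)
    stiffness = a + c * k * k
    # exp(-2*beta*stiffness*x^2) is a Gaussian with variance 1/(4*beta*stiffness)
    sigma = (1.0 / (4.0 * beta * stiffness)) ** 0.5
    acc = 0.0
    for _ in range(n_samples):
        x = rng.gauss(0.0, sigma)
        y = rng.gauss(0.0, sigma)
        acc += x * x + y * y
    return acc / n_samples


def gaussian_propagator(beta, a, c, k):
    """G(k) = 1 / (2*beta*(a + c*k^2)) as in Eq. (7.4.47)."""
    return 1.0 / (2.0 * beta * (a + c * k * k))


estimate = sampled_mode_variance(beta=1.0, a=0.5, c=1.0, k=0.7)
print(estimate, gaussian_propagator(1.0, 0.5, 1.0, 0.7))
```

The real and imaginary parts each contribute $1/(4\beta(a + ck^2))$, which adds up to $G(k)$.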


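Here, we have taken into account the fact that in the sum over $k$ in (7.4.46), each term $h_k h_{-k} = h_{-k} h_k$ occurs twice. From the last equation, the meaning of the correlation length (7.4.31) becomes clear, since in real space, Eq. (7.4.47) gives

$$\langle m(x) m(x') \rangle = \frac{1}{L^d} \sum_k \frac{e^{ik(x - x')}}{2\beta(a + ck^2)} = \int \frac{d^dk}{(2\pi)^d} \frac{e^{ik(x - x')}}{2\beta c (\xi^{-2} + k^2)} = \frac{\xi^{2-d}}{2\beta c} \int \frac{d^dq}{(2\pi)^d} \frac{e^{iq(x - x')/\xi}}{1 + q^2} .$$

[...]

For $T < T_c^0$ and $n > 1$, we distinguish between the longitudinal correlation function and the transverse ($i \ge 2$) correlation function:

$$G_\parallel(k) = \langle m'_{1k} m'_{1-k} \rangle \quad \text{and} \quad G_\perp(k) = \langle m'_{ik} m'_{i-k} \rangle . \qquad (7.4.52)$$

For $n = 1$, only $G_\parallel(k)$ is relevant. From (7.4.39), it follows in analogy to (7.4.47) that

$$G_\parallel(k) = \frac{1}{2\beta \left[ -2a + \frac{3h}{2m_1} + ck^2 \right]} \xrightarrow{h \to 0} \frac{1}{2\beta \left[ 2a'(T_c^0 - T) + ck^2 \right]} \qquad (7.4.53)$$

and

$$G_\perp(k) = \frac{1}{2\beta \left[ \frac{h}{2m_1} + ck^2 \right]} \xrightarrow{h \to 0} \frac{1}{2\beta c k^2} , \qquad (7.4.54a)$$

$$G_\perp(0) = \frac{T m_1}{h} . \qquad (7.4.54b)$$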


The divergence of the transverse susceptibility (correlation function) (7.4.54a) at $h = 0$ is a result of rotational invariance, owing to which it costs no energy to rotate the magnetization. We first want to summarize the results of the Gaussian approximation, then treat the limits of its validity, and finally, in Sect. 7.4.4.1, discuss the form of the correlation functions below $T_c^0$ in a more general way. In summary, for the critical exponents we have

$$\alpha_{\rm Fluct} = 2 - \frac{d}{2} , \quad \beta = \frac{1}{2} , \quad \gamma = 1 , \quad \delta = 3 , \quad \nu = \frac{1}{2} , \quad \eta = 0 \qquad (7.4.55)$$

and for the amplitude ratios of the specific heat, the longitudinal correlation function and the isothermal susceptibility:

$$\frac{\tilde A_+}{\tilde A_-} = \frac{n}{4} , \qquad \frac{\tilde C_+}{\tilde C_-} = 1 , \qquad \text{and} \qquad \frac{C_+}{C_-} = 2 . \qquad (7.4.56)$$

The amplitudes are defined in (7.4.32), (7.4.41), (7.4.57), and (7.4.58):

$$G(k) = \tilde C_\pm \frac{\xi^2}{1 + (\xi k)^2} , \qquad \tilde C_\pm = \frac{1}{2\beta c} , \qquad (7.4.57)$$

$$\chi = C_\pm |T - T_c|^{-1} , \qquad T \gtrless T_c . \qquad (7.4.58)$$

7.4.3.3 Range of Validity of the Gaussian Approximation

The range of validity of the Gaussian approximation and of more elaborate perturbation-theoretical calculations can be estimated by comparing the higher orders with lower orders. For example, the fourth order must be much smaller than the second, or the Gaussian contribution to the specific heat must be smaller than the stationary value. The Ginzburg–Landau approximation is permissible if the fluctuations are small compared to the stationary value, i.e. from Eqns. (7.4.16) and (7.4.41),

$$\Delta c \gg \xi^{4-d} \left(\frac{T a'}{c}\right)^2 N , \qquad (7.4.59)$$

where $N$ is a numerical factor. Then we require that

$$\tau^{(4-d)/2} \gg \frac{N}{\xi_0^d \, \Delta c} \qquad (7.4.60)$$

with $\tau = \frac{T - T_c^0}{T_c^0}$ and $\xi_0 = \sqrt{\frac{c}{a' T_c^0}}$. For dimensions $d < 4$, the Ginzburg–Landau approximation fails near $T_c^0$. From (7.4.60), we find a characteristic temperature $\tau_{GL} = \left( \frac{N}{\xi_0^d \Delta c} \right)^{2/(4-d)}$,


Table 7.3. The correlation length and the critical region

System               $\xi_0$              $\tau_{GL}$
Superconductors17    $\sim 10^3$ Å        $10^{-10}$ – $10^{-14}$
Magnets              $\sim$ Å             $\sim 10^{-2}$
λ-Transition         $\sim 4$ Å           $\sim 0.3$

the so-called Ginzburg–Levanyuk temperature; it depends on the Ginzburg–Landau parameters (see Table 7.3). In this connection, $d_c = 4$ appears as a limiting dimension (upper critical dimension). For $d < 4$, the Ginzburg–Landau approximation fails when $\tau < \tau_{GL}$. It is then no longer sufficient to add the fluctuation contribution; instead, one has to take interactions between the fluctuations into account. Above four dimensions, the corrections to the Gaussian approximation on approaching $T_c^0$ become smaller, so that there, the Gaussian approximation applies. For $d > 4$, the exponent of the fluctuation contribution is negative, from Eq. (7.4.35): $\alpha_{\rm Fluct} < 0$. Then the ratio $\frac{c_{h=0}(T_c^0)}{\Delta c}$ can be $\gtrless 1$.

7.4.4 Continuous Symmetry and Phase Transitions of First Order

7.4.4.1 Susceptibilities for T < Tc

A) Transverse Susceptibility

We found for the transverse correlation function (7.4.54a) that $G_\perp(k) = \frac{1}{2\beta [\frac{h}{2m_1} + ck^2]}$, and we now want to show that the relation $G_\perp(0) = \frac{Tm}{h}$ is a general result of rotational invariance. To this end, we imagine that an external field $h$ acts on a ferromagnet. Now we investigate the influence of an additional infinitesimal, transverse field $\delta h$ which is perpendicular to $h$:

$$\sqrt{(h + \delta h)^2} = \sqrt{h^2 + \delta h^2} = h + \frac{\delta h^2}{2h} + \dots .$$

Thus, the magnitude of the field is changed by only $O(\delta h^2)$; for a small $\delta h$, this is equivalent to a rotation of the field through the angle $\frac{\delta h}{h}$ (Fig. 7.16).

Fig. 7.16. The field h and the additional, infinitesimal transverse field δh

17 According to BCS theory, $\xi_0 \sim 0.18 \, \frac{\hbar v_F}{k T_c}$. In pure metals, $m = m_e$, $v_F = 10^8 \, \frac{\rm cm}{\rm s}$, $T_c$ is low, $\xi_0 = 1000 - 16.000$ Å. The A-15 compounds Nb$_3$Sn and V$_3$Ga have flat bands, so that $m$ is large, $v_F = 10^6 \, \frac{\rm cm}{\rm s}$, $T_c$ is higher, and $\xi_0 = 50$ Å. The situation is different in high-$T_c$ superconductors; there, $\xi_0 \sim$ Å.


The magnetization rotates through the same angle; this means that $\frac{\delta m}{m} = \frac{\delta h}{h}$, and we obtain for the transverse susceptibility

$$\chi_\perp \equiv \frac{\delta m}{\delta h} = \frac{m}{h} . \qquad (7.4.61)$$
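The rotation argument leading to (7.4.61) can be checked numerically. This is a minimal sketch under stated assumptions: the magnetization is taken to be parallel to the total field with a magnitude that depends only on the field strength, and the equation of state `m_of_h` below is a hypothetical placeholder (any smooth function with $m(h) > 0$ would do).

```python
import math


def magnetization(h_vec, m_of_h):
    """Magnetization vector parallel to h_vec with magnitude m_of_h(|h_vec|)."""
    h_abs = math.hypot(h_vec[0], h_vec[1])
    m_abs = m_of_h(h_abs)
    return (m_abs * h_vec[0] / h_abs, m_abs * h_vec[1] / h_abs)


def transverse_susceptibility(h, dh, m_of_h):
    """Finite-difference estimate of delta m / delta h for a small transverse field dh."""
    m_vec = magnetization((h, dh), m_of_h)
    return m_vec[1] / dh


def m_of_h(h):
    # Hypothetical equation of state, used only for illustration.
    return 1.0 + 0.3 * math.sqrt(h)


h = 0.05
chi_perp = transverse_susceptibility(h, 1e-6, m_of_h)
print(chi_perp, m_of_h(h) / h)
```

The two printed numbers agree up to corrections of order $(\delta h / h)^2$, independently of the form chosen for `m_of_h`.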

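The transverse correlation function in the Gaussian approximation (7.4.54a) is in agreement with this general result.

Remarks concerning the spatial dependence of the transverse correlation function $G_\perp(r)$:

(i)

$$G_\perp(r, h = 0) = \frac{1}{2\beta c} \int \frac{d^dk}{(2\pi)^d} \frac{e^{ikx}}{k^2} = A \left( \frac{\xi_\perp}{r} \right)^{d-2} , \qquad \xi_\perp = (2\beta c)^{-\frac{1}{d-2}} \qquad (7.4.62)$$

Employing the volume element $d^dk = dk \, k^{d-1} (\sin\theta)^{d-2} \, d\theta \, d\Omega_{d-1}$, the integral in (7.4.62) becomes

$$\frac{\Omega_{d-1}}{(2\pi)^d} \int_0^\infty dk \, k^{d-1} \int_0^\pi d\theta \, (\sin\theta)^{d-2} \frac{e^{ikr\cos\theta}}{2\beta c k^2} = \frac{K_{d-1}}{2\beta c \, 2\pi} \int_0^\infty dk \, k^{d-3} \, \Gamma\!\left(\frac{d}{2} - \frac{1}{2}\right) \Gamma\!\left(\frac{1}{2}\right) \frac{J_{\frac{d}{2}-1}(kr)}{\left(\frac{kr}{2}\right)^{\frac{d}{2}-1}} \sim r^{-(d-2)} .$$ 18

For dimensional reasons, $G_\perp(r)$ must be of the form

$$G_\perp(r) \sim M^2 \left( \frac{\xi}{r} \right)^{d-2} ,$$

i.e. the transverse correlation length from Eq. (7.4.62) is

$$\xi_\perp = \xi M^{\frac{2}{d-2}} \propto \tau^{-\nu} \tau^{\frac{2\beta}{d-2}} = \tau^{\eta\nu/(d-2)} , \qquad (7.4.63)$$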

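where the exponent was rearranged using the scaling relations.

(ii) We also compute the local transverse fluctuations of the magnetization from

$$G_\perp(r = 0) \sim \int_0^\Lambda \frac{dk \, k^{d-1}}{\frac{h}{2m_1} + ck^2} \sim \frac{1}{c} \left( \sqrt{\frac{2 m_1 c}{h}} \right)^{-d+2} \int_0^{\sqrt{\frac{2 m_1 c}{h}} \Lambda} dq \, \frac{q^{d-1}}{1 + q^2}$$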

18

I.S. Gradshteyn and I.M. Ryzhik, Table of Integrals, Series, and Products (Academic Press, New York 1980), Eq. 8.411.7


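and consider the limit $h \to 0$:

$$G_\perp(r = 0) \sim \begin{cases} \text{finite} & \text{for } d > 2 \\ \log h & \text{for } d = 2 \\ \left( \frac{m_1}{h} \right)^{\frac{2-d}{2}} \xrightarrow{h \to 0} \infty \ \text{ if } m_1 \ne 0 & \text{for } d < 2 . \end{cases}$$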
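The three cases can be made concrete by evaluating the integral $\int_0^\Lambda dk \, k^{d-1} / (\frac{h}{2m_1} + ck^2)$ for $d = 1, 2, 3$. The sketch below uses elementary closed forms worked out for this illustration (they are not quoted from the text); the function name and the default parameter values $m_1 = c = \Lambda = 1$ are arbitrary assumptions.

```python
import math


def local_transverse_fluctuation(d, h, m1=1.0, c=1.0, cutoff=1.0):
    """Closed forms of int_0^cutoff dk k^(d-1) / (h/(2*m1) + c*k^2) for d = 1, 2, 3."""
    g = h / (2.0 * m1)
    x = cutoff * math.sqrt(c / g)
    if d == 1:
        return math.atan(x) / math.sqrt(g * c)
    if d == 2:
        return math.log1p(x * x) / (2.0 * c)
    if d == 3:
        return cutoff / c - math.sqrt(g / c) * math.atan(x) / c
    raise ValueError("only d = 1, 2, 3 are implemented in this sketch")


for h in (1e-2, 1e-4, 1e-6, 1e-8):
    print(h, [local_transverse_fluctuation(d, h) for d in (1, 2, 3)])
```

As $h$ decreases, the $d = 3$ value saturates at $\Lambda/c$, the $d = 2$ value grows by a fixed amount per decade of $h$ (a logarithm), and the $d = 1$ value grows like $h^{-1/2}$.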

For $d \le 2$, the transverse fluctuations diverge in the limit $h \to 0$. As a result, for $d \le 2$, we must have $m_1 = 0$.

B) Longitudinal Correlation Function

In the Gaussian approximation, we found for $T < T_c$ in Eq. (7.4.53) that

$$\lim_{k \to 0} \lim_{h \to 0} G_\parallel(k) = \frac{1}{-4\beta a}$$

as for $n = 1$. In fact, one would expect that the strong transverse fluctuations would modify the behavior of $G_\parallel(k)$. Going beyond the Gaussian approximation, we now calculate the contribution of orientation fluctuations to the longitudinal fluctuations. We consider a rotation of the magnetization at the point $x$ and decompose the change $\delta m$ into a component $\delta m_1$ parallel and a vector $\delta m_\perp$ perpendicular to $m_0$ (Fig. 7.17). The invariance of the length yields the condition $m_0^2 = m_0^2 + 2 m_0 \delta m_1 + \delta m_1^2 + (\delta m_\perp)^2$; and it follows from this, owing to $|\delta m_1| \ll m_0$, that

$$\delta m_1 = - \frac{1}{2m_0} (\delta m_\perp)^2 . \qquad (7.4.64)$$

Fig. 7.17. The rotation of the spontaneous magnetization in isotropic systems

For the correlation of the longitudinal fluctuations, one obtains from this the following relation to the transverse fluctuations:

$$\langle \delta m_1(x) \, \delta m_1(0) \rangle = \frac{1}{4m_0^2} \langle \delta m_\perp^2(x) \, \delta m_\perp^2(0) \rangle . \qquad (7.4.65)$$

We now factor this correlation function into the product of two transverse correlation functions, Eq. (7.4.54a), and obtain from it the Fourier-transformed longitudinal correlation function

$$G_\parallel(k = 0) \sim \int d^dx \, \frac{e^{-2r/\sqrt{m_1/h}}}{r^{(d-2)2}} \sim \left( \sqrt{\frac{h}{m_1}} \right)^{d-4} \sim h^{\frac{d}{2} - 2} . \qquad (7.4.66)$$

In three dimensions, we find from this for the longitudinal susceptibility

$$kT \frac{\partial m_1}{\partial h} = G_\parallel(k = 0) \sim h^{-\frac{1}{2}} . \qquad (7.4.66')$$

In the vicinity of the critical point $T_c$, we found $m \sim h^{\frac{1}{\delta}}$ (see just after Eq. (7.2.4b)); in contrast to (7.4.66), this yields

$$\frac{\partial m_1}{\partial h} \sim h^{-\frac{\delta - 1}{\delta}} .$$

In isotropic systems, the longitudinal susceptibility is singular not only in the critical region, but throughout the whole coexistence region for h → 0 (cf. Fig. 7.18). This is a result of rotational invariance.

Fig. 7.18. Singularities in the longitudinal susceptibility in systems with internal rotational symmetry, n ≥ 2

C) Coexistence Singularities

The term coexistence region denotes the region of the phase diagram with a finite magnetization in the limiting case of $h \to 0$. The coexistence singularities found in (7.4.54a), (7.4.62), and (7.4.66) for isotropic systems are exactly valid. This can be shown as follows: for $T < T_c^0$, the Ginzburg–Landau functional can be written in the form

$$F[m] = \int d^dx \left\{ \frac{1}{2} b \left( m^2 - \frac{|a|}{b} \right)^2 + c (\nabla m)^2 - h m - \frac{|a|^2}{2b} \right\}$$
$$= \int d^dx \left\{ \frac{1}{2} b \left( m_1^2 + 2 m_1 m_1'(x) + m_1'(x)^2 + m_\perp'(x)^2 - \frac{|a|}{b} \right)^2 + c \left( \nabla m_1'(x) \right)^2 + c \left( \nabla m_\perp'(x) \right)^2 - h \left( m_1 + m_1'(x) \right) - \frac{|a|^2}{2b} \right\} . \qquad (7.4.67)$$

In this expression, we have inserted (7.4.38) and have combined the components $m_i'(x)$, $i \ge 2$ into a vector of the transverse fluctuations $m_\perp'(x) = (0, m_2'(x), \dots, m_n'(x))$. Using (7.4.18) and $m_1'(x) \ll m_1$, one obtains

$$F[m] = \int d^dx \left\{ \frac{1}{2} b \left( 2 m_1 m_1' + m_\perp'^2 + \frac{h}{2 b m_1} \right)^2 + c \left( \nabla m_1' \right)^2 + c \left( \nabla m_\perp' \right)^2 - h \left( m_1 + m_1' \right) - \frac{|a|^2}{2b} \right\} . \qquad (7.4.68)$$

The terms which are nonlinear in the transverse fluctuations are absorbed into the longitudinal terms by making the substitution

$$m_1' = m_1'' - \frac{m_\perp'^2}{2m_1} : \qquad (7.4.69)$$

$$F[m] = \int d^dx \left\{ 2 b m_1^2 m_1''^2 + c \left( \nabla m_1'' \right)^2 + \frac{h}{2m_1} m_\perp'^2 + c \left( \nabla m_\perp' \right)^2 - h m_1 + \frac{h^2}{8 b m_1^2} - \frac{|a|^2}{2b} \right\} . \qquad (7.4.70)$$

The final result for the free energy is harmonic in the variables $m_1''$ (the shifted longitudinal fluctuation introduced in (7.4.69)) and $m_\perp'$. As a result, the transverse propagator in the coexistence region is given exactly by (7.4.54a). The longitudinal correlation function is

$$\langle m_1'(x) m_1'(0) \rangle_C = \langle m_1''(x) m_1''(0) \rangle + \frac{1}{4m_1^2} \langle m_\perp'(x)^2 \, m_\perp'(0)^2 \rangle_C . \qquad (7.4.71)$$

In equation (7.4.70), terms of the form $\left( \nabla \frac{m_\perp'^2}{m_1} \right)^2$ and $\nabla m_1'' \, \nabla \frac{m_\perp'^2}{m_1}$ have been neglected. The second term in (7.4.69) leads to a reduction of the order parameter, $- \frac{\langle m_\perp'^2 \rangle}{2m_1}$. Eq. (7.4.71) gives the cumulant, i.e. the correlation function of the deviations from the mean value. Since (7.4.70) now contains only harmonic terms, the factorization of the second term in the sum in (7.4.71) is exact, as used in Eq. (7.4.65). One could still raise the objection to the derivation of (7.4.71) that a number of terms were neglected. However, using renormalization group theory19, it can be shown that the anomalies of the coexistence region are described by a low-temperature fixed point at which $m_0 = \infty$. This means that the result is asymptotically exact.

7.4.4.2 First-Order Phase Transitions

There are systems in which not only the transition from one orientation of the order parameter to the opposite direction is of first order, but also the transition at $T_c$. This means that the order parameter jumps at $T_c$ from zero to a finite value (an example is the ferroelectric transition in BaTiO$_3$). This situation can be described in the Ginzburg–Landau theory, if $b < 0$,

19

I. D. Lawrie, J. Phys. A14, 2489 (1981); ibid., A18, 1141 (1985); U. C. Täuber and F. Schwabl, Phys. Rev. B46, 3337 (1992).


and if a term of the form $\frac12 v m^6$ with $v > 0$ is added for stability. Then the Ginzburg–Landau functional takes the form

$$F = \int d^dx \left\{ a m^2 + c \left( \nabla m \right)^2 + \frac{1}{2} b m^4 + \frac{1}{2} v m^6 \right\} , \qquad (7.4.72)$$

where $a = a'(T - T_c^0)$. The free energy density is shown in Fig. 7.19 for a uniform order parameter.

Fig. 7.19. The free energy density in the vicinity of a ﬁrst-order phase transition at temperatures T < Tc0 , T ≈ Tc0 , T = Tc , T < T1 , T > T1

For $T > T_1$, there is only the minimum at $m = 0$, that is the non-ordered state. At $T_1$, a second relative minimum appears, which for $T \le T_c$ finally becomes deeper than that at $m = 0$. For $T < T_c^0$, the $m = 0$ state is unstable. The stationarity condition is

$$\left( a + b m^2 + 3 \frac{v}{2} m^4 \right) m = 0 , \qquad (7.4.73)$$

and the condition that a minimum is present is

$$\frac{1}{2} \frac{\partial^2 f}{\partial m^2} = a + 3 b m^2 + 15 \frac{v}{2} m^4 > 0 . \qquad (7.4.74)$$

The solutions of the stationarity condition are

$$m_0 = 0 \qquad (7.4.75a)$$

and

$$m_0^2 = - \frac{b}{3v} \overset{+}{\scriptstyle(-)} \left( \frac{b^2}{9v^2} - \frac{2a}{3v} \right)^{1/2} . \qquad (7.4.75b)$$

We recall that b < 0. The nonzero solution with the minus sign corresponds to a maximum in the free energy and will be left out of further consideration.


The minimum (7.4.75b) exists for all temperatures for which the discriminant is positive, i.e. below the temperature $T_1$,

$$T_1 = T_c^0 + \frac{b^2}{6 v a'} . \qquad (7.4.76)$$

$T_1$ is the superheating temperature (see Fig. 7.19 and below). The transition temperature $T_c$ is found from the condition that the free energy for (7.4.75b) is zero. At this temperature (see Fig. 7.19), the free energy has a double zero at $m^2 = m_0^2$ and thus has the form

$$\frac{v}{2} \left( m^2 - m_0^2 \right)^2 m^2 = \left( a + \frac{b}{2} m^2 + \frac{v}{2} m^4 \right) m^2 = \left[ \frac{v}{2} \left( m^2 + \frac{b}{2v} \right)^2 - \frac{b^2}{8v} + a \right] m^2 = 0 .$$

It follows from this that $a = \frac{b^2}{8v}$ and $m^2 = - \frac{b}{2v}$, which lead to

$$T_c = T_c^0 + \frac{b^2}{8 v a'} . \qquad (7.4.77)$$

For $T < T_c^0$, there is a local maximum at $m = 0$. $T_c^0$ plays the role of a supercooling temperature. In the range $T_c^0 \le T \le T_1$, both phases can thus coexist, i.e. the supercooling or superheating of a phase is possible. For $T_c^0 \le T < T_c$, the non-ordered phase ($m_0 = 0$) is metastable; for $T_1 \ge T > T_c$, in contrast, the ordered phase ($m_0 \ne 0$) is metastable. On slow cooling, so that the system attains the state of lowest free energy, $m_0$ jumps at $T_c$ from 0 to

$$m_0^2(T_c) = - \frac{b}{3v} + \left( \frac{b^2}{9v^2} - \frac{b^2}{12v^2} \right)^{1/2} = - \frac{b}{2v} , \qquad (7.4.78)$$

and, below $T_c$, it has the temperature dependence (Fig. 7.20)

$$m_0^2(T) = \frac{2}{3} m_0^2(T_c) \left[ 1 + \sqrt{1 - \frac{3}{4} \frac{(T - T_c^0)}{(T_c - T_c^0)}} \, \right] .$$
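The characteristic temperatures and the jump of the order parameter follow directly from the coefficients of (7.4.72). The following sketch collects them for a uniform order parameter; the function names and the numerical values of $a'$, $b$, $v$ and $T_c^0$ are hypothetical choices made only for illustration.

```python
def free_energy_density(m, t, a_prime, b, v, tc0):
    """f(m) = a*m^2 + (b/2)*m^4 + (v/2)*m^6 with a = a_prime * (t - tc0)."""
    a = a_prime * (t - tc0)
    return a * m**2 + 0.5 * b * m**4 + 0.5 * v * m**6


def characteristic_temperatures(a_prime, b, v, tc0):
    """Return (T1, Tc) from Eqs. (7.4.76) and (7.4.77); requires b < 0 < v, a_prime > 0."""
    if not (b < 0 < v and a_prime > 0):
        raise ValueError("this sketch assumes b < 0, v > 0 and a_prime > 0")
    t1 = tc0 + b * b / (6.0 * v * a_prime)
    tc = tc0 + b * b / (8.0 * v * a_prime)
    return t1, tc


def ordered_minimum_sq(t, a_prime, b, v, tc0):
    """m0^2 of the ordered minimum, Eq. (7.4.75b) with the plus sign; None above T1."""
    a = a_prime * (t - tc0)
    disc = b * b / (9.0 * v * v) - 2.0 * a / (3.0 * v)
    if disc < 0.0:
        return None
    return -b / (3.0 * v) + disc**0.5


params = dict(a_prime=1.0, b=-2.0, v=1.0, tc0=1.0)  # illustrative values only
t1, tc = characteristic_temperatures(**params)
m0_sq = ordered_minimum_sq(tc, **params)
print(t1, tc, m0_sq, free_energy_density(m0_sq**0.5, tc, **params))
```

At $T_c$ the ordered minimum sits at $m_0^2 = -b/(2v)$ and its free energy density equals that of the state $m = 0$, which is the condition used to derive (7.4.77).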

Fig. 7.20. The temperature dependence of the magnetization in a first-order phase transition


7.4.5 The Momentum-Shell Renormalization Group

The RG theory can also be carried out in the framework of the G–L functional, with the following advantages compared to discrete spin models: the method is also practicable in higher dimensions, and various interactions and symmetries can be treated. One employs an expansion of the critical exponents in $\epsilon = 4 - d$. Here, we cannot go into the details of the necessary perturbation-theoretical techniques, but rather just show the essential structure of the renormalization group recursion relations and their consequences. For the detailed calculation, the reader is referred to more extensive descriptions20,21 and to the literature at the end of this chapter.

7.4.5.1 Wilson's RG Scheme

We now turn to the renormalization group transformation for the Ginzburg–Landau functional (7.4.10). In order to introduce the notation which is usual in this context, we carry out the substitutions

$$m = \frac{1}{\sqrt{2c}} \phi , \quad a = rc , \quad b = uc^2 \quad \text{and} \quad h \to \sqrt{2c} \, h , \qquad (7.4.79)$$

and obtain the so-called Landau–Ginzburg–Wilson functional:

$$F[\phi] = \int d^dx \left\{ \frac{r}{2} \phi^2 + \frac{u}{4} (\phi^2)^2 + \frac{1}{2} (\nabla\phi)^2 - h\phi \right\} . \qquad (7.4.80)$$

An intuitively appealing method of proceeding was proposed by K. G. Wilson20,21 . Essentially, the trace over the degrees of freedom with large k in momentum space is evaluated, and one thereby obtains recursion relations for the Ginzburg–Landau coeﬃcients. Since it is to be expected that the detailed form of the short-wavelength ﬂuctuations is not of great importance, the Brillouin zone can be approximated as simply a d-dimensional sphere of radius (cutoﬀ) Λ, Fig. 7.21.

Fig. 7.21. The momentum-space RG: the partial trace is performed over the Fourier components $\phi_k$ with momenta within the shell $\Lambda/b < |k| < \Lambda$

20 Wilson, K. G. and Kogut, J., Phys. Rep. 12 C, 76 (1974).
21 S. Ma, Modern Theory of Critical Phenomena, Benjamin, Reading, 1976.


The momentum-shell RG transformation then consists of the following steps:

(i) Evaluating the trace over all the Fourier components $\phi_k$ with $\Lambda/b < |k| < \Lambda$ (Fig. 7.21) eliminates these short-wavelength modes.

(ii) By means of a scale transformation22

$$k' = bk , \qquad (7.4.81)$$

$$\phi' = b^\zeta \phi , \qquad (7.4.82)$$

and therefore

$$\phi'_{k'} = b^{\zeta - d} \phi_k , \qquad (7.4.83)$$

the resulting effective Hamiltonian functional can be brought into a form resembling the original model, whereby effective scale-dependent coupling parameters are defined. Repeated application of this RG transformation (which represents a semigroup, since it has no inverse element) discloses the presumably universal properties of the long-wavelength regime. As in the real-space renormalization group transformation of Sect. 7.3.3, the fixed points of the transformation correspond to the various thermodynamic phases and the phase transitions between them. The eigenvalues of the linearized flow equations in the vicinity of the critical fixed point finally yield the critical exponents (see (7.3.41a,b,c)). Although a perturbational expansion (in terms of u) is in no way justifiable in the critical region, it is completely legitimate at some distance from the critical point, where the fluctuations are negligible. The important observation is now that the RG flow connects these quite different regions, so that the results of the perturbation expansion in the non-critical region can be transported to the vicinity of Tc, whereby the non-analytic singularities are consistently, controllably, and reliably taken into account by this mapping. Perturbation-theoretical methods can likewise be applied in the elimination of the short-wavelength degrees of freedom (step (i)).

7.4.5.2 Gaussian Model

We will now apply the concept described in the preceding section first to the Gaussian model, where u = 0 (see Sect. 7.4.3),

22

If one considers (7.4.83) together with the ﬁeld term in the Ginzburg–Landau functional (7.4.10), then it can be seen that the exponent ζ determines the transformation of the external ﬁeld and is related to yh from Sect. (7.3.4) via ζ = d − yh .


$F_0[\phi_k] = \sum_{|k|}$ [...] $| < \Lambda$, the result to first order in $u$ includes terms of the following (symbolically written) form (from now on, we set $kT$ equal to 1): (i) $u \, \phi_<^4 \, e^{-F_0}$ must merely be re-exponentiated, since these degrees of freedom are not eliminated; (ii) all terms with an odd number of $\phi_<$ or $\phi_>$, such as for example $u \, \phi_<^3 \phi_> \, e^{-F_0}$, vanish; (iii) $u \, \phi_>^4 \, e^{-F_0}$ makes a constant contribution to the free energy and finally to $u \, \phi_<^2 \phi_>^2 \, e^{-F_0}$, for which the Gaussian integral over the $\phi_>$ can be carried out with the aid of Eq. (7.4.47) for the propagator $\langle \phi^\alpha_{k>} \phi^\beta_{-k'>} \rangle_0 = \frac{\delta_{kk'} \, \delta^{\alpha\beta}}{2(r + k^2)}$, an average value which is calculated with the statistical weight $e^{-F_0}$.


Quite generally, Wick's theorem20,21 states that expressions of the form

$$\Big\langle \prod_i^m \phi_{k_i >} \Big\rangle_0 \equiv \langle \phi_{k_1 >} \phi_{k_2 >} \dots \phi_{k_m >} \rangle_0$$

factorize into a sum of products of all possible pairs $\langle \phi_{k>} \phi_{-k>} \rangle_0$ if $m$ is even, and otherwise they yield zero. Especially in the treatment of higher orders of perturbation theory, the Feynman diagrams offer a very helpful representation of the large number of contributions which have to be summed in the perturbation expansion. In these diagrams, lines symbolize the propagators and interaction vertices stand for the nonlinear coupling $u$. With these means at our disposal, we can compute the two-point function $\langle \phi_{k<} \phi_{-k<} \rangle$ and the similarly defined four-point function. Using Eq. (7.4.47), one then obtains in the first non-trivial order ("1-loop", a notation which derives from the graphical representation) the following recursion relation between the initial coefficients $r$, $u$ and the transformed coefficients $r'$, $u'$ of the Ginzburg–Landau–Wilson functional20,21:

$$r' = b^2 \left[ r + (n + 2) A(r) \, u \right] , \qquad (7.4.88)$$

$$u' = b^{4-d} u \left[ 1 - (n + 8) C(r) \, u \right] , \qquad (7.4.89)$$

where $A(r)$ and $C(r)$ refer to the integrals

$$A(r) = K_d \int_{\Lambda/b}^{\Lambda} \frac{k^{d-1}}{r + k^2} \, dk = K_d \left[ \Lambda^{d-2} (1 - b^{2-d})/(d - 2) - r \Lambda^{d-4} (1 - b^{4-d})/(d - 4) \right] + O(r^2) ,$$

$$C(r) = K_d \int_{\Lambda/b}^{\Lambda} \frac{k^{d-1}}{(r + k^2)^2} \, dk = K_d \Lambda^{d-4} (1 - b^{4-d})/(d - 4) + O(r) ,$$

with $K_d = 1/(2^{d-1} \pi^{d/2} \Gamma(d/2))$, and the factors depending on the number $n$ of components of the order parameter field result from the combinatorial analysis in counting the equivalent possibilities for "contracting" the fields $\phi_{k>}$, i.e. for evaluating the integrals over the large momenta. We note that here again, Eq. (7.4.85) applies. Linearizing equations (7.4.88) and (7.4.89) at the Gaussian fixed point $r^* = 0$, $u^* = 0$, one immediately finds the eigenvalues $y_\tau = 2$ and $y_u = 4 - d$. Then for $d > d_c = 4$, the nonlinearity $\propto u$ is seen to be irrelevant, and the mean field exponents are valid, as already surmised in Sect. 7.4.4. For $d < 4$ ($d_c = 4$ is the upper critical dimension), however, the fluctuations become relevant and each initial value $u \ne 0$ increases under the renormalization group transformation. In order to obtain the scaling behavior in this case, we must therefore search for a finite, non-trivial fixed point. This can be most easily done by introducing a differential flow, with $b = e^{\delta}$ and $\delta \to 0$,


making the number of RG steps effectively into a continuous variable, and studying the resulting differential recursion relations:

$$\frac{dr(\ell)}{d\ell} = 2 r(\ell) + (n + 2) u(\ell) K_d \Lambda^{d-2} - (n + 2) r(\ell) u(\ell) K_d \Lambda^{d-4} , \qquad (7.4.90)$$

$$\frac{du(\ell)}{d\ell} = (4 - d) u(\ell) - (n + 8) u(\ell)^2 K_d \Lambda^{d-4} . \qquad (7.4.91)$$

Now, a fixed point is defined by the condition $dr/d\ell = 0 = du/d\ell$.
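The flow of the coupling $u(\ell)$ under (7.4.91) can be followed numerically. The sketch below is an illustration, not the book's procedure: it integrates (7.4.91) with a fourth-order Runge–Kutta step (a choice made here), uses the expression for $K_d$ given above, and compares the result with the nonzero root of the right-hand side. The function names and the values $n = 1$, $d = 3$, $\Lambda = 1$ are assumptions for the example.

```python
import math


def k_d(d):
    """K_d = 1 / (2^(d-1) * pi^(d/2) * Gamma(d/2))."""
    return 1.0 / (2 ** (d - 1) * math.pi ** (d / 2) * math.gamma(d / 2))


def du_dl(u, n, d, cutoff=1.0):
    """Right-hand side of the flow equation (7.4.91)."""
    return (4 - d) * u - (n + 8) * u * u * k_d(d) * cutoff ** (d - 4)


def integrate_u(u0, n, d, l_max=40.0, dl=0.01, cutoff=1.0):
    """Integrate du/dl from l = 0 to l_max with fixed-step RK4."""
    u = u0
    for _ in range(int(round(l_max / dl))):
        k1 = du_dl(u, n, d, cutoff)
        k2 = du_dl(u + 0.5 * dl * k1, n, d, cutoff)
        k3 = du_dl(u + 0.5 * dl * k2, n, d, cutoff)
        k4 = du_dl(u + dl * k3, n, d, cutoff)
        u += dl * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return u


def nontrivial_root(n, d, cutoff=1.0):
    """Nonzero root of du/dl = 0."""
    return (4 - d) / ((n + 8) * k_d(d) * cutoff ** (d - 4))


for u0 in (0.1, 5.0):
    print(u0, integrate_u(u0, n=1, d=3), nontrivial_root(1, 3))
```

Both a small and a large positive initial coupling end up at the same value, since the right-hand side of (7.4.91) is positive below the root and negative above it.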

Fig. 7.22. Flow of the effective coupling $u(\ell)$, determined by the right-hand side of Eq. (7.4.91), which is plotted here as the ordinate. Both for initial values $u_0 > u_c^*$ and $0 < u_0 < u_c^*$, one finds $u(\ell) \to u_c^*$ for $\ell \to \infty$

Figure 7.22 shows the flow of $u(\ell)$ corresponding to Eq. (7.4.91); for any initial value $u_0 \ne 0$, one finds that asymptotically, i.e. for $\ell \to \infty$, the nontrivial fixed point

$$u_c^* K_d = \frac{\epsilon}{n + 8} \Lambda^\epsilon , \qquad \epsilon = 4 - d \qquad (7.4.92)$$

is approached; this should determine the universal critical properties of the model. As in the real-space renormalization in Sect. 7.3, the RG transformation via momentum-shell elimination generates new interactions; for example, terms $\propto \phi^6$ and $\nabla^2 \phi^4$, etc., which again influence the recursion relations for $r$ and $u$ in the succeeding steps. It turns out, however, that up to order $\epsilon^3$, these terms do not have to be taken into account.20,21 The original assumption that $u$ should be small, which justified the perturbation expansion, now means in light of Eq. (7.4.92) that the effective expansion parameter here is the deviation from the upper critical dimension, $\epsilon$. If one inserts (7.4.92) into Eq. (7.4.90) and includes terms up to $O(\epsilon)$, the result is

$$r_c^* = - \frac{n + 2}{2} u_c^* K_d \Lambda^{d-2} = - \frac{(n + 2)}{2(n + 8)} \, \epsilon \, \Lambda^2 . \qquad (7.4.93)$$

The physical interpretation of this result is that fluctuations lead to a lowering of the transition temperature. With $\tau = r - r_c^*$, the differential form of the flow equation

$$\frac{d\tau(\ell)}{d\ell} = \tau(\ell) \left[ 2 - (n + 2) u K_d \Lambda^{d-4} \right] \qquad (7.4.94)$$


finally yields the eigenvalue $y_\tau = 2 - (n + 2)\epsilon/(n + 8)$ in the vicinity of the critical point (7.4.92). In the one-loop order which we have described here, $O(\epsilon)$, one therefore finds for the critical exponent $\nu$ from Eq. (7.3.41e)

$$\nu = \frac{1}{2} + \frac{n + 2}{4(n + 8)} \epsilon + O(\epsilon^2) . \qquad (7.4.95)$$

Using the result $\eta = O(\epsilon^2)$ and the scaling relations (7.3.41a–d), one obtains the following expressions (the difference from the result (7.4.35) of the Gaussian approximation is remarkable)

$$\alpha = \frac{4 - n}{2(n + 8)} \epsilon + O(\epsilon^2) , \qquad (7.4.96)$$

$$\beta = \frac{1}{2} - \frac{3}{2(n + 8)} \epsilon + O(\epsilon^2) , \qquad (7.4.97)$$

$$\gamma = 1 + \frac{n + 2}{2(n + 8)} \epsilon + O(\epsilon^2) , \qquad (7.4.98)$$

$$\delta = 3 + \epsilon + O(\epsilon^2) \qquad (7.4.99)$$

to first order in the expansion parameter $\epsilon = 4 - d$. The first non-trivial contribution to the exponent $\eta$ appears in the two-loop order,

$$\eta = \frac{n + 2}{2(n + 8)^2} \epsilon^2 + O(\epsilon^3) . \qquad (7.4.100)$$
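The expressions (7.4.95) to (7.4.100) are simple enough to tabulate directly. The following sketch evaluates them with the higher-order terms dropped; the function name is hypothetical, and setting $\epsilon = 1$ (i.e. $d = 3$) in a truncated series is an extrapolation rather than a controlled estimate.

```python
def epsilon_expansion_exponents(n, eps):
    """Critical exponents from Eqs. (7.4.95)-(7.4.100), truncated at the orders shown there."""
    return {
        "nu": 0.5 + (n + 2) / (4 * (n + 8)) * eps,
        "alpha": (4 - n) / (2 * (n + 8)) * eps,
        "beta": 0.5 - 3 / (2 * (n + 8)) * eps,
        "gamma": 1 + (n + 2) / (2 * (n + 8)) * eps,
        "delta": 3 + eps,
        "eta": (n + 2) / (2 * (n + 8) ** 2) * eps**2,
    }


for n in (1, 2, 3):
    exps = epsilon_expansion_exponents(n, eps=1.0)
    print(n, {name: round(value, 4) for name, value in exps.items()})
```

At this order the truncated values satisfy $\alpha + 2\beta + \gamma = 2$ exactly, since the terms linear in $\epsilon$ cancel.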

The universality of these results manifests itself in the fact that they depend only on the spatial dimension $d$ and the number of components $n$ of the order parameter, and not on the original "microscopic" Ginzburg–Landau parameters.

Remarks:

(i) At the upper critical dimension, $d_c = 4$, an inverse power law is obtained as the solution of Eq. (7.4.91) instead of an exponential behavior, leading to logarithmic corrections to the mean-field exponents.

(ii) We also mention that for long-range interactions which exhibit power-law behavior $\propto |x|^{-(d+\sigma)}$, the critical exponents contain an additional dependence on the parameter $\sigma$.

(iii) In addition to the $\epsilon$-expansion, an expansion in terms of powers of $1/n$ is also possible. Here, the limit $n \to \infty$ corresponds to the exactly solvable spherical model.23 This $1/n$-expansion indeed helps to clarify some general aspects but its numerical accuracy is not very great, since precisely the small values of $n$ are of practical interest.

23 Shang-Keng Ma, Modern Theory of Critical Phenomena, Benjamin, Reading, 1976.


The differential recursion relations of the form (7.4.90) and (7.4.91) also serve as a basis for the treatment of more subtle issues such as the calculation of the scaling functions or the treatment of crossover phenomena within the framework of the RG theory. Thus, for example, an anisotropic perturbation in the n-component Heisenberg model favoring m directions leads to a crossover from the O(n)-Heisenberg fixed point24 to the O(m) fixed point.25 The instability of the former is described by the crossover exponent. For small anisotropic disturbances, to be sure, the flow of the RG trajectory passes very close to the unstable fixed point. This means that one finds the behavior of an n-component system far from the transition temperature Tc, before the system is finally dominated by the anisotropic critical behavior. The crossover from one RG fixed point to another can be represented (and measured) by the introduction of effective exponents. These are defined as logarithmic derivatives of suitable physical quantities. Other important perturbations which were treated within the RG theory are, on the one hand, cubic terms. They reflect the underlying crystal structure and contribute terms of fourth order in the cartesian components of φ to the Ginzburg–Landau–Wilson functional. On the other hand, dipolar interactions lead to a perturbation which alters the harmonic part of the theory.

7.4.5.4 More Advanced Field-Theoretical Methods

If one wishes to discuss perturbation theory in orders higher than the first or second, Wilson's momentum-shell renormalization scheme is not the best choice for practical calculations, in spite of its intuitively appealing properties. The technical reason for this is that the integrals in Fourier space involve nested momenta, which owing to the finite cutoff Λ are difficult to evaluate. It is then preferable to use a field-theoretical renormalization scheme with Λ → ∞. However, this leads to additional ultraviolet (UV) divergences of the integrals for $d \ge d_c$. At the critical dimension $d_c$, both ultraviolet and infrared (IR) singularities occur in combination in logarithmic form, [$\propto \log(\Lambda^2/r)$]. The idea is now to treat the UV divergences with the methods originally developed in quantum field theory and thus to arrive at the correct scaling behavior for the IR limit. In the formal implementation, one takes advantage of the fact that the original unrenormalized theory does not depend on the arbitrarily chosen renormalization point; as a consequence, one obtains the Callan–Symanzik or RG equations. These are partial differential equations which correspond to the differential flow equations in the Wilson scheme.

24 O(n) indicates invariance with respect to rotations in n-dimensional space, i.e. with respect to the group O(n).

25 See D. J. Amit, Field Theory, the Renormalization Group and Critical Phenomena, 2nd ed., World Scientific, Singapore, 1984, Chap. 5–3.


$\epsilon$-expansions have been carried out up to the seventh order;26 the series obtained is however only asymptotically convergent (the convergence radius of the perturbation expansion in $u$ clearly must be zero, since $u < 0$ corresponds to an unstable theory). The combination of the results from expansions to such a high order with the divergent asymptotic behavior and Borel resummation techniques yields critical exponents with an impressive precision; cf. Table 7.4.

Table 7.4. The best estimates for the static critical exponents γ, ν, β, and η for the O(n)-symmetric φ⁴ model in d = 2 and d = 3 dimensions, from ε-expansions up to high order in connection with Borel summation techniques.26 For comparison, the exact Onsager results for the 2d-Ising model are also shown. The limiting case n = 0 describes the statistical mechanics of polymers.

                            γ                 ν                  β                  η
d = 2   n = 0               1.39 ± 0.04       0.76 ± 0.03        0.065 ± 0.015      0.21 ± 0.05
        n = 1               1.73 ± 0.06       0.99 ± 0.04        0.120 ± 0.015      0.26 ± 0.05
        2d Ising (exact)    1.75              1                  0.125              0.25

d = 3   n = 0               1.160 ± 0.004     0.5885 ± 0.0025    0.3025 ± 0.0025    0.031 ± 0.003
        n = 1               1.239 ± 0.004     0.6305 ± 0.0025    0.3265 ± 0.0025    0.037 ± 0.003
        n = 2               1.315 ± 0.007     0.671 ± 0.005      0.3485 ± 0.0035    0.040 ± 0.003
        n = 3               1.390 ± 0.010     0.710 ± 0.007      0.368 ± 0.004      0.040 ± 0.003
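As a consistency check on the entries of Table 7.4, one can test them against the standard scaling relations $\gamma = \nu(2 - \eta)$ and $2\beta = \nu(d - 2 + \eta)$. The sketch below is an illustration added here: the relations are the usual ones from scaling theory (assumed to be among those the text cites as (7.3.41)), only the central values of the table are used, and the names `TABLE_7_4` and `scaling_residuals` are hypothetical.

```python
# (d, label, gamma, nu, beta, eta): central values copied from Table 7.4
TABLE_7_4 = [
    (2, "n=0", 1.39, 0.76, 0.065, 0.21),
    (2, "n=1", 1.73, 0.99, 0.120, 0.26),
    (2, "2d Ising (exact)", 1.75, 1.0, 0.125, 0.25),
    (3, "n=0", 1.160, 0.5885, 0.3025, 0.031),
    (3, "n=1", 1.239, 0.6305, 0.3265, 0.037),
    (3, "n=2", 1.315, 0.671, 0.3485, 0.040),
    (3, "n=3", 1.390, 0.710, 0.368, 0.040),
]


def scaling_residuals(d, gamma, nu, beta, eta):
    """Return (gamma - nu*(2 - eta), beta - nu*(d - 2 + eta)/2)."""
    return gamma - nu * (2.0 - eta), beta - 0.5 * nu * (d - 2.0 + eta)


for d, label, gamma, nu, beta, eta in TABLE_7_4:
    res_gamma, res_beta = scaling_residuals(d, gamma, nu, beta, eta)
    print(f"d={d} {label:18s} {res_gamma:+.4f} {res_beta:+.4f}")
```

The residuals vanish for the exact Ising row and are small compared with the exponents themselves for the other rows.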

7.5 Percolation

Scaling theories and renormalization group theories also play an important role in other branches of physics, whenever the characteristic length tends to infinity and structures occur on every length scale. Examples are percolation in the vicinity of the percolation threshold, polymers in the limit of a large number of monomers, the self-avoiding random walk, growth processes, and driven dissipative systems in the limit of slow growth rates (self-organized criticality). As an example of such a system which can be described in the language of critical phenomena, we will consider percolation.

7.5.1 The Phenomenon of Percolation

The phenomenon of percolation refers to problems of the following type: (i) Consider a landscape with hills and valleys, which gradually fills up with water. When the water level is low, lakes are formed; as the level rises, some of

26

J. C. Le Guillou and J. C. Zinn-Justin, J. Phys. Lett. 46 L, 137 (1985)

388

7. Phase Transitions, Renormalization Group Theory, and Percolation

the lakes join together until ﬁnally at a certain critical level (or critical area) of the water, a sea is formed which stretches from one end of the landscape to the other, with islands. (ii) Consider a surface made of an electrical conductor in which circular holes are punched in a completely random arrangement (Fig. 7.23a). Denoting the fraction of remaining conductor area by p, we ﬁnd for p > pc that there is still an electrical connection from one end of the surface to the other, while for p < pc , the pieces of conducting area are reduced to islands and no longer form continuous bridges, so that the conductivity of this disordered medium is zero. One refers to pc as the percolation threshold. Above pc , there is an inﬁnite “cluster”; below this limit, there are only ﬁnite clusters, whose average radius however diverges on approaching pc . Examples (i) and (ii) represent continuum percolation. Theoretically, one can model such systems on a discrete d-dimensional lattice. In fact, such discrete models also occur in Nature, e.g. in alloys.

Fig. 7.23. Examples of percolation (a) A perforated conductor (Swiss cheese model): continuum percolation; (b) site percolation; (c) bond percolation

(iii) Let us imagine a square lattice in which each site is occupied with a probability $p$ and is unoccupied with the probability $(1 - p)$. 'Occupied' can mean in this case that an electrical conductor is placed there and 'unoccupied' implies an insulator, or that a magnetic ion or a nonmagnetic ion is present, cf. Fig. 7.23b. Staying with the first interpretation, we find the following situation: for small $p$, the conductors form only small islands (electric current can flow only between neighboring sites) and the overall system is an insulator. As $p$ increases, the islands (clusters) of conducting sites get larger. Two lattice sites belong to the same cluster when there is a connection between them via occupied nearest neighbors. For large $p$ ($p \lesssim 1$) there are many conducting paths between the opposite edges and the system is a good conductor. At an intermediate concentration $p_c$, the percolation threshold or critical concentration, a connection is just formed, i.e. current can percolate from one edge of the lattice to the other. The critical concentration separates the insulating phase below $p_c$ from the conducting phase above $p_c$.

In the case of the magnetic example, at $p_c$ a ferromagnet is formed from a paramagnet, presuming that the temperature is sufficiently low. A further example is the occupation of the lattice sites by superconductors or normal conductors, in which case a transition from the normal conducting to the superconducting state takes place.

We have considered here some examples of site percolation, in which the lattice sites are stochastically occupied, Fig. 7.23b. Another possibility is that bonds between the lattice sites are stochastically present or are broken. One then refers to bond percolation (cf. Fig. 7.23c). Here, clusters made up of existing bonds occur; two bonds belong to the same cluster if there is a connection between them via existing bonds. Two examples of bond percolation are: (i) a macroscopic system with percolation properties can be produced from a stochastic network of resistors and connecting wires; (ii) a lattice of branched monomers can form bonds between individual monomers with a probability $p$. For $p < p_c$, finite macromolecules are formed, and for $p > p_c$, a network of chemical bonds extends over the entire lattice. This gelation process from a solution to a gel state is called the sol-gel transition (example: cooking or "denaturing" of an egg or a pudding); see Fig. 7.23.

Remarks:

(i) Questions related to percolation are also of importance outside physics, e.g. in biology. An example is the spread of an epidemic or a forest fire. An affected individual can infect a still-healthy neighbor within a given time step, with a probability $p$. The individual dies after one time step, but the infected neighbors could transmit the disease to other still living, healthy neighbors. Below the critical probability $p_c$, the epidemic dies out after a certain number of time steps; above this probability, it spreads further and further. In the case of a forest fire, one can think of a lattice which is occupied by trees with a probability $p$. When a tree burns, it ignites the neighboring trees within one time step and is itself reduced to ashes. For small values of $p$, the fire dies out after several time steps. For $p > p_c$, the fire spreads over the entire forest region, assuming that all the trees along one boundary were ignited. The remains consist of burned-out trees, empty lattice sites, and trees which were separated from their surroundings by a ring of empty sites so that they were never ignited. For $p > p_c$, the burned-out trees form an infinite cluster.

(ii) In Nature, disordered systems often occur. Percolation is a simple example of this, in which the occupation of the individual lattice sites is uncorrelated among the sites.

As emphasized above, these models for percolation can also be introduced on a d-dimensional lattice. The higher the spatial dimension, the more possible connected paths there are between sites; therefore, the percolation threshold $p_c$ decreases with increasing spatial dimension. The percolation threshold is also smaller for bond percolation than for site percolation, since a bond has more neighboring bonds than a lattice site has neighboring lattice sites (in a square lattice, 6 instead of 4). See Table 7.5.
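The site-percolation picture of example (iii) can be tried out numerically. The following is a minimal sketch, not a production simulation: it occupies the sites of a finite square lattice at random and uses a breadth-first search to test whether occupied nearest neighbors connect the top edge to the bottom edge. The function names, the lattice size of 40, and the number of trials are illustrative assumptions; on a finite lattice the transition is smeared out around the square-lattice site threshold 0.592 quoted in Table 7.5.

```python
import random
from collections import deque


def random_lattice(size, p, rng):
    """Occupy each site of a size x size square lattice independently with probability p."""
    return [[rng.random() < p for _ in range(size)] for _ in range(size)]


def spans_top_to_bottom(lattice):
    """True if occupied nearest-neighbor sites connect the top row to the bottom row."""
    size = len(lattice)
    seen = [[False] * size for _ in range(size)]
    queue = deque()
    # Start the search from every occupied site in the top row.
    for col in range(size):
        if lattice[0][col]:
            seen[0][col] = True
            queue.append((0, col))
    while queue:
        row, col = queue.popleft()
        if row == size - 1:
            return True
        for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            r, c = row + d_row, col + d_col
            if 0 <= r < size and 0 <= c < size and lattice[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue.append((r, c))
    return False


def spanning_fraction(size, p, trials, seed=0):
    """Fraction of random lattices (hypothetical helper) that contain a spanning cluster."""
    rng = random.Random(seed)
    hits = sum(spans_top_to_bottom(random_lattice(size, p, rng)) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for p in (0.45, 0.55, 0.59, 0.65, 0.75):
        print(p, spanning_fraction(size=40, p=p, trials=50))
```

Only nearest neighbors count as connected here, in line with the definition of a cluster given above; diagonal contacts do not join two sites.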


Table 7.5. Percolation thresholds and critical exponents for some lattices

Lattice             p_c (site)   p_c (bond)   β       ν       γ
one-dimensional     1            1            –       1       1
square              0.592        1/2          5/36    4/3     43/18
simple cubic        0.311        0.248        0.417   0.875   1.795
Bethe lattice       1/(z−1)      1/(z−1)      1       1       1
d = 6 hypercubic    0.107        0.0942       1       1/2     1
d = 7 hypercubic    0.089        0.0787       1       1/2     1

The percolation transition, in contrast to thermal phase transitions, has a geometric nature. When $p$ increases towards $p_c$, the clusters become larger and larger; at $p_c$, an infinite cluster is formed. Although this cluster already extends over the entire area, the fraction of sites which it contains is still zero at $p_c$. For $p > p_c$, more and more sites join the infinite cluster at the expense of the finite clusters, whose average radii decrease. For $p = 1$, all sites naturally belong to the infinite cluster. The behavior in the vicinity of $p_c$ exhibits many similarities to critical behavior in second-order phase transitions in the neighborhood of the critical temperature $T_c$. As discussed in Sect. 7.1, the magnetization increases below $T_c$ as $M \sim (T_c - T)^{\beta}$. In the case of percolation, the quantity corresponding to the order parameter is the probability $P_\infty$ that an occupied site (or an existing bond) belongs to the infinite cluster, Fig. 7.24. Accordingly,

$$ P_\infty \propto \begin{cases} 0 & \text{for } p < p_c \\ (p - p_c)^{\beta} & \text{for } p > p_c . \end{cases} \tag{7.5.1} $$

Fig. 7.24. $P_\infty$: order parameter (the strength of the infinite clusters); $S$: average number of sites in a finite cluster

The correlation length $\xi$ characterizes the linear dimension of the finite clusters (above and below $p_c$). More precisely, it is defined as the average distance between two occupied lattice sites in the same finite cluster. In the vicinity of $p_c$, $\xi$ behaves as

$$ \xi \sim |p - p_c|^{-\nu} . \tag{7.5.2} $$

A further variable is the average number of sites (bonds) in a finite cluster. It diverges as

$$ S \sim |p - p_c|^{-\gamma} \tag{7.5.3} $$

and corresponds to the magnetic susceptibility $\chi$; cf. Fig. 7.24.

Just as in a thermal phase transition, one expects that the critical properties (e.g. the values of $\beta$, $\nu$, $\gamma$) are universal, i.e. that they do not depend on the lattice structure or the kind of percolation (site, bond, continuum percolation). These critical properties do, however, depend on the spatial dimension of the system. The values of the exponents are collected in Table 7.5 for several different lattices. One can map the percolation problem onto an s-state Potts model, whereby the limit $s \to 1$ is to be taken.²⁷,²⁸ From this relation, it is understandable that the upper critical dimension for percolation is $d_c = 6$. The Potts model in its field-theoretical Ginzburg–Landau formulation contains a term of the form $\phi^3$; from it, following considerations analogous to the $\phi^4$ theory, the characteristic dimension $d_c = 6$ is derived.

The critical exponents $\beta$, $\nu$, $\gamma$ describe the geometric properties of the percolation transition. Furthermore, there are also dynamic exponents, which describe the transport properties such as the electrical conductivity of the perforated circuit board or of the disordered resistance network. The magnetic thermodynamic transitions in the vicinity of the percolation threshold can also be investigated.

7.5.2 Theoretical Description of Percolation

We consider clusters of size $s$, i.e. clusters containing $s$ sites. We denote the number of such s-clusters divided by the number of all lattice sites by $n_s$, and call this the (normalized) cluster number. Then $s\,n_s$ is the probability that an arbitrarily chosen site will belong to a cluster of size $s$. Below the percolation threshold ($p < p_c$), we have

$$ \sum_{s=1}^{\infty} s\, n_s = \frac{\text{number of all the occupied sites}}{\text{total number of lattice sites}} = p . \tag{7.5.4} $$

The number of clusters per lattice site, irrespective of their size, is

$$ N_c = \sum_{s} n_s . \tag{7.5.5} $$

²⁷ C. M. Fortuin and P. W. Kasteleyn, Physica 57, 536 (1972).
²⁸ The s-state Potts model is defined as a generalization of the Ising model, which corresponds to the 2-state Potts model: at each lattice site there are $s$ states $Z$. The energy contribution of a pair is $-J\,\delta_{Z,Z'}$, i.e. $-J$ if both lattice sites are in the same state, and otherwise zero.


The average size (and also the average mass) of all finite clusters is

$$ S = \sum_{s=1}^{\infty} s\, \frac{s\, n_s}{\sum_{s=1}^{\infty} s\, n_s} = \frac{1}{p} \sum_{s=1}^{\infty} s^2 n_s . \tag{7.5.6} $$

The following relation holds between the quantity $P_\infty$ defined before (7.5.1) and $n_s$: we consider an arbitrary lattice site. It is either empty, or occupied and belonging to a cluster of finite size, or occupied and belonging to the infinite cluster; that is, $1 = 1 - p + \sum_{s=1}^{\infty} s\, n_s + p\, P_\infty$, and therefore

$$ P_\infty = 1 - \frac{1}{p} \sum_{s} s\, n_s . \tag{7.5.7} $$
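The definitions (7.5.4) to (7.5.7) translate directly into a short calculation on a list of cluster sizes. The sketch below is illustrative only: the function name `cluster_statistics`, its arguments, and the treatment of a spanning cluster on a finite lattice as a stand-in for the infinite cluster are assumptions made here, not part of the text. Exact rational arithmetic is used so that the identities can be checked without rounding.

```python
from collections import Counter
from fractions import Fraction


def cluster_statistics(finite_cluster_sizes, n_sites, spanning_cluster_size=0):
    """Evaluate n_s, N_c, S and P_inf from the sizes of the finite clusters.

    spanning_cluster_size is the number of sites in the spanning cluster, used
    here (as an assumption) in place of the infinite cluster; 0 means none.
    """
    counts = Counter(finite_cluster_sizes)
    # Normalized cluster numbers n_s: clusters of size s per lattice site.
    n_s = {s: Fraction(count, n_sites) for s, count in counts.items()}
    finite_fraction = sum(s * n for s, n in n_s.items())        # sum_s s n_s
    if finite_fraction == 0:
        raise ValueError("at least one finite cluster is required")
    p = finite_fraction + Fraction(spanning_cluster_size, n_sites)
    return {
        "n_s": n_s,
        "p": p,                                                  # occupied fraction
        "N_c": sum(n_s.values()),                                # (7.5.5)
        "S": sum(s * s * n for s, n in n_s.items()) / finite_fraction,  # (7.5.6)
        "P_inf": 1 - finite_fraction / p,                        # (7.5.7)
    }


if __name__ == "__main__":
    stats = cluster_statistics([1, 1, 2, 3], n_sites=20)
    print(stats["p"], stats["N_c"], stats["S"], stats["P_inf"])
```

Without a spanning cluster the occupied fraction equals $\sum_s s\,n_s$, as in (7.5.4), and $P_\infty$ comes out as zero.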

7.5.3 Percolation in One Dimension

We consider a one-dimensional chain in which every lattice site is occupied with the probability $p$. Since a single unoccupied site will interrupt the connection to the other end, i.e. an infinite cluster can be present only when all sites are occupied, we have $p_c = 1$. In this model we can thus study only the phase $p < p_c$.

We can immediately compute the normalized number of clusters $n_s$ for this model. The probability that an arbitrarily chosen site belongs to a cluster of size $s$ has the value $s\, p^s (1 - p)^2$, since a series of $s$ sites must be occupied (factor $p^s$) and the sites at the left and right boundaries must be unoccupied (factor $(1 - p)^2$). Since the chosen site could be at any of the $s$ locations within the cluster, the factor $s$ occurs. From this and from the general considerations at the beginning of Sect. 7.5.2, it follows that:

$$ n_s = p^s (1 - p)^2 . \tag{7.5.8} $$
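As a quick numerical check of (7.5.8), one can sum the series in (7.5.4) and (7.5.6) directly. This is a minimal sketch with hypothetical function names; the cutoff `s_max` is an assumption that is harmless as long as $p^{s_{\max}}$ is negligible. The sums should reproduce $\sum_s s\,n_s = p$ and the closed form $(1+p)/(1-p)$ for the average cluster size.

```python
import math


def cluster_number_1d(s, p):
    """Normalized cluster number of the one-dimensional chain, Eq. (7.5.8)."""
    return p**s * (1.0 - p) ** 2


def one_dimensional_sums(p, s_max=10_000):
    """Return (sum_s s n_s, S) with the series truncated at s_max (an assumption)."""
    first_moment = sum(s * cluster_number_1d(s, p) for s in range(1, s_max + 1))
    second_moment = sum(s * s * cluster_number_1d(s, p) for s in range(1, s_max + 1))
    return first_moment, second_moment / first_moment


if __name__ == "__main__":
    for p in (0.5, 0.9, 0.99):
        occupied, mean_size = one_dimensional_sums(p)
        print(p, occupied, mean_size, (1 + p) / (1 - p))
```

Closer to $p_c = 1$ the terms decay more slowly, so the cutoff has to grow roughly like $1/(1-p)$ times a modest factor to keep the truncation error small.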

With this expression and starting from (7.5.6), we can calculate the average cluster size:

$$ S = \frac{1}{p} \sum_{s=1}^{\infty} s^2 n_s = \frac{1}{p} \sum_{s=1}^{\infty} s^2 p^s (1 - p)^2 = \frac{(1 - p)^2}{p} \left( p \frac{d}{dp} \right)^{2} \sum_{s=1}^{\infty} p^s = \frac{(1 - p)^2}{p} \left( p \frac{d}{dp} \right)^{2} \frac{p}{1 - p} = \frac{1 + p}{1 - p} \quad \text{for } p < p_c . \tag{7.5.9} $$

The average cluster size diverges on approaching the percolation threshold $p_c = 1$ as $1/(1 - p)$, i.e. in one dimension, the exponent introduced in (7.5.3) is $\gamma = 1$.

We now define the radial correlation function $g(r)$. Let the zero point be an occupied site; then $g(r)$ gives the average number of occupied sites at a distance $r$ which belong to the same cluster as the zero point. This is also equal to the probability that a particular site at the distance $r$ is occupied and belongs to the same cluster, multiplied by the number of sites at a distance $r$. Clearly, $g(0) = 1$. For a point to belong to the cluster requires that this point itself and all points lying between 0 and $r$ be occupied; that is, the probability that the point $r$ is occupied and belongs to the same cluster as 0 is $p^r$, and therefore we find

$$ g(r) = 2\, p^r \quad \text{for } r \ge 1 . \tag{7.5.10} $$

The factor of 2 is required because in a one-dimensional lattice there are two points at a distance $r$. The correlation length is defined by

$$ \xi^2 = \frac{\sum_{r=1}^{\infty} r^2 g(r)}{\sum_{r=1}^{\infty} g(r)} = \frac{\sum_{r=1}^{\infty} r^2 p^r}{\sum_{r=1}^{\infty} p^r} . \tag{7.5.11} $$

Analogously to the calculation in Eq. (7.5.9), one obtains

$$ \xi^2 = \frac{1 + p}{(1 - p)^2} = \frac{1 + p}{(p - p_c)^2} , \tag{7.5.11$'$} $$

i.e. here, the critical exponent of the correlation length is $\nu = 1$. We can also write $g(r)$ in the form

$$ g(r) = 2\, e^{r \log p} = 2\, e^{-\sqrt{2}\, r/\xi} , \tag{7.5.10$'$} $$

where after the last equals sign, we have taken $p \approx p_c$, so that $\log p = \log(1 - (1 - p)) \approx -(1 - p)$. The correlation length characterizes the (exponential) decay of the correlation function. The average cluster size previously introduced can also be represented in terms of the radial correlation function:

$$ S = 1 + \sum_{r=1}^{\infty} g(r) . \tag{7.5.12} $$

We recall the analogous relation between the static susceptibility and the correlation function, which was derived in the chapter on ferromagnetism, Eq. (6.5.42). One can readily convince oneself that (7.5.12) together with (7.5.10) again leads to (7.5.9).

7.5.4 The Bethe Lattice (Cayley Tree)

A further exactly solvable model, which has the advantage over the one-dimensional model that it is defined also in the phase region $p > p_c$, is percolation on a Bethe lattice. The Bethe lattice is constructed as follows: from the lattice site at the origin, $z$ (coordination number) branches spread out, at whose ends again lattice sites are located, from each of which again $z - 1$ new branches emerge, etc. (see Fig. 7.25 for $z = 3$).


Fig. 7.25. A Bethe lattice with the coordination number z = 3

The first shell of lattice sites contains $z$ sites, the second shell contains $z(z - 1)$ sites, and the $l$th shell contains $z(z - 1)^{l-1}$ sites. The number of lattice sites increases exponentially with the distance from the center point, $\sim e^{l \log(z-1)}$, while in a d-dimensional Euclidean lattice, this number increases as $l^{d-1}$. This suggests that the critical exponents of the Bethe lattice would be the same as those of a usual Euclidean lattice for $d \to \infty$. Another particular difference between the Bethe lattice and Euclidean lattices is the property that it contains only branches but no closed loops. This is the reason for its exact solvability.

To start with, we calculate the radial correlation function $g(l)$, which as before is defined as the average number of occupied lattice sites within the same cluster at a distance $l$ from an arbitrary occupied lattice site. The probability that a particular lattice site at the distance $l$ is occupied, as well as all those between it and the origin, has the value $p^l$. The number of all the sites in the shell $l$ is $z(z - 1)^{l-1}$; from this it follows that:

$$ g(l) = z (z - 1)^{l-1} p^l = \frac{z}{z - 1} \bigl( p (z - 1) \bigr)^{l} = \frac{z}{z - 1}\, e^{l \log(p(z-1))} . \tag{7.5.13} $$

From the behavior of the correlation function for large $l$, one can read off the percolation threshold for the Bethe lattice. For $p(z - 1) < 1$, there is an exponential decrease, and for $p(z - 1) > 1$, $g(l)$ diverges for $l \to \infty$ and there is an infinite cluster, which must not be included in calculating the correlation function of the finite clusters. It follows from (7.5.13) for $p_c$ that

$$ p_c = \frac{1}{z - 1} . \tag{7.5.14} $$

For $z = 2$, the Bethe lattice becomes a one-dimensional chain, and thus $p_c = 1$. From (7.5.13) it is evident that the correlation length is

$$ \xi \propto \frac{-1}{\log[p(z - 1)]} = \frac{-1}{\log (p/p_c)} \sim \frac{1}{p_c - p} \tag{7.5.15} $$

for $p$ in the vicinity of $p_c$, i.e. $\nu = 1$, as in one dimension.²⁹ The same result is found if one defines $\xi$ by means of (7.5.11). For the average cluster size one finds

$$ S = 1 + \sum_{l=1}^{\infty} g(l) = \frac{p_c (1 + p)}{p_c - p} \quad \text{for } p < p_c ; \tag{7.5.16} $$

i.e. $\gamma = 1$. The strength of the infinite cluster $P_\infty$, i.e. the probability that an arbitrary occupied lattice site belongs to the infinite cluster, can be calculated in the following manner: the product $p P_\infty$ is the probability that the origin or some other point is occupied and that a connection between occupied sites up to infinity exists. We first compute the probability $Q$ that an arbitrary site is not connected to infinity via a particular branch originating from it. This is equal to the probability that the site at the end of the branch is not occupied, that is $(1 - p)$, plus the probability that this site is occupied but that none of the $z - 1$ branches which lead out from it connects to $\infty$, i.e.

$$ Q = 1 - p + p\, Q^{z-1} . $$

This is a determining equation for $Q$, which we shall solve for simplicity for a coordination number $z = 3$. The two solutions of the quadratic equation are $Q = 1$ and $Q = \frac{1-p}{p}$. The probability that the origin is occupied, but that no path leads to infinity, is on the one hand $p(1 - P_\infty)$ and on the other $p\, Q^z$, i.e. for $z = 3$:

$$ P_\infty = 1 - Q^3 . $$

For the first solution, $Q = 1$, we obtain $P_\infty = 0$, obviously relevant for $p < p_c$; and for the second solution

$$ P_\infty = 1 - \left( \frac{1 - p}{p} \right)^{3} , \tag{7.5.17} $$

for $p > p_c$. In the vicinity of $p_c = \frac{1}{2}$, the strength of the infinite clusters varies as

$$ P_\infty \propto (p - p_c) , \tag{7.5.18} $$

that is, $\beta = 1$. We will also obtain this result with Eq. (7.5.30) in a different manner.

²⁹ Earlier, it was speculated that hypercubic lattices of high spatial dimension have the same critical exponents as the Bethe lattice. The visible difference in $\nu$ seen in Table 7.5 is due to the fact that in the Bethe lattice, the topological (chemical) and in the hypercubic lattice the Euclidean distance was used. If one uses the chemical distance for the hypercubic lattice also, above $d = 6$, $\nu = 1$ is likewise obtained. See Literature: A. Bunde and S. Havlin, p. 71.
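For coordination numbers other than $z = 3$, the determining equation $Q = 1 - p + p\,Q^{z-1}$ is no longer quadratic, but it can be solved by simple iteration. The following sketch is an illustration under stated assumptions, not the method of the text: the function names, the tolerance, and the iteration cap are choices made here. Starting from $Q = 0$, the iterates increase monotonically towards the smallest non-negative solution, which is the physically relevant one ($Q = 1$ below the threshold, $Q < 1$ above it).

```python
import math


def not_connected_probability(p, z, tol=1e-12, max_iter=1_000_000):
    """Smallest solution in [0, 1] of Q = 1 - p + p * Q**(z - 1), by fixed-point iteration."""
    q = 0.0
    for _ in range(max_iter):
        q_next = 1.0 - p + p * q ** (z - 1)
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q  # not converged within max_iter (expected only very close to p_c)


def infinite_cluster_strength(p, z):
    """P_inf = 1 - Q**z for a Bethe lattice with coordination number z."""
    q = not_connected_probability(p, z)
    return 1.0 - q**z


if __name__ == "__main__":
    z = 3
    for p in (0.3, 0.55, 0.75, 0.9):
        exact = max(0.0, 1.0 - ((1.0 - p) / p) ** 3)  # Eq. (7.5.17) above p_c = 1/2
        print(p, infinite_cluster_strength(p, z), exact)
```

For $z = 3$ the output can be compared with (7.5.17); convergence becomes slow as $p \to p_c$, since the slope of the iteration map at the fixed point approaches 1 there.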


Now we will investigate the normalized cluster number $n_s$, which is also equal to the probability that a particular site belongs to a cluster of size $s$, divided by $s$. In one dimension, $n_s$ could readily be determined. In general, the probability for a cluster with $s$ sites and $t$ (empty) boundary points is $p^s (1 - p)^t$. The perimeter $t$ includes external and internal boundary points of the cluster. For general lattices, such as e.g. the square lattice, there are various values of $t$ belonging to one and the same value of $s$, depending on the shape of the cluster; the more stretched out the cluster, the larger is $t$, and the more nearly spherical the cluster, the smaller is $t$. In a square lattice, there are two clusters having the size 3, a linear and a bent cluster. The associated values of $t$ are 8 and 7, and the numbers of orientations on the lattice are 2 and 4. For general lattices, the quantity $g_{st}$ must therefore be introduced; it gives the number of clusters of size $s$ and boundary $t$. Then the general expression for $n_s$ is

$$ n_s = \sum_{t} g_{st}\, p^s (1 - p)^t . \tag{7.5.19} $$

For arbitrary lattices, a determination of $g_{st}$ is in general not possible. For the Bethe lattice, there is however a unique connection between the size $s$ of the cluster and the number of its boundary points $t$. A cluster of size 1 has $t = z$, and a cluster of $s = 2$ has $t = 2z - 2$. In general, a cluster of size $s$ has $z - 2$ more boundary points than a cluster of size $s - 1$, i.e.

$$ t(s) = z + (s - 1)(z - 2) = 2 + s(z - 2) . $$

Thus, for the Bethe lattice,

$$ n_s = g_s\, p^s (1 - p)^{2 + (z-2)s} , \tag{7.5.20} $$

where $g_s$ is the number of configurations of clusters of the size $s$. In order to avoid the calculation of $g_s$, we will refer $n_s(p)$ to the distribution $n_s(p_c)$ at $p_c$. We now wish to investigate the behavior of $n_s$ in the vicinity of $p_c = (z - 1)^{-1}$ as a function of the cluster size, and separate off the distribution at $p_c$,

$$ n_s(p) = n_s(p_c) \left[ \frac{1 - p}{1 - p_c} \right]^{2} \left[ \frac{p\,(1 - p)^{z-2}}{p_c (1 - p_c)^{z-2}} \right]^{s} ; \tag{7.5.21} $$

we then expand around $p = p_c$:

$$ n_s(p) = n_s(p_c) \left[ \frac{1 - p}{1 - p_c} \right]^{2} \left[ 1 - \frac{(p - p_c)^2}{2 p_c^2 (1 - p_c)} + O\bigl( (p - p_c)^3 \bigr) \right]^{s} \approx n_s(p_c)\, e^{-c s} , \tag{7.5.22} $$

with $c = -\log\left( 1 - \frac{(p - p_c)^2}{2 p_c^2 (1 - p_c)} \right) \propto (p - p_c)^2$; the prefactor $[(1-p)/(1-p_c)]^2$ tends to 1 for $p \to p_c$. This means that the number of clusters of size $s$ decreases exponentially. The second factor in (7.5.22) depends only on the combination $(p - p_c)^{1/\sigma} s$, with $\sigma = 1/2$. The exponent $\sigma$ determines how rapidly the number of clusters decreases with increasing size $s$. At $p_c$, the s-dependence of $n_s$ arises only from the prefactor $n_s(p_c)$. In analogy to critical points, we assume that $n_s(p_c)$ is a pure power law: if $\xi$ is the only length scale, and it is infinite at $p_c$, then at $p_c$ there can be no characteristic lengths, cluster sizes, etc. That is, $n_s(p_c)$ can have only the form

$$ n_s(p_c) \sim s^{-\tau} . \tag{7.5.23} $$

The complete function (7.5.22) is then of the form

$$ n_s(p) = s^{-\tau} f\bigl( (p - p_c)^{1/\sigma} s \bigr) , \tag{7.5.24} $$

and it is a homogeneous function of $s$ and $(p - p_c)$. We can relate the exponent $\tau$ to already known exponents: the average cluster size is, from Eq. (7.5.6),

$$ S = \frac{1}{p} \sum_{s} s^2 n_s(p) \propto \sum_{s} s^{2-\tau} e^{-c s} \propto \int_{1}^{\infty} ds\, s^{2-\tau} e^{-c s} = c^{\tau - 3} \int_{c}^{\infty} z^{2-\tau} e^{-z}\, dz . \tag{7.5.25} $$

For $\tau < 3$, the integral exists, even when its lower limit goes to zero; it is then

$$ S \sim c^{\tau - 3} = (p - p_c)^{\frac{\tau - 3}{\sigma}} , \tag{7.5.26} $$

from which, according to (7.5.3), it follows that

$$ \gamma = \frac{3 - \tau}{\sigma} . \tag{7.5.27} $$

Since for the Bethe lattice $\gamma = 1$ and $\sigma = \frac{1}{2}$, we find $\tau = \frac{5}{2}$. From (7.5.24), using the general relation (7.5.7), one can also determine $P_\infty$. While the factor $s^2$ in (7.5.25) was sufficient to make the integral converge at its lower limit, this is not the case in (7.5.7). Therefore, we first write (7.5.7) in the form

$$ P_\infty = 1 - \frac{1}{p} \sum_{s} s \bigl[ n_s(p) - n_s(p_c) \bigr] - \frac{1}{p} \sum_{s} s\, n_s(p_c) = \frac{1}{p} \sum_{s} s \bigl[ n_s(p_c) - n_s(p) \bigr] + 1 - \frac{p_c}{p} , \tag{7.5.28} $$

where

$$ P_\infty(p_c) = 0 = 1 - \frac{1}{p_c} \sum_{s} s\, n_s(p_c) $$

has been used. Now the first term in (7.5.28) can be replaced by an integral:

$$ P_\infty = \text{const.}\; c^{\tau - 2} \int_{c}^{\infty} z^{1-\tau} \bigl[ 1 - e^{-z} \bigr]\, dz + \frac{p - p_c}{p} = \ldots\, c^{\tau - 2} + \frac{p - p_c}{p} . \tag{7.5.29} $$

From this, we find for the exponent defined in Eq. (7.5.1)

$$ \beta = \frac{\tau - 2}{\sigma} . \tag{7.5.30} $$

For the Bethe lattice, one finds once again $\beta = 1$, in agreement with (7.5.18). In the Bethe lattice, the first term in (7.5.29) also has the form $p - p_c$, while in other lattices, the first term, $(p - p_c)^{\beta}$, predominates relative to the second due to $\beta < 1$.

In (7.5.5), we also introduced the average number of clusters per lattice site, whose critical percolation behavior is characterized by an exponent $\alpha$ via

$$ N_c \equiv \sum_{s} n_s \sim |p - p_c|^{2 - \alpha} . \tag{7.5.31} $$

That is, this quantity plays an analogous role to that of the free energy in thermal phase transitions. We note that in the case of percolation there are no interactions, and the free energy is determined merely by the entropy. Again inserting (7.5.24) for the cluster number into (7.5.31), we find

$$ 2 - \alpha = \frac{\tau - 1}{\sigma} , \tag{7.5.32} $$

which leads to $\alpha = -1$ for the Bethe lattice. In summary, the critical exponents for the Bethe lattice are

$$ \beta = 1 , \quad \gamma = 1 , \quad \alpha = -1 , \quad \nu = 1 , \quad \tau = 5/2 , \quad \sigma = 1/2 . \tag{7.5.33} $$

7.5.5 General Scaling Theory

In the preceding section, the exponents for the Bethe lattice (Cayley tree) were calculated. In the process, we made some use of a scaling assumption (7.5.24). We will now generalize that assumption and derive the consequences which follow from it. We start with the general scaling hypothesis

$$ n_s(p) = s^{-\tau} f_{\pm}\bigl( |p - p_c|^{1/\sigma} s \bigr) , \tag{7.5.34} $$

where $\pm$ refers to $p \gtrless p_c$.³⁰ The relations (7.5.27), (7.5.30), and (7.5.32), which contain only the exponents $\alpha$, $\beta$, $\gamma$, $\sigma$, $\tau$, also hold for the general scaling hypothesis. The scaling relation for the correlation length and other characteristics of the extension of the finite clusters must be derived once more. The correlation length is the root mean square distance between all the occupied sites within the same finite cluster. For a cluster with $s$ occupied sites, the mean square distance between all pairs is

$$ R_s^2 = \frac{1}{s^2} \sum_{i=1}^{s} \sum_{j=1}^{i} (x_i - x_j)^2 . $$

The correlation length $\xi$ is obtained by averaging over all clusters:

$$ \xi^2 = \frac{\sum_{s=1}^{\infty} R_s^2\, s^2 n_s}{\sum_{s=1}^{\infty} s^2 n_s} . \tag{7.5.35} $$

The quantity $\frac{1}{2} s^2 n_s$ is equal to the number of pairs in clusters of size $s$, i.e. proportional to the probability that a pair (in the same cluster) belongs to a cluster of the size $s$. The mean square cluster radius is given by

$$ R^2 = \frac{\sum_{s=1}^{\infty} R_s^2\, s\, n_s}{\sum_{s=1}^{\infty} s\, n_s} , \tag{7.5.36} $$

since $s\, n_s$ is the probability that an occupied site belongs to an s-cluster. The root mean square distance increases with cluster size according to

$$ R_s \sim s^{1/d_f} , \tag{7.5.37} $$

where $d_f$ is the fractal dimension. Then it follows from (7.5.35) that

$$ \xi^2 \sim \frac{\sum_{s=1}^{\infty} s^{\frac{2}{d_f} + 2 - \tau} f_{\pm}\bigl( |p - p_c|^{1/\sigma} s \bigr)}{\sum_{s=1}^{\infty} s^{2 - \tau} f_{\pm}\bigl( |p - p_c|^{1/\sigma} s \bigr)} \sim |p - p_c|^{-\frac{2}{d_f \sigma}} , \qquad 2 < \tau < 2.5 , $$

$$ \nu = \frac{1}{d_f\, \sigma} = \frac{\tau - 1}{d\, \sigma} , $$

and from (7.5.36),

$$ R^2 \sim \sum_{s=1}^{\infty} s^{\frac{2}{d_f} + 1 - \tau} f_{\pm}\bigl( |p - p_c|^{1/\sigma} s \bigr) \sim |p - p_c|^{-2\nu + \beta} . $$

³⁰ At the percolation threshold $p = p_c$, the distribution of clusters is a power law $n_s(p_c) = s^{-\tau} f_{\pm}(0)$. The cutoff function $f_{\pm}(x)$ goes to zero for $x \gg 1$, for example exponentially as in (7.5.22). The quantity $s_{\max} = |p - p_c|^{-1/\sigma}$ refers to the largest cluster. Clusters of size $s \ll s_{\max}$ are also distributed according to $s^{-\tau}$ for $p \ne p_c$, and for $s \gg s_{\max}$, $n_s(p)$ vanishes.
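The scaling relations (7.5.27), (7.5.30), (7.5.32) and the expressions for $\nu$ and $d_f$ derived above can be collected in a few lines, which makes it easy to check a given pair $(\tau, \sigma)$ against tabulated exponents. This is a minimal sketch; the function name and the dictionary keys are illustrative choices. The two-dimensional values $\tau = 187/91$ and $\sigma = 36/91$ used in the example are commonly quoted values that are not stated in the text and are therefore an assumption here; they reproduce the square-lattice entries of Table 7.5.

```python
from fractions import Fraction


def exponents_from_scaling(tau, sigma, d=None):
    """Critical exponents implied by the scaling form n_s = s**(-tau) * f(|p - p_c|**(1/sigma) * s)."""
    tau, sigma = Fraction(tau), Fraction(sigma)
    result = {
        "beta": (tau - 2) / sigma,        # Eq. (7.5.30)
        "gamma": (3 - tau) / sigma,       # Eq. (7.5.27)
        "alpha": 2 - (tau - 1) / sigma,   # Eq. (7.5.32)
    }
    if d is not None:
        nu = (tau - 1) / (d * sigma)      # nu = (tau - 1) / (d sigma)
        result["nu"] = nu
        result["d_f"] = 1 / (nu * sigma)  # from nu = 1 / (d_f sigma)
    return result


if __name__ == "__main__":
    # Bethe lattice, Eq. (7.5.33): beta = 1, gamma = 1, alpha = -1.
    print(exponents_from_scaling(Fraction(5, 2), Fraction(1, 2)))
    # Two dimensions (tau and sigma are assumed values, see the lead-in).
    print(exponents_from_scaling(Fraction(187, 91), Fraction(36, 91), d=2))
```

The relation for $\nu$ involves the spatial dimension $d$, so it is only evaluated when `d` is supplied.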


7.5.5.1 The Duality Transformation and the Percolation Threshold

The computation of $p_c$ for bond percolation on a square lattice can be carried out by making use of a duality transformation. The definition of the dual lattice is illustrated in Fig. 7.26. The lattice points of the dual lattice are defined by the centers of the unit cells of the lattice. A bond in the dual lattice is placed wherever it does not cross a bond of the lattice; i.e. the probability for a bond in the dual lattice is

$$ q = 1 - p . $$

In the dual lattice, there is likewise a bond percolation problem. For $p < p_c$, there is no infinite cluster on the lattice; however, there is an infinite cluster on the dual lattice. There is a path from one end of the dual lattice to the other which cuts no bonds on the lattice; thus $q > p_c$. For $p \to p_c^{-}$ from below, $q \to p_c^{+}$ arrives at the percolation threshold from above, i.e.

$$ p_c = 1 - p_c . $$

Thus, $p_c = \frac{1}{2}$. This result is exact for bond percolation.

Fig. 7.26. A lattice and its dual lattice. Left side: A lattice with bonds and the dual lattice. Right side: Showing also the bonds in the dual lattice

Remarks:

(i) By means of similar considerations, one finds also that the percolation threshold for site percolation on a triangular lattice is given by $p_c = \frac{1}{2}$.

(ii) For the two-dimensional Ising model, also, the transition temperatures for a series of lattice structures were already known from duality transformations before its exact solution had been achieved.

7.5.6 Real-Space Renormalization Group Theory

We now discuss a real-space renormalization-group transformation, which allows the approximate determination of $p_c$ and the critical exponents. In the decimation transformation shown in Fig. 7.27 for a square lattice, every other lattice site is eliminated; this leads again to a square lattice. In the new lattice, a bond is placed between two remaining sites if at least one connection via two bonds existed on the original lattice (see Fig. 7.27). The bond configurations which lead to formation of a bond (shown as dashed lines) in the decimated lattice are indicated in Fig. 7.28; the probability of each configuration is given beneath it in the figure.

Fig. 7.27. A lattice and a decimated lattice

Fig. 7.28. Bond configurations which lead to a bond (dashed) on the decimated lattice

From the rules shown in Fig. 7.28, we find for the probability $p'$ for the existence of a bond on the decimated lattice

$$ p' = p^4 + 4 p^3 (1 - p) + 2 p^2 (1 - p)^2 = 2 p^2 - p^4 . \tag{7.5.38} $$

From this transformation law,³¹ one obtains the fixed-point equation $p^* = 2 p^{*2} - p^{*4}$. It has the solutions $p^* = 0$ and $p^* = 1$, which correspond to the high- and low-temperature fixed points for phase transitions; and in addition, the two fixed points $p^* = \frac{-1 \pm \sqrt{5}}{2}$, of which only $p^* = \frac{\sqrt{5} - 1}{2} = 0.618\ldots$ is acceptable. This value of the percolation threshold differs from the exact value found in the preceding section, $\frac{1}{2}$. The reasons for this are: (i) sites which were connected on the original lattice may not be connected on the decimated lattice; (ii) different bonds on the decimated lattice are no longer uncorrelated, since the existence of a bond on the original lattice can be responsible for the occurrence of several bonds on the decimated lattice. The linearization of the recursion relation around the fixed point yields $\nu = 0.817$ for the exponent of the correlation length.

³¹ A. P. Young and R. B. Stinchcombe, J. Phys. C: Solid State Phys. 8, L 535 (1975).

The treatment of site percolation on a triangular lattice in two dimensions is the simplest. The lattice points of a triangle are combined into a cell. This cell is counted as occupied if all three sites are occupied, or if two sites are occupied and one is empty, since in both cases there is a path through the cell. For all other configurations (only one site occupied or none occupied), the cell is unoccupied. For the triangular lattice,³² one thus obtains as the recursion relation

$$ p' = p^3 + 3 p^2 (1 - p) , \tag{7.5.39} $$

with the fixed points $p^* = 0,\ 1,\ \frac{1}{2}$. This RG transformation thus yields $p_c = \frac{1}{2}$ for the percolation threshold, which is identical with the exact value (see remark (i) above). The linearization of the RG transformation around the fixed point yields the following result for the exponent $\nu$ of the correlation length:

$$ \nu = \frac{\log \sqrt{3}}{\log \frac{3}{2}} = 1.3547 . $$

This is nearer to the result obtained by series expansion, $\nu = 1.34$, as well as to the exact result, 4/3, than the result for the square lattice (see the remark on universality following Eq. (7.5.3)).
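Both recursion relations can be examined numerically: the nontrivial fixed point is located by bisection, and the linearization around it gives $\nu = \log b / \log \lambda$ with $\lambda = dp'/dp$ at $p^*$. The sketch below uses hypothetical function names and a numerical derivative; the length rescaling factors are $b = \sqrt{3}$ for the triangular cells, as in the formula above, and $b = \sqrt{2}$ for the square-lattice decimation, which is an assumption made here (it is consistent with the quoted value $\nu = 0.817$).

```python
import math


def bond_decimation(p):
    """Recursion relation (7.5.38) for bond percolation on the decimated square lattice."""
    return 2.0 * p**2 - p**4


def triangle_cell(p):
    """Recursion relation (7.5.39) for site percolation on the triangular lattice."""
    return p**3 + 3.0 * p**2 * (1.0 - p)


def nontrivial_fixed_point(rg_map, lo=0.05, hi=0.95, tol=1e-14):
    """Bisection on rg_map(p) - p: below p* the map flows to 0, above p* it flows to 1."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rg_map(mid) < mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def correlation_exponent(rg_map, b, h=1e-6):
    """Return (p*, nu) with nu = log(b) / log(lambda), lambda = d rg_map / dp at p*."""
    p_star = nontrivial_fixed_point(rg_map)
    eigenvalue = (rg_map(p_star + h) - rg_map(p_star - h)) / (2.0 * h)  # central difference
    return p_star, math.log(b) / math.log(eigenvalue)


if __name__ == "__main__":
    print("square, bonds:    ", correlation_exponent(bond_decimation, math.sqrt(2.0)))
    print("triangular, sites:", correlation_exponent(triangle_cell, math.sqrt(3.0)))
```

For these two polynomials the derivative could of course be written down in closed form; the numerical derivative only keeps the helper reusable for other recursion relations.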

³² P. J. Reynolds, W. Klein, and H. E. Stanley, J. Phys. C: Solid State Phys. 10, L167 (1977).

7.5.6.1 Definition of the Fractal Dimension

In a fractal object, the mass behaves as a function of the length $L$ of a d-dimensional Euclidean section as

$$ M(L) \sim L^{d_f} , $$

and thus the density is

$$ \rho(L) = \frac{M(L)}{L^d} \sim L^{d_f - d} . $$

An alternative definition of $d_f$ is obtained from the number of hypercubes $N(L_m, \delta)$ which one requires to cover the fractal structure. We take the side length of the hypercubes to be $\delta$, and the hypercube which contains the whole cluster to have the side length $L_m$:

$$ N(L_m, \delta) = \left( \frac{L_m}{\delta} \right)^{d_f} , $$

i.e.

$$ d_f = - \lim_{\delta \to 0} \frac{\log N(L_m, \delta)}{\log \delta} . $$

Literature

D. J. Amit, Field Theory, the Renormalization Group, and Critical Phenomena, 2nd ed., World Scientific, Singapore 1984
P. Bak, C. Tang, and K. Wiesenfeld, Phys. Rev. Lett. 59, 381 (1987)
K. Binder, Rep. Progr. Phys. 60, 487 (1997)

∗

7.5 Percolation

403

J. J. Binney, N. J. Dowrick, A. J. Fisher, and M. E. J. Newman, The Theory of Critical Phenomena, 2nd ed., Oxford University Press, New York 1993 M. J. Buckingham and W. M. Fairbank, in: C. J. Gorter (Ed.), Progress in Low Temperature Physics, Vol. III, 80–112, North Holland Publishing Company, Amsterdam 1961 A. Bunde and S. Havlin, in: A. Bunde, S. Havlin (Eds.), Fractals and Disordered Systems, 51, Springer, Berlin 1991 Critical Phenomena, Lecture Notes in Physics 54, Ed. J. Brey and R. B. Jones, Springer, Sitges, Barcelona 1976 M. C. Cross and P. C. Hohenberg (1994), Rev. Mod. Phys. 65, 851–1112 P. G. De Gennes, Scaling Concepts in Polymer Physics, Cornell University Press, Ithaca, NY 1979 C. Domb and M. S. Green, Phase Transitions and Critical Phenomena, Academic Press, London 1972-1976 C. Domb and J. L. Lebowitz (Eds.), Phase Transitions and Critical Phenomena, Vols. 7–15, Academic Press, London 1983–1992 B. Drossel and F. Schwabl, Phys. Rev. Lett. 69, 1629 (1992) J. W. Essam, Rep. Prog. Phys. 43, 843 (1980) R. A. Ferrell, N. Menyh´ ard, H. Schmidt, F. Schwabl, and P. Sz´epfalusy, Ann. Phys. (New York) 47, 565 (1968) M. E. Fisher, Rep. Prog. Phys. 30, 615–730 (1967) M. E. Fisher, Rev. Mod. Phys. 46, 597 (1974) E. Frey and F. Schwabl, Adv. Phys. 43, 577-683 (1994) B. I. Halperin and P. C. Hohenberg, Phys. Rev. 177, 952 (1969) H. J. Jensen, Self-Organized Criticality, Cambridge University Press, Cambridge 1998 Shang-Keng Ma, Modern Theory of Critical Phenomena, Benjamin, Reading, Mass. 1976 S. Ma, in: C. Domb and M. S. Green (Eds.), Phase Transitions and Critical Phenomena, Vol. 6, 249–292, Academic Press, London 1976 T. Niemeijer and J. M. J. van Leeuwen, in: C. Domb and M. S. Green (Eds.), Phase Transitions and Critical Phenomena, Vol. 6, 425–505, Academic Press, London 1976 G. Parisi, Statistical Field Theory, Addison–Wesley, Redwood 1988 A. Z. Patashinskii and V. L. Prokovskii, Fluctuation theory of Phase Transitions, Pergamon Press, Oxford 1979 P. 
Pfeuty and G. Toulouse, Introduction to the Renormalization Group and to Critical Phenomena, John Wiley, London 1977 C. N. R. Rao and K. J. Rao, Phase Transitions in Solids, McGraw Hill, New York 1978 F. Schwabl and U. C. T¨ auber, Phase Transitions: Renormalization and Scaling, in Encyclopedia of Applied Physics, Vol. 13, 343, VCH (1995) H. E. Stanley, Introduction to Phase Transitions and Critical Phenomena, Clarendon Press, Oxford 1971 D. Stauﬀer and A. Aharony, Introduction to Percolation Theory, Taylor and Francis, London and Philadelphia 1985


7. Phase Transitions, Renormalization Group Theory, and Percolation

J. M. J. van Leeuwen in Fundamental Problems in Statistical Mechanics III, Ed. E. G. D. Cohen, North Holland Publishing Company, Amsterdam 1975 K. G. Wilson and J. Kogut, Phys. Rept. 12C, 76 (1974) K. G. Wilson, Rev. Mod. Phys. 47, 773 (1975) J. Zinn-Justin, Quantum Field Theory and Critical Phenomena, 3rd edition, Clarendon Press, Oxford 1996

Problems for Chapter 7

7.1 A generalized homogeneous function fulfills the relation
$$f(\lambda^{a_1} x_1, \lambda^{a_2} x_2) = \lambda^{a_f} f(x_1, x_2) .$$
Show that (a) the partial derivatives $\frac{\partial^j}{\partial x_1^j}\frac{\partial^k}{\partial x_2^k} f(x_1, x_2)$ and (b) the Fourier transform $g(k_1, x_2) = \int d^d x_1\, e^{i k_1 x_1} f(x_1, x_2)$ of a generalized homogeneous function are likewise homogeneous functions.

7.2 Derive the relations (7.3.13) for $A'$, $K'$, $L'$, and $M'$. Include in the starting model in addition an interaction $L$ between the second-nearest neighbors. Compute the recursion relation to leading order in $K$ and $L$, i.e. up to $K^2$ and $L$. Show that (7.3.15a,b) results.

7.3 What is the value of δ for the two-dimensional decimation transformation from Sect. 7.3.3?

7.4 Show, by Fourier transformation of the susceptibility $\chi(q) = \frac{1}{q^{2-\eta}}\,\hat\chi(q\xi)$, that the correlation function assumes the form
$$G(x) = \frac{1}{|x|^{d-2+\eta}}\,\hat G(|x|/\xi) .$$

7.5 Confirm Eq. (7.4.35).

7.6 Show that
$$m(x) = m_0 \tanh\frac{x - x_0}{2\xi_-}$$
is a solution of the Ginzburg–Landau equation (7.4.11). Calculate the free energy of the domain walls which it describes.

7.7 Tricritical phase transition point. A tricritical phase transition point is described by the following Ginzburg–Landau functional:
$$\mathcal{F}[\phi] = \int d^d x \left\{ c(\nabla\phi)^2 + a\phi^2 + v\phi^6 - h\phi \right\}$$
with
$$a = a'\tau , \qquad \tau = \frac{T - T_c}{T_c} , \qquad v \ge 0 .$$
Determine the uniform stationary solution $\phi_{st}$ with the aid of the variational derivative ($\frac{\delta\mathcal{F}}{\delta\phi} = 0$) for $h = 0$ and the associated tricritical exponents $\alpha_t$, $\beta_t$, $\gamma_t$ and $\delta_t$.


7.8 Consider the extended Ginzburg–Landau functional
$$\mathcal{F}[\phi] = \int d^d x \left\{ c(\nabla\phi)^2 + a\phi^2 + u\phi^4 + v\phi^6 - h\phi \right\} .$$

(a) Determine the critical exponents α, β, γ and δ for $u > 0$ in analogy to problem 7.7. They take on the same values as in the $\phi^4$ model (see Sect. 4.6); the term $\sim \phi^6$ is irrelevant, i.e. it yields only corrections to the scaling behavior of the $\phi^4$ model. Investigate the “crossover” of the tricritical behavior for $h = 0$ at small $u$. Consider the crossover function $\tilde m(x)$, which is defined as follows:
$$\phi_{eq}(u, \tau) = \phi_t(\tau)\cdot\tilde m(x) \quad\text{with}\quad \phi_t(\tau) = \phi_{eq}(u = 0, \tau) \sim \tau^{\beta_t} , \qquad x = \frac{u}{\sqrt{3|a|v}} .$$

(b) Now investigate the case $u < 0$, $h = 0$. Here, a first-order phase transition occurs; at $T_c$, the absolute minimum of $\mathcal{F}$ changes from $\phi = 0$ to $\phi = \phi_0$. Calculate the shift of the transition temperature $T_c - T_0$ and the height of the jump in the order parameter $\phi_0$. Critical exponents can also be defined for the approach to the tricritical point by variation of $u$:
$$\phi_0 \sim |u|^{\beta_u} , \qquad T_c - T_0 \sim |u|^{1/\psi} .$$
Give expressions for $\beta_u$ and the “shift exponent” ψ.

(c) Calculate the second-order phase transition lines for $u < 0$ and $h \neq 0$ by deriving a parameter representation from the conditions
$$\frac{\partial^2 \mathcal{F}}{\partial\phi^2} = 0 = \frac{\partial^3 \mathcal{F}}{\partial\phi^3} .$$

(d) Show that the free energy in the vicinity of the tricritical point obeys a generalized scaling law
$$\mathcal{F}[\phi_{eq}] = |\tau|^{2-\alpha_t}\,\hat f\!\left( \frac{u}{|\tau|^{\phi_t}} , \frac{h}{|\tau|^{\delta_t}} \right)$$
by inserting the crossover function found in (a) into $\mathcal{F}$ ($\phi_t$ is called the “crossover exponent”). Show that the scaling relations
$$\delta = 1 + \frac{\gamma}{\beta} , \qquad \alpha + 2\beta + \gamma = 2$$
are obeyed in (a) and at the tricritical point (problem 7.7).

(e) Discuss the hysteresis behavior for a first-order phase transition ($u < 0$).

7.9 In the Ginzburg–Landau approximation, the spin-spin correlation function is given by
$$\langle m(x) m(x') \rangle = \frac{1}{L^d} \sum_{|k| \le \Lambda} \frac{e^{ik(x-x')}}{2\beta c\,(\xi^{-2} + k^2)} ; \qquad \xi \propto (T - T_c)^{-1/2} .$$

(a) Replace the sum by an integral.
(b) Show that in the limit $\xi \to \infty$, the following relation holds:
$$\langle m(x) m(x') \rangle \propto \frac{1}{|x - x'|^{d-2}} .$$
(c) Show that for $d = 3$ and large ξ,
$$\langle m(x) m(x') \rangle = \frac{1}{8\pi c\beta}\,\frac{e^{-|x-x'|/\xi}}{|x - x'|}$$
holds.


7.10 Investigate the behavior of the following integral in the limit $\xi \to \infty$:
$$I = \int_0^{\Lambda\xi} \frac{d^d q}{(2\pi)^d}\,\frac{\xi^{4-d}}{(1 + q^2)^2} ,$$
by demonstrating that:
(a) $I \propto \xi^{4-d}$, $d < 4$;
(b) $I \propto \ln\xi$, $d = 4$;
(c) $I \propto A - B\xi^{4-d}$, $d > 4$.

7.11 The phase transition of a molecular zipper, from C. Kittel, American Journal of Physics 37, 917 (1969). A greatly simplified model of the helix-coil transition in polypeptides or DNA, which describes the transition between hydrogen-bond stabilized helices and a molecular coil, is the “molecular zipper”. A molecular zipper consists of $N$ bonds which can be broken from only one direction. It requires an energy $\epsilon$ to break bond $p + 1$ if all the bonds $1, \ldots, p$ are broken, but an infinite energy if the preceding bonds are not all broken. A broken bond is taken to have $G$ orientations, i.e. its state is $G$-fold degenerate. The zipper is open when all $N - 1$ bonds are broken.

(a) Determine the partition function
$$Z = \frac{1 - x^N}{1 - x} ; \qquad x \equiv G \exp(-\beta\epsilon) .$$

(b) Determine the average number $\langle s \rangle$ of broken bonds. Investigate $\langle s \rangle$ in the vicinity of $x_c = 1$. Which value does $\langle s \rangle$ assume at $x_c$, and what is the slope there? How does $\langle s \rangle$ behave at $x \gg 1$ and $x \ll 1$?

(c) What would be the partition function if the zipper could be opened from both ends?

7.12 Fluctuations in the Gaussian approximation below $T_c$. Expand the Ginzburg–Landau functional
$$\mathcal{F}[\mathbf{m}] = \int d^d x \left[ a\,\mathbf{m}(x)^2 + \frac{b}{2}\,\mathbf{m}(x)^4 + c\,(\nabla\mathbf{m}(x))^2 - \mathbf{h}\,\mathbf{m}(x) \right] ,$$
which is O(n)-symmetrical for $h = 0$, up to second order in terms of the fluctuations of the order parameter $\mathbf{m}'(x)$. Below $T_c$,
$$\mathbf{m}(x) = m_1 \mathbf{e}_1 + \mathbf{m}'(x) , \qquad h = 2\left( a + b m_1^2 \right) m_1$$
holds.

(a) Show that for $h \to 0$, the long-wavelength ($k \to 0$) transverse fluctuations $m'_i$ ($i = 2, \ldots, n$) require no “excitation energy” (Goldstone modes), and determine the Gibbs free energy. In which cases do singularities occur?

(b) What is the expression for the specific heat $c_{h=0}$ below $T_c$ in the harmonic approximation? Compare it with the result for the disordered phase.

(c) Calculate the longitudinal and transverse correlation functions relative to the spontaneous magnetization $m_1$,
$$G_\parallel(x - x') = \langle m'_1(x)\, m'_1(x') \rangle \quad\text{and}\quad G_{\perp ij}(x - x') = \langle m'_i(x)\, m'_j(x') \rangle , \quad i, j = 2, \ldots, n ,$$
for $d = 3$ from its Fourier transform in the harmonic approximation. Discuss in particular the limiting case $h \to 0$.

7.13 The longitudinal correlation function below $T_c$. The results from problem 7.12 lead us to expect that taking into account the transverse fluctuations just in a harmonic approximation will in general be insufficient. Anharmonic contributions can be incorporated if we fix the length of the vector $\mathbf{m}(x)$ ($h = 0$), as in the underlying Heisenberg model:
$$m_1(x)^2 + \sum_{i=2}^{n} m_i(x)^2 = m_0^2 = \text{const.}$$
Compute the Fourier transform $G_\parallel(k)$, by factorizing the four-spin correlation function in a suitable manner into two-spin correlation functions,
$$G_\parallel(x - x') = \frac{1}{4 m_0^2} \sum_{i,j=2}^{n} \langle m_i(x)^2\, m_j(x')^2 \rangle ,$$
and inserting
$$G_\perp(x - x') = \int \frac{d^d k}{(2\pi)^d}\,\frac{e^{ik(x-x')}}{2\beta c k^2} .$$
Remark: for $n \ge 2$ and $2 < d \le 4$, the relations $G_\perp(k) \propto \frac{1}{k^2}$ and $G_\parallel \propto \frac{1}{k^{4-d}}$ are fulfilled exactly in the limit $k \to 0$.

7.14 Verify the second line in Eq. (7.5.22) . 7.15 The Hubbard–Stratonovich transformation: using the identity j X ﬀ Z exp − Jij Si Sj = const. i,j

∞ −∞

“Y i

ﬀ j ” 1X −1 dmi exp − mi Jij mj , 4 i,j

show that the partition function of the Ising Hamiltonian H = written in the form Z ∞ “Y ” ´¯ ˘ ` Z = const. dmi exp H {mi } . −∞

P i,j

Jij Si Sj can be

i

Give the expansion of H in terms of mi up to the order O(m4 ). Caveat: the Ising Hamiltonian must be extended by terms with Jii so that the matrix Jij is positive deﬁnite.


7.16 Lattice-gas model. The partition function of a classical gas is to be mapped onto that of an Ising magnet. Method: the d-dimensional configuration space is divided up into $N$ cells. In each cell, there is at most one atom (hard core volume). One can imagine a lattice in which a cell is represented by a lattice site which is either empty or occupied ($n_i = 0$ or 1). The attractive interaction $U(x_i - x_j)$ between two atoms is to be taken into account in the energy by the term $\frac{1}{2} U_2(i, j)\, n_i n_j$.

(a) The grand partition function for this problem, after integrating out the kinetic energy, is given by
$$Z_G = \Big( \prod_{i=1}^{N} \sum_{n_i = 0, 1} \Big) \exp\Big[ -\beta \Big( -\bar\mu \sum_i n_i + \frac{1}{2} \sum_{ij} U_2(i, j)\, n_i n_j \Big) \Big] ,$$
$$\bar\mu = kT \log z v_0 = \mu - kT \log\Big( \frac{\lambda^d}{v_0} \Big) , \qquad z = \frac{e^{\beta\mu}}{\lambda^d} , \qquad \lambda = \frac{2\pi\hbar}{\sqrt{2\pi m kT}} ,$$
where $v_0$ is the volume of a cell.

(b) By introducing spin variables $S_i$ ($n_i = \frac{1}{2}(1 + S_i)$, $S_i = \pm 1$), bring the grand partition function into the form
$$Z_G = \Big( \prod_{i=1}^{N} \sum_{S_i = -1, 1} \Big) \exp\Big[ -\beta \Big( E_0 - \sum_i h S_i - \sum_{ij} J_{ij} S_i S_j \Big) \Big] .$$
Calculate the relations between $E_0$, $h$, $J$ and $\mu$, $U_2$, $v_0$.

(c) Determine the vapor-pressure curve of the gas from the phase-boundary curve $h = 0$ of the ferromagnet.

(d) Compute the particle-density correlation function for a lattice gas.

7.17 Demonstrate Eq. (7.4.63) using scaling relations.

7.18 Show that from (7.4.68) in the limit of small $k$ and for $h = 0$, the longitudinal correlation function
$$G_\parallel(k) \propto \frac{1}{k^{d-2}}$$
follows.

7.19 Shift of $T_c$ in the Ginzburg–Landau Theory. Start from Eq. (7.4.1) and use the so-called quasi-harmonic approximation in the paramagnetic phase. There the third (nonlinear) term in (7.4.1) is replaced by $6b\,\langle m(x)^2 \rangle\, m(x)$. (a) Justify this approximation. (b) Compute the transition temperature $T_c$ and show that $T_c < T_c^0$.

7.20 Determine the fixed points of the transformation equation (7.5.38).

8. Brownian Motion, Equations of Motion, and the Fokker–Planck Equations

The chapters which follow deal with nonequilibrium processes. First, in chapter 8, we treat the topic of the Langevin equations and the related Fokker–Planck equations. In the next chapter, the Boltzmann equation is discussed; it is fundamental for dealing with the dynamics of dilute gases and also for transport phenomena in condensed matter. In the final chapter, we take up general problems of irreversibility and the transition to equilibrium.

8.1 Langevin Equations

8.1.1 The Free Langevin Equation

8.1.1.1 Brownian Motion

A variety of situations occur in Nature in which one is not interested in the complete dynamics of a many-body system, but instead only in a subset of particular variables. The remaining variables lead through their equations of motion to relatively rapidly varying stochastic forces and to damping effects. Examples are the Brownian motion of a massive particle in a liquid, the equations of motion of conserved densities, and the dynamics of the order parameter in the vicinity of a critical point.

We begin by discussing the Brownian motion as a basic example of a stochastic process. A heavy particle of mass $m$ and velocity $v$ is supposed to be moving in a liquid consisting of light particles. This “Brownian particle” is subject to random collisions with the molecules of the liquid (Fig. 8.1). These collisions give rise to an average frictional force on the massive particle and to a stochastic force $f(t)$, which fluctuates around its average value as shown in Fig. 8.2. The first contribution, $-m\zeta v$, is characterized by a coefficient of friction ζ. Under these physical conditions, the Newtonian equation of motion thus becomes the so-called Langevin equation:
$$m\dot v = -m\zeta v + f(t) . \tag{8.1.1}$$


Fig. 8.1. The Brownian motion

Fig. 8.2. Stochastic forces in Brownian motion

Such equations are referred to as stochastic equations of motion and the processes they describe as stochastic processes.$^{1}$ The correlation time $\tau_c$ denotes the time during which the fluctuations of the stochastic force remain correlated.$^{2}$ From this, we assume that the average force and its autocorrelation function have the following form at differing times:$^{3}$
$$\langle f(t) \rangle = 0 , \qquad \langle f(t) f(t') \rangle = \phi(t - t') . \tag{8.1.2}$$

Here, $\phi(\tau)$ differs noticeably from zero only for $\tau < \tau_c$ (Fig. 8.3). Since we are interested in the motion of our Brownian particle over times of order $t$ which are considerably longer than $\tau_c$, we can approximate $\phi(\tau)$ by a delta function
$$\phi(\tau) = \lambda\delta(\tau) . \tag{8.1.3}$$

The coeﬃcient λ is a measure of the strength of the mean square deviation of the stochastic force. Since friction also increases proportionally to the strength of the collisions, there must be a connection between λ and the coeﬃcient of friction ζ. In order to ﬁnd this connection, we ﬁrst solve the Langevin equation (8.1.1).

$^{1}$ Due to the stochastic force in Eq. (8.1.1), the velocity is also a stochastic quantity, i.e. a random variable.

$^{2}$ Under the precondition that the collisions of the liquid molecules with the Brownian particle are completely uncorrelated, the correlation time is roughly equal to the duration of a collision. For this time, we obtain $\tau_c \approx a/\bar v = \frac{10^{-6}\,\text{cm}}{10^{5}\,\text{cm/sec}} = 10^{-11}$ sec, where $a$ is the radius of the massive particle and $\bar v$ the average velocity of the molecules of the medium.

$^{3}$ The mean value can be understood either as an average over independent Brownian particles or as an average over time for a single Brownian particle. In order to fix the higher moments of $f(t)$, we will later assume that $f(t)$ follows a Gaussian distribution, Eq. (8.1.26).


Fig. 8.3. The correlation of the stochastic forces

8.1.1.2 The Einstein Relation

The equation of motion (8.1.1) can be solved with the help of the retarded Green’s function $G(t)$, which is defined by
$$\dot G + \zeta G = \delta(t) , \qquad G(t) = \Theta(t)\, e^{-\zeta t} . \tag{8.1.4}$$
Letting $v_0$ be the initial value of the velocity, one obtains for $v(t)$
$$v(t) = v_0 e^{-\zeta t} + \int_0^\infty d\tau\, G(t - \tau) f(\tau)/m = v_0 e^{-\zeta t} + e^{-\zeta t} \int_0^t d\tau\, e^{\zeta\tau} f(\tau)/m . \tag{8.1.5}$$

Since the dependence of $f(\tau)$ is known only statistically, we do not consider the average value of $v(t)$, but instead that of its square, $\langle v(t)^2 \rangle$:
$$\langle v(t)^2 \rangle = e^{-2\zeta t} \int_0^t d\tau \int_0^t d\tau'\, e^{\zeta(\tau + \tau')}\, \phi(\tau - \tau')\, \frac{1}{m^2} + v_0^2 e^{-2\zeta t} ;$$
here, the cross term vanishes. With Eq. (8.1.3), we obtain
$$\langle v(t)^2 \rangle = \frac{\lambda}{2\zeta m^2} \left( 1 - e^{-2\zeta t} \right) + v_0^2 e^{-2\zeta t} \;\xrightarrow{\; t \gg \zeta^{-1} \;}\; \frac{\lambda}{2\zeta m^2} . \tag{8.1.6}$$
For $t \gg \zeta^{-1}$, the contribution of $v_0$ becomes negligible and the memory of the initial value is lost. Hence $\zeta^{-1}$ plays the role of a relaxation time. We require that our particle attain thermal equilibrium after long times, $t \gg \zeta^{-1}$, i.e. that the average value of the kinetic energy obey the equipartition theorem
$$\frac{1}{2} m \langle v(t)^2 \rangle = \frac{1}{2} kT . \tag{8.1.7}$$


From this, we find the so-called Einstein relation
$$\lambda = 2\zeta m kT . \tag{8.1.8}$$
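The Einstein relation can be checked numerically. The following is a minimal sketch, not taken from the text: it integrates the Langevin equation (8.1.1) with a simple Euler–Maruyama step for an ensemble of independent particles, fixes the noise strength by (8.1.8), and compares the long-time value of $\langle v^2 \rangle$ with $kT/m$ from (8.1.7). The function name, the parameter values (arbitrary units), and the discretization are assumptions made purely for illustration.

```python
import numpy as np

def simulate_free_langevin(n_particles=20000, n_steps=1000, dt=0.01,
                           zeta=1.0, m=1.0, kT=1.0, v0=3.0, seed=0):
    """Euler-Maruyama integration of m dv/dt = -m*zeta*v + f(t)."""
    rng = np.random.default_rng(seed)
    lam = 2.0 * zeta * m * kT              # Einstein relation (8.1.8)
    v = np.full(n_particles, v0, dtype=float)
    for _ in range(n_steps):
        # Time integral of the white noise f over one step:
        # Gaussian with zero mean and variance lam*dt, cf. (8.1.2), (8.1.3)
        impulse = rng.normal(0.0, np.sqrt(lam * dt), size=n_particles)
        v += -zeta * v * dt + impulse / m
    return v

v_final = simulate_free_langevin()         # total time = 10 / zeta
mean_sq_velocity = np.mean(v_final**2)     # expected to be close to kT/m = 1
print(mean_sq_velocity)
```

With these settings the total time is ten relaxation times, so the initial value $v_0$ is forgotten, and the ensemble average agrees with $kT/m$ up to sampling noise and a discretization error of order $\zeta\,dt$.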

The coefficient of friction ζ is proportional to the mean square deviation λ of the stochastic force.

8.1.1.3 The Velocity Correlation Function

Next, we compute the velocity correlation function:
$$\langle v(t) v(t') \rangle = e^{-\zeta(t + t')} \int_0^t d\tau \int_0^{t'} d\tau'\, e^{\zeta(\tau + \tau')}\, \frac{\lambda}{m^2}\, \delta(\tau - \tau') + v_0^2 e^{-\zeta(t + t')} . \tag{8.1.9}$$
Since the roles of $t$ and $t'$ are arbitrarily interchangeable, we can assume without loss of generality that $t < t'$ and immediately evaluate the two integrals in the order given in this equation, with the result $\frac{\lambda}{2\zeta m^2} \left( e^{2\zeta \min(t, t')} - 1 \right)$, thus obtaining finally

$$\langle v(t) v(t') \rangle = \frac{\lambda}{2\zeta m^2}\, e^{-\zeta|t - t'|} + \left( v_0^2 - \frac{\lambda}{2\zeta m^2} \right) e^{-\zeta(t + t')} . \tag{8.1.10}$$
For $t, t' \gg \zeta^{-1}$, the second term in (8.1.10) can be neglected.

8.1.1.4 The Mean Square Deviation

In order to obtain the mean square displacement for $t \gg \zeta^{-1}$, we need only integrate (8.1.10) twice,
$$\langle x(t)^2 \rangle = \int_0^t d\tau \int_0^t d\tau'\, \frac{\lambda}{2\zeta m^2}\, e^{-\zeta|\tau - \tau'|} . \tag{8.1.11}$$

Intermediate calculation for integrals of the type
$$I = \int_0^t d\tau \int_0^t d\tau'\, f(\tau - \tau') .$$
We denote the parent function (antiderivative) of $f(\tau)$ by $F(\tau)$ and evaluate the integral over τ, $I = \int_0^t d\tau' \left( F(t - \tau') - F(-\tau') \right)$. Now we substitute $u = t - \tau'$ into the first term and obtain after integrating by parts
$$I = \int_0^t du\, \left( F(u) - F(-u) \right) = t \left( F(t) - F(-t) \right) - \int_0^t du\, u \left( f(u) + f(-u) \right)$$
and from this the final result
$$\int_0^t d\tau \int_0^t d\tau'\, f(\tau - \tau') = \int_0^t du\, (t - u) \left( f(u) + f(-u) \right) . \tag{8.1.12}$$
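The identity (8.1.12) is easy to test numerically. The sketch below is an illustration rather than part of the derivation: it evaluates both sides with the midpoint rule for an arbitrarily chosen smooth, non-symmetric test function. The test function, the value of $t$, and the grid size are assumptions.

```python
import numpy as np

def f(s):
    # Arbitrary smooth, non-symmetric test function (an assumption for this check)
    return np.exp(-(s - 0.5) ** 2)

t, n = 2.0, 400
h = t / n
mid = (np.arange(n) + 0.5) * h              # midpoints of the n subintervals of [0, t]

# Left-hand side of (8.1.12): double integral over the square [0, t] x [0, t]
lhs = np.sum(f(mid[:, None] - mid[None, :])) * h * h

# Right-hand side of (8.1.12): single integral with the weight (t - u)
rhs = np.sum((t - mid) * (f(mid) + f(-mid))) * h

print(lhs, rhs)
```

Both numbers agree up to the quadrature error, which is of order $h^2$ for the midpoint rule.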


With Eq. (8.1.12), it follows for (8.1.11) that
$$\langle x^2(t) \rangle = 2\, \frac{\lambda}{2\zeta m^2} \int_0^t du\, (t - u)\, e^{-\zeta u} \approx \frac{\lambda}{\zeta^2 m^2}\, t$$
or
$$\langle x^2(t) \rangle = 2Dt \tag{8.1.13}$$
with the diffusion constant
$$D = \frac{\lambda}{2\zeta^2 m^2} = \frac{kT}{\zeta m} . \tag{8.1.14}$$

It can be seen that $D$ plays the role of a diffusion constant by starting from the equation of continuity for the particle density
$$\dot n(x) + \nabla j(x) = 0 \tag{8.1.15a}$$
and the current density
$$j(x) = -D \nabla n(x) . \tag{8.1.15b}$$
The resulting diffusion equation
$$\dot n(x) = D \nabla^2 n(x) \tag{8.1.16}$$
has the one-dimensional solution
$$n(x, t) = \frac{N}{\sqrt{4\pi D t}}\, e^{-\frac{x^2}{4Dt}} . \tag{8.1.17}$$

(8.1.17)

The particle number density $n(x, t)$ from Eq. (8.1.17) describes the spreading out of $N$ particles which were concentrated at $x = 0$ at the time $t = 0$ ($n(x, 0) = N\delta(x)$). That is, the mean square displacement increases with time as $2Dt$. (More general solutions of (8.1.16) can be found from (8.1.17) by superposition.) We can cast the Einstein relation in a more familiar form by introducing the mobility μ into (8.1.1) in place of the coefficient of friction. The Langevin equation then reads
$$m\ddot x = -\mu^{-1} \dot x + f \quad\text{with}\quad \mu = \frac{1}{\zeta m} , \tag{8.1.18}$$
and the Einstein relation takes on the form
$$D = \mu kT . \tag{8.1.19}$$

The diﬀusion constant is thus proportional to the mobility of the particle and to the temperature.
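The diffusive growth (8.1.13) with $D = kT/(\zeta m)$ can be illustrated by a small simulation. This is a hedged sketch under stated assumptions: the positions are obtained by integrating the Langevin velocities with an Euler–Maruyama step, the particles start at $x = 0$ with thermally distributed velocities, and all names and parameter values (arbitrary units) are chosen only for this example.

```python
import numpy as np

def mean_square_displacement(t_total=40.0, dt=0.02, n_particles=10000,
                             zeta=1.0, m=1.0, kT=1.0, seed=1):
    """Ensemble average of x(t_total)^2 for the free Langevin equation (8.1.1)."""
    rng = np.random.default_rng(seed)
    lam = 2.0 * zeta * m * kT                        # Einstein relation (8.1.8)
    # Thermal initial velocities, so that no initial transient in <v^2> appears
    v = rng.normal(0.0, np.sqrt(kT / m), size=n_particles)
    x = np.zeros(n_particles)
    for _ in range(int(round(t_total / dt))):
        x += v * dt
        impulse = rng.normal(0.0, np.sqrt(lam * dt), size=n_particles)
        v += -zeta * v * dt + impulse / m
    return np.mean(x**2)

zeta, m, kT, t_total = 1.0, 1.0, 1.0, 40.0
D = kT / (zeta * m)                                  # diffusion constant (8.1.14)
msd = mean_square_displacement(t_total=t_total, zeta=zeta, m=m, kT=kT)
ratio = msd / (2.0 * D * t_total)                    # close to 1 for t >> 1/zeta
print(ratio)
```

The ratio lies slightly below 1 at finite times: evaluating (8.1.11) with (8.1.12) without the long-time approximation gives $\langle x^2(t) \rangle = 2D\,[\,t - (1 - e^{-\zeta t})/\zeta\,]$, so the relative correction is of order $1/(\zeta t)$.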


Remarks:

(i) In a simplified version of Einstein’s$^{4}$ historical derivation of (8.1.19), we treat (instead of the osmotic pressure in a force field) the dynamic origin of the barometric pressure formula. The essential consideration is that in a gravitational field there are two currents which must compensate each other in equilibrium. They are the diffusion current $-D \frac{\partial}{\partial z} n(z)$ and the current of particles falling in the gravitational field, $\bar v\, n(z)$. Here, $n(z)$ is the particle number density and $\bar v$ is the mean velocity of falling, which, due to friction, is found from $\mu^{-1} \bar v = -mg$. Since the sum of these two currents must vanish, we find the condition
$$-D \frac{\partial}{\partial z} n(z) - mg\mu\, n(z) = 0 . \tag{8.1.20}$$
From this, the barometric pressure formula $n(z) \propto e^{-\frac{mgz}{kT}}$ is obtained if the Einstein relation (8.1.19) is fulfilled.

(ii) In the Brownian motion of a sphere in a liquid with the viscosity constant η, the frictional force is given by Stokes’ law, $F_{fr} = 6\pi a \eta \dot x$, where $a$ is the radius and $\dot x$ the velocity of the sphere. Then the diffusion constant is $D = kT/6\pi a\eta$ and the mean square displacement of the sphere is given by
$$\langle x^2(t) \rangle = \frac{kT\, t}{3\pi a \eta} . \tag{8.1.21}$$
Using this relation, an observation of $\langle x^2(t) \rangle$ allows the experimental determination of the Boltzmann constant $k$.

8.1.2 The Langevin Equation in a Force Field

As a generalization of the preceding treatment, we now consider the Brownian motion in an external force field
$$F(x) = -\frac{\partial V}{\partial x} . \tag{8.1.22a}$$
Then the Langevin equation is given by
$$m\ddot x = -m\zeta \dot x + F(x) + f(t) , \tag{8.1.22b}$$
where we assume that the collisions and frictional effects of the molecules are not modified by the external force and therefore the stochastic force $f(t)$ again obeys (8.1.2), (8.1.3), and (8.1.8).$^{5}$ An important special case of (8.1.22b) is the limiting case of strong damping ζ. When the inequality $m\zeta \dot x \gg m\ddot x$ is fulfilled (as is the case e.g. for periodic motion at low frequencies), it follows from (8.1.22b) that
$$\dot x = -\Gamma \frac{\partial V}{\partial x} + r(t) , \tag{8.1.23}$$
where the damping constant Γ and the fluctuating force $r(t)$ are given by
$$\Gamma \equiv \frac{1}{m\zeta} \quad\text{and}\quad r(t) \equiv \frac{1}{m\zeta} f(t) . \tag{8.1.24}$$

$^{4}$ See the reference at the end of this chapter.

$^{5}$ We will later see that the Einstein relation (8.1.8) ensures that the function $\exp\left( -\left( \frac{p^2}{2m} + V(x) \right)/kT \right)$ be an equilibrium distribution for this stochastic process.

The stochastic force $r(t)$, according to Eqns. (8.1.2) and (8.1.3), obeys the relation
$$\langle r(t) \rangle = 0 , \qquad \langle r(t) r(t') \rangle = 2\Gamma kT\, \delta(t - t') . \tag{8.1.25}$$

For the characterization of the higher moments (correlation functions) of $r(t)$, we will further assume in the following that $r(t)$ follows a Gaussian distribution
$$\mathcal{P}[r(t)] = e^{-\int_{t_0}^{t_f} dt\, \frac{r^2(t)}{4\Gamma kT}} . \tag{8.1.26}$$
$\mathcal{P}[r(t)]$ gives the probability density for the values of $r(t)$ in the interval $[t_0, t_f]$, where $t_0$ and $t_f$ are the initial and final times. To define the functional integration, we subdivide the interval into
$$N = \frac{t_f - t_0}{\Delta}$$
small subintervals of width Δ and introduce the discrete times
$$t_i = t_0 + i\Delta , \qquad i = 0, \ldots, N - 1 .$$
The element of the functional integration $\mathcal{D}[r]$ is defined by
$$\mathcal{D}[r] \equiv \lim_{\Delta \to 0} \prod_{i=0}^{N-1} \left( dr(t_i) \sqrt{\frac{\Delta}{4\Gamma kT\pi}} \right) . \tag{8.1.27}$$
The normalization of the probability density is
$$\int \mathcal{D}[r]\, \mathcal{P}[r(t)] \equiv \lim_{\Delta \to 0} \prod_{i=0}^{N-1} \left( \int dr(t_i) \sqrt{\frac{\Delta}{4\Gamma kT\pi}} \right) e^{-\sum_i \Delta \frac{r^2(t_i)}{4\Gamma kT}} = 1 . \tag{8.1.28}$$
As a check, we calculate
$$\langle r(t_i) r(t_j) \rangle = \frac{4\Gamma kT}{2\Delta}\, \delta_{ij} = 2\Gamma kT\, \frac{\delta_{ij}}{\Delta} \to 2\Gamma kT\, \delta(t_i - t_j) ,$$
which is in agreement with Eqns. (8.1.2), (8.1.3) and (8.1.8).
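The discretization used in (8.1.27) and (8.1.28) translates directly into a recipe for generating the noise on a time grid: each $r(t_i)$ is an independent Gaussian with variance $2\Gamma kT/\Delta$. The sketch below, with assumed parameter values and a hypothetical function name, checks this and also shows that the variance of the time integral of $r$ does not depend on Δ, as expected for δ-correlated noise.

```python
import numpy as np

def integrated_noise_statistics(delta, t_total=1.0, gamma=0.5, kT=2.0,
                                n_paths=20000, seed=2):
    """Sample r(t_i) on a grid of width delta and return simple statistics."""
    rng = np.random.default_rng(seed)
    n = int(round(t_total / delta))
    # Each r(t_i) is Gaussian with variance 2*Gamma*kT/delta, cf. the check above
    r = rng.normal(0.0, np.sqrt(2.0 * gamma * kT / delta), size=(n_paths, n))
    single_step_var = np.var(r[:, 0])            # <r(t_i)^2>
    neighbour_cov = np.mean(r[:, 0] * r[:, 1])   # <r(t_i) r(t_j)> for i != j
    integral = r.sum(axis=1) * delta             # discretized time integral of r
    return single_step_var, neighbour_cov, np.var(integral)

for delta in (0.1, 0.01):
    step_var, cov, int_var = integrated_noise_statistics(delta)
    # step_var grows like 1/delta, while int_var stays near 2*Gamma*kT*t_total
    print(delta, step_var, cov, int_var)
```

For the values assumed here, $2\Gamma kT\, t = 2$, so the last printed number should be close to 2 for both grid widths, while the single-step variance scales as $1/\Delta$.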


Since Langevin equations of the type (8.1.23) occur in a variety of different physical situations, we want to add some elementary explanations. We first consider (8.1.23) without the stochastic force, i.e. $\dot x = -\Gamma \frac{\partial V}{\partial x}$. In regions of positive (negative) slope of $V(x)$, $x$ will be shifted in the negative (positive) $x$ direction. The coordinate $x$ moves in the direction of one of the minima of $V(x)$ (see Fig. 8.4). At the extrema of $V(x)$, $\dot x$ vanishes. The effect of the stochastic force $r(t)$ is that the motion towards the minima becomes fluctuating, and even at its extreme positions the particle is not at rest, but instead is continually pushed away, so that the possibility exists of a transition from one minimum into another. The calculation of such transition rates is of interest for, among other applications, thermally activated hopping of impurities in solids and for chemical reactions (see Sect. 8.3.2).

Fig. 8.4. The motion resulting from the equation of motion x˙ = −Γ ∂V /∂x.
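The noise-free motion described above can be made concrete with a few lines of code. The following sketch integrates $\dot x = -\Gamma\, \partial V/\partial x$ with an explicit Euler step for an assumed double-well potential $V(x) = x^4/4 - x^2/2$ (minima at $x = \pm 1$, maximum at $x = 0$); the potential, step size, and starting points are illustrative choices, not taken from the text.

```python
def dV(x):
    # Derivative of the assumed double-well potential V(x) = x**4/4 - x**2/2
    return x**3 - x

def relax(x0, gamma=1.0, dt=0.01, n_steps=5000):
    """Explicit Euler integration of dx/dt = -Gamma * dV/dx (no stochastic force)."""
    x = float(x0)
    for _ in range(n_steps):
        x -= gamma * dV(x) * dt
    return x

starts = [-2.0, -0.3, 0.3, 2.0]
ends = [relax(x0) for x0 in starts]
# Starting points left of the maximum at x = 0 end near -1, the others near +1
print(ends)
print(relax(0.0))   # an extremum of V is a fixed point: x stays at 0
```

Without noise the coordinate never leaves the basin in which it started, and a particle placed exactly on the maximum stays there; this is the situation that the stochastic force $r(t)$ changes.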

8.2 The Derivation of the Fokker–Planck Equation from the Langevin Equation

Next, we wish to derive equations of motion for the probability densities in the Langevin equations (8.1.1), (8.1.22b), and (8.1.23).

8.2.1 The Fokker–Planck Equation for the Langevin Equation (8.1.1)

We define
$$P(\xi, t) = \left\langle \delta\big( \xi - v(t) \big) \right\rangle , \tag{8.2.1}$$

the probability density for the event that the Brownian particle has the velocity ξ at the time t. This means that P (ξ, t)dξ is the probability that the velocity lies within the interval [ξ, ξ + dξ].


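We now derive an equation of motion for $P(\xi, t)$:
$$\begin{aligned}
\frac{\partial}{\partial t} P(\xi, t) &= -\frac{\partial}{\partial\xi} \left\langle \delta\big( \xi - v(t) \big)\, \dot v(t) \right\rangle \\
&= -\frac{\partial}{\partial\xi} \left\langle \delta\big( \xi - v(t) \big) \Big( -\zeta v(t) + \frac{1}{m} f(t) \Big) \right\rangle \\
&= -\frac{\partial}{\partial\xi} \left\langle \delta\big( \xi - v(t) \big) \Big( -\zeta\xi + \frac{1}{m} f(t) \Big) \right\rangle \\
&= \frac{\partial}{\partial\xi} \big( \zeta P(\xi, t)\, \xi \big) - \frac{1}{m} \frac{\partial}{\partial\xi} \left\langle \delta\big( \xi - v(t) \big) f(t) \right\rangle ,
\end{aligned} \tag{8.2.2}$$
where the Langevin equation (8.1.1) has been inserted in the second line. To compute the last term, we require the probability density for the stochastic force, assumed to follow a Gaussian distribution:
$$\mathcal{P}[f(t)] = e^{-\int_{t_0}^{t_f} dt\, \frac{f^2(t)}{4\zeta m kT}} . \tag{8.2.3}$$
The averages $\langle \ldots \rangle$ are given by the functional integral with the weight (8.2.3) (see Eq. (8.1.26)). In particular, for the last term in (8.2.2), we obtain
$$\begin{aligned}
\left\langle \delta\big( \xi - v(t) \big) f(t) \right\rangle &= \int \mathcal{D}[f(t')]\, \delta\big( \xi - v(t) \big)\, f(t)\, e^{-\int \frac{f(t')^2 dt'}{4\zeta m kT}} \\
&= -2\zeta m kT \int \mathcal{D}[f(t')]\, \delta\big( \xi - v(t) \big)\, \frac{\delta}{\delta f(t)}\, e^{-\int \frac{f(t')^2 dt'}{4\zeta m kT}} \\
&= 2\zeta m kT \int \mathcal{D}[f(t')]\, e^{-\int \frac{f(t')^2 dt'}{4\zeta m kT}}\, \frac{\delta}{\delta f(t)}\, \delta\big( \xi - v(t) \big) \\
&= 2\zeta m kT \left\langle \frac{\delta}{\delta f(t)}\, \delta\big( \xi - v(t) \big) \right\rangle = -2\zeta m kT\, \frac{\partial}{\partial\xi} \left\langle \delta\big( \xi - v(t) \big)\, \frac{\delta v(t)}{\delta f(t)} \right\rangle .
\end{aligned} \tag{8.2.4}$$
Here, we have to use the solution (8.1.5)
$$v(t) = v_0 e^{-\zeta t} + \int_0^\infty d\tau\, G(t - \tau)\, \frac{f(\tau)}{m} \tag{8.1.5}$$
and take the derivative with respect to $f(t)$. With $\frac{\delta f(\tau)}{\delta f(t)} = \delta(\tau - t)$ and (8.1.4), we obtain
$$\frac{\delta v(t)}{\delta f(t)} = \frac{1}{m} \int_0^t d\tau\, e^{-\zeta(t - \tau)}\, \delta(t - \tau) = \frac{1}{2m} . \tag{8.2.5}$$
The factor $\frac{1}{2}$ results from the fact that the integration interval includes only half of the δ-function. Inserting (8.2.5) into (8.2.4) and (8.2.4) into (8.2.2), we obtain the equation of motion for the probability density, the Fokker–Planck equation:
$$\frac{\partial}{\partial t} P(v, t) = \zeta \frac{\partial}{\partial v}\, v P(v, t) + \zeta \frac{kT}{m} \frac{\partial^2}{\partial v^2} P(v, t) . \tag{8.2.6}$$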


Here, we have replaced the velocity ξ by $v$; it is not to be confused with the stochastic variable $v(t)$. This relation can also be written in the form of an equation of continuity
$$\frac{\partial}{\partial t} P(v, t) = -\zeta \frac{\partial}{\partial v} \left( -v P(v, t) - \frac{kT}{m} \frac{\partial}{\partial v} P(v, t) \right) . \tag{8.2.7}$$

Remarks:

(i) The current density, the expression in large parentheses, is composed of a drift term and a diffusion current.

(ii) The current density vanishes if the probability density has the form $P(v, t) \propto e^{-\frac{mv^2}{2kT}}$. The Maxwell distribution is thus (at least one) equilibrium distribution. Here, the Einstein relation (8.1.8) plays a decisive role. Conversely, we could have obtained the Einstein relation by requiring that the Maxwell distribution be a solution of the Fokker–Planck equation.

(iii) We shall see in Sect. 8.3.1 that $P(v, t)$ becomes the Maxwell distribution in the course of time, and that the latter is therefore the only equilibrium distribution of the Fokker–Planck equation (8.2.6).
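The statement in remark (ii) can be verified in one line. As a short worked step (added here for illustration), insert $P \propto e^{-mv^2/2kT}$ into the expression in parentheses in (8.2.7):

```latex
\frac{\partial P}{\partial v} = -\frac{m v}{kT}\, P
\quad\Longrightarrow\quad
-v P - \frac{kT}{m}\,\frac{\partial P}{\partial v}
  = -v P + \frac{kT}{m}\,\frac{m v}{kT}\, P = 0 .
```

The cancellation relies on the diffusion coefficient being exactly $\zeta kT/m$, which is where the Einstein relation (8.1.8) enters.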

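8.2.2 Derivation of the Smoluchowski Equation for the Overdamped Langevin Equation, (8.1.23)

For the stochastic equation of motion (8.1.23),
$$\dot x = -\Gamma \frac{\partial V}{\partial x} + r(t) , \tag{8.1.23}$$
we can also define a probability density
$$P(\xi, t) = \left\langle \delta\big( \xi - x(t) \big) \right\rangle , \tag{8.2.8}$$
where $P(\xi, t)\, d\xi$ is the probability of finding the particle at time $t$ at the position ξ in the interval $d\xi$. We now derive an equation of motion for $P(\xi, t)$, performing the operation ($F(x) \equiv -\frac{\partial V}{\partial x}$)
$$\begin{aligned}
\frac{\partial}{\partial t} P(\xi, t) &= -\frac{\partial}{\partial\xi} \left\langle \delta\big( \xi - x(t) \big)\, \dot x(t) \right\rangle \\
&= -\frac{\partial}{\partial\xi} \left\langle \delta\big( \xi - x(t) \big) \big( \Gamma F(x) + r(t) \big) \right\rangle \\
&= -\frac{\partial}{\partial\xi} \big( \Gamma P(\xi, t) F(\xi) \big) - \frac{\partial}{\partial\xi} \left\langle \delta\big( \xi - x(t) \big)\, r(t) \right\rangle .
\end{aligned} \tag{8.2.9}$$
The overdamped Langevin equation was inserted in the second line. For the last term, we find in analogy to Eq. (8.2.4)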


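$$\begin{aligned}
\left\langle \delta\big( \xi - x(t) \big)\, r(t) \right\rangle &= 2\Gamma kT \left\langle \frac{\delta}{\delta r(t)}\, \delta\big( \xi - x(t) \big) \right\rangle \\
&= -2\Gamma kT\, \frac{\partial}{\partial\xi} \left\langle \delta\big( \xi - x(t) \big)\, \frac{\delta x(t)}{\delta r(t)} \right\rangle = -\Gamma kT\, \frac{\partial}{\partial\xi} P(\xi, t) .
\end{aligned} \tag{8.2.10}$$
Here, we have integrated (8.1.23) between 0 and $t$,
$$x(t) = x(0) + \int_0^t d\tau\, \Big( \Gamma F\big( x(\tau) \big) + r(\tau) \Big) , \tag{8.2.11}$$
from which it follows that
$$\frac{\delta x(t)}{\delta r(t')} = \int_0^t d\tau \left( \frac{\partial\, \Gamma F(x(\tau))}{\partial x(\tau)}\, \frac{\delta x(\tau)}{\delta r(t')} + \delta(t' - \tau) \right) . \tag{8.2.12}$$
The derivative is $\frac{\delta x(\tau)}{\delta r(t')} = 0$ for $\tau < t'$ due to causality and is nonzero only for $\tau \ge t'$, with a finite value at $\tau = t'$. We thus obtain
$$\frac{\delta x(t)}{\delta r(t')} = \int_0^t d\tau\, \frac{\partial\, \Gamma F(x(\tau))}{\partial x(\tau)}\, \frac{\delta x(\tau)}{\delta r(t')} + 1 \quad\text{for } t' < t \tag{8.2.13a}$$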

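and
$$\frac{\delta x(t)}{\delta r(t')} = \underbrace{\int_0^t d\tau\, \frac{\partial\, \Gamma F(x(\tau))}{\partial x(\tau)}\, \frac{\delta x(\tau)}{\delta r(t')}}_{0 \text{ for } t' = t} + \frac{1}{2} = \frac{1}{2} \quad\text{for } t' = t . \tag{8.2.13b}$$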

This demonstrates the last step in (8.2.10). From (8.2.10) and (8.2.9), we obtain the equation of motion for $P(\xi, t)$, the so-called Smoluchowski equation
$$\frac{\partial}{\partial t} P(\xi, t) = -\frac{\partial}{\partial\xi} \big( \Gamma P(\xi, t) F(\xi) \big) + \Gamma kT\, \frac{\partial^2}{\partial\xi^2} P(\xi, t) . \tag{8.2.14}$$

Remarks:

(i) One can cast the Smoluchowski equation (8.2.14) in the form of an equation of continuity
$$\frac{\partial}{\partial t} P(x, t) = -\frac{\partial}{\partial x} j(x, t) , \tag{8.2.15a}$$
with the current density
$$j(x, t) = -\Gamma \left( kT \frac{\partial}{\partial x} - F(x) \right) P(x, t) . \tag{8.2.15b}$$
The current density $j(x, t)$ is composed of a diffusion term and a drift term, in that order.

(ii) Clearly,
$$P(x, t) \propto e^{-V(x)/kT} \tag{8.2.16}$$
is a stationary solution of the Smoluchowski equation. For this solution, $j(x, t)$ is zero.
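The stationary solution (8.2.16) can be compared with a direct simulation of the overdamped Langevin equation (8.1.23). The following is a minimal sketch under stated assumptions: an Euler–Maruyama discretization, a double-well potential $V(x) = x^4/4 - x^2/2$ chosen for illustration, and arbitrary parameter values. It compares $\langle x^2 \rangle$ from the simulation with the value obtained from the weight $e^{-V/kT}$ by simple quadrature.

```python
import numpy as np

def V(x):
    return 0.25 * x**4 - 0.5 * x**2        # assumed double-well potential

def dV(x):
    return x**3 - x

def simulate_overdamped(n_particles=20000, n_steps=2000, dt=0.01,
                        gamma=1.0, kT=0.5, seed=3):
    """Euler-Maruyama for dx/dt = -Gamma*dV/dx + r(t), <r r'> = 2*Gamma*kT*delta."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n_particles)
    for _ in range(n_steps):
        # Time integral of r over one step: Gaussian with variance 2*Gamma*kT*dt
        noise = rng.normal(0.0, np.sqrt(2.0 * gamma * kT * dt), size=n_particles)
        x += -gamma * dV(x) * dt + noise
    return x

kT = 0.5
x_sim = simulate_overdamped(kT=kT)
x2_simulated = np.mean(x_sim**2)

# Reference value from the stationary solution (8.2.16), by simple quadrature
grid = np.linspace(-4.0, 4.0, 4001)
weight = np.exp(-V(grid) / kT)
x2_boltzmann = np.sum(grid**2 * weight) / np.sum(weight)

print(x2_simulated, x2_boltzmann)
```

The two values agree up to sampling noise and a discretization error of order $dt$. Since the barrier height in this example is comparable to $kT$, transitions between the two minima are frequent and the ensemble equilibrates quickly.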


8.2.3 The Fokker–Planck Equation for the Langevin Equation (8.1.22b)

For the general Langevin equation, (8.1.22b), we define the probability density
$$P(x, v, t) = \left\langle \delta\big( x - x(t) \big)\, \delta\big( v - v(t) \big) \right\rangle . \tag{8.2.17}$$
Here, we must distinguish carefully between the quantities $x$ and $v$ and the stochastic variables $x(t)$ and $v(t)$. The meaning of the probability density $P(x, v, t)$ can be characterized as follows: $P(x, v, t)\, dx\, dv$ is the probability of finding the particle in the interval $[x, x + dx]$ with a velocity in $[v, v + dv]$. The equation of motion of $P(x, v, t)$, the generalized Fokker–Planck equation
$$\frac{\partial}{\partial t} P + v \frac{\partial P}{\partial x} + \frac{F(x)}{m} \frac{\partial P}{\partial v} = \zeta \left[ \frac{\partial}{\partial v}\, v P + \frac{kT}{m} \frac{\partial^2 P}{\partial v^2} \right] \tag{8.2.18}$$
follows from a series of steps similar to those in Sect. 8.2.2; see problem 8.1.

8.3 Examples and Applications

In this section, the Fokker–Planck equation for free Brownian motion will be solved exactly. In addition, we will show in general for the Smoluchowski equation that the distribution function relaxes towards the equilibrium situation. In this connection, a relation to supersymmetric quantum mechanics will also be pointed out. Furthermore, two important applications of the Langevin equations or the Fokker–Planck equations will be given: the transition rates in chemical reactions and the dynamics of critical phenomena.

8.3.1 Integration of the Fokker–Planck Equation (8.2.6)

We now want to solve the Fokker–Planck equation for the free Brownian motion, (8.2.6):
$$\dot P(v) = \zeta \frac{\partial}{\partial v} \left[ P v + \frac{kT}{m} \frac{\partial P}{\partial v} \right] . \tag{8.3.1}$$
We expect that $P(v)$ will relax towards the Maxwell distribution, $e^{-\frac{mv^2}{2kT}}$, following the relaxation law $e^{-\zeta t}$. This makes it reasonable to introduce the variable $\rho = v e^{\zeta t}$ in place of $v$. Then we have
$$P(v, t) = P(\rho e^{-\zeta t}, t) \equiv Y(\rho, t) , \tag{8.3.2a}$$
$$\frac{\partial P}{\partial v} = \frac{\partial Y}{\partial\rho}\, e^{\zeta t} , \qquad \frac{\partial^2 P}{\partial v^2} = \frac{\partial^2 Y}{\partial\rho^2}\, e^{2\zeta t} , \tag{8.3.2b}$$
$$\frac{\partial P}{\partial t} = \frac{\partial Y}{\partial\rho} \frac{\partial\rho}{\partial t} + \frac{\partial Y}{\partial t} = \frac{\partial Y}{\partial\rho}\, \zeta\rho + \frac{\partial Y}{\partial t} . \tag{8.3.2c}$$


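Inserting (8.3.2a–c) into (8.2.6) or (8.3.1), we obtain
$$\frac{\partial Y}{\partial t} = \zeta Y + \zeta \frac{kT}{m} \frac{\partial^2 Y}{\partial\rho^2}\, e^{2\zeta t} . \tag{8.3.3}$$
This suggests the substitution $Y = \chi e^{\zeta t}$. Due to $\frac{\partial Y}{\partial t} = \frac{\partial\chi}{\partial t}\, e^{\zeta t} + \zeta Y$, it follows that
$$\frac{\partial\chi}{\partial t} = \zeta \frac{kT}{m} \frac{\partial^2\chi}{\partial\rho^2}\, e^{2\zeta t} . \tag{8.3.4}$$
Now we introduce a new time variable by means of $d\vartheta = e^{2\zeta t} dt$,
$$\vartheta = \frac{1}{2\zeta} \left( e^{2\zeta t} - 1 \right) , \tag{8.3.5}$$
where $\vartheta(t = 0) = 0$. We then find from (8.3.4) the diffusion equation
$$\frac{\partial\chi}{\partial\vartheta} = \zeta \frac{kT}{m} \frac{\partial^2\chi}{\partial\rho^2} \tag{8.3.6}$$
with its solution known from (8.1.17),
$$\chi(\rho, \vartheta) = \frac{1}{\sqrt{4\pi q\vartheta}}\, e^{-\frac{(\rho - \rho_0)^2}{4q\vartheta}} ; \qquad q = \zeta \frac{kT}{m} . \tag{8.3.7}$$
By returning to the original variables $v$ and $t$, we find the following solution
$$P(v, t) = \chi e^{\zeta t} = \left[ \frac{m}{2\pi kT \left( 1 - e^{-2\zeta t} \right)} \right]^{\frac{1}{2}} e^{-\frac{m \left( v - v_0 e^{-\zeta t} \right)^2}{2kT \left( 1 - e^{-2\zeta t} \right)}} \tag{8.3.8}$$
of the Fokker–Planck equation (8.2.6), which describes Brownian motion in the absence of external forces. The solution of the Smoluchowski equation (8.2.14) for a harmonic potential is also contained in (8.3.8). We now discuss the most important properties and consequences of the solution (8.3.8):

In the limiting case $t \to 0$, we have
$$\lim_{t \to 0} P(v, t) = \delta(v - v_0) . \tag{8.3.9a}$$
In the limit of long times, $t \to \infty$, the result is
$$\lim_{t \to \infty} P(v, t) = e^{-mv^2/2kT} \left( \frac{m}{2\pi kT} \right)^{\frac{1}{2}} . \tag{8.3.9b}$$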

Remark: Since $P(v, t)$ has the property (8.3.9a), we also have found the conditional probability density in (8.3.8):$^{6}$
$$P(v, t | v_0, t_0) = P(v, t - t_0) . \tag{8.3.10}$$
This is not surprising. Since, as a result of (8.1.1), (8.1.2) and (8.1.3), a Markov process$^{7}$ is specified, $P(v, t | v_0, t_0)$ likewise obeys the Fokker–Planck equation (8.2.6).

$^{6}$ The conditional probability $P(v, t | v_0, t_0)$ gives the probability that at time $t$ the value $v$ occurs, under the condition that it was $v_0$ at the time $t_0$.

For an arbitrary integrable and normalized initial probability density $\rho(v_0)$ at time $t_0$,
$$\int dv_0\, \rho(v_0) = 1 , \tag{8.3.11}$$
we find with (8.3.8) the time dependence
$$\rho(v, t) = \int dv_0\, P(v, t - t_0)\, \rho(v_0) . \tag{8.3.12}$$
Clearly, $\rho(v, t)$ fulfills the initial condition
$$\lim_{t \to t_0} \rho(v, t) = \rho(v_0) , \tag{8.3.13a}$$
while for long times
$$\lim_{t \to \infty} \rho(v, t) = e^{-\frac{mv^2}{2kT}} \left( \frac{m}{2\pi kT} \right)^{\frac{1}{2}} \int dv_0\, \rho(v_0) = e^{-\frac{mv^2}{2kT}} \left( \frac{m}{2\pi kT} \right)^{\frac{1}{2}} \tag{8.3.13b}$$
the Maxwell distribution is obtained. Therefore, for the Fokker–Planck equation (8.2.6), and for the Smoluchowski equation with a harmonic potential, (8.2.14), we have proved that an arbitrary initial distribution relaxes towards the Maxwell distribution, (8.3.13b). The function (8.3.8) is also used, by the way, in Wilson’s exact renormalization group transformation for the continuous partial elimination of short-wavelength critical fluctuations.$^{8}$

8.3.2 Chemical Reactions

We now wish to calculate the thermally activated transition over a barrier (Fig. 8.5). An obvious physical application is the motion of an impurity atom in a solid from one local minimum of the lattice potential into another. Certain chemical reactions can also be described on this basis. Here, $x$ refers to the reaction coordinate, which characterizes the state of the molecule. The vicinity of the point A can, for example, refer to an excited state of a molecule, while B signifies the dissociated molecule. The transition from A to B takes place via configurations which have higher energies and is made possible by the thermal energy supplied by the surrounding medium. We formulate the following calculation in the language of chemical reactions.

$^{7}$ A Markov process denotes a stochastic process in which all the conditional probabilities depend only on the last time which occurs in the conditions; e.g.
$$P(t_3, v_3 | t_2, v_2; t_1, v_1) = P(t_3, v_3 | t_2, v_2) ,$$
where $t_1 \le t_2 \le t_3$.

$^{8}$ K. G. Wilson and J. Kogut, Phys. Rep. 12C, 75 (1974).

8.3 Examples and Applications


Fig. 8.5. A thermally activated transition over a barrier from the minimum A into the minimum B

We require the reaction rate (also called the transition rate), i.e. the transition probability per unit time for the conversion of type A into type B. We assume that friction is so strong that we can employ the Smoluchowski equation (8.2.15a,b),

$$\dot P = -\frac{\partial}{\partial x}\, j(x). \tag{8.2.15a}$$

Integration of this equation between the points $x_\alpha$ and $x_\beta$ yields

$$\frac{d}{dt}\int_{x_\alpha}^{x_\beta} dx\, P = -j(x_\beta) + j(x_\alpha), \tag{8.3.14}$$

where $x_\beta$ lies between the points A and B. It then follows that $j(x_\beta)$ is the transition rate between the states (the chemical species) A and B. To calculate $j(x_\beta)$, we assume that the barrier is sufficiently high so that the transition rate is small. Then in fact all the molecules will be in the region of the minimum A and will occupy states there according to the thermal distribution. The few molecules which have reached state B can be imagined to be filtered out. The strategy of our calculation is to find a stationary solution P(x) which has the properties

$$P(x) = \frac{1}{Z}\, e^{-V(x)/kT} \quad \text{in the vicinity of } A \tag{8.3.15a}$$

$$P(x) = 0 \quad \text{in the vicinity of } B. \tag{8.3.15b}$$

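From the requirement of stationarity, it follows that

$$0 = \Gamma\,\frac{\partial}{\partial x}\left(kT\frac{\partial}{\partial x} + \frac{\partial V}{\partial x}\right)P, \tag{8.3.16}$$

from which we find by integrating once

$$\Gamma\left(kT\frac{\partial}{\partial x} + \frac{\partial V}{\partial x}\right)P = -j_0. \tag{8.3.17}$$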

The integration constant j0 plays the role of the current density which, owing to the fact that (8.2.14) is source-free between A and B, is independent of x.


This integration constant can be determined from the boundary conditions given above. We make use of the following Ansatz for P(x):

$$P(x) = e^{-V/kT}\,\hat P \tag{8.3.18}$$

in equation (8.3.17):

$$\frac{\partial}{\partial x}\hat P = -\frac{j_0}{kT\,\Gamma}\, e^{V(x)/kT}. \tag{8.3.17a}$$
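The step from (8.3.17) to (8.3.17a) can be written out as follows. This is only a sketch of the product rule applied to the Ansatz (8.3.18); no assumption beyond the text enters.

```latex
\begin{align*}
kT\,\frac{\partial P}{\partial x} + \frac{\partial V}{\partial x}\,P
  &= kT\left(-\frac{1}{kT}\frac{\partial V}{\partial x}\,e^{-V/kT}\hat P
     + e^{-V/kT}\frac{\partial \hat P}{\partial x}\right)
     + \frac{\partial V}{\partial x}\,e^{-V/kT}\hat P \\
  &= kT\,e^{-V/kT}\,\frac{\partial \hat P}{\partial x} .
\end{align*}
```

Equation (8.3.17) therefore reads $\Gamma kT\, e^{-V/kT}\,\partial\hat P/\partial x = -j_0$, and division by the prefactor gives (8.3.17a).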

Integrating this equation from A to x, we obtain

$$\hat P(x) = \text{const.} - \frac{j_0}{kT\,\Gamma}\int_A^x dx'\, e^{V(x')/kT}. \tag{8.3.17b}$$

The boundary condition at A, that there P follows the thermal equilibrium distribution, requires that

$$\text{const.} = \frac{1}{\int_A dx\, e^{-V/kT}}. \tag{8.3.19a}$$

Here, $\int_A$ means that the integral is evaluated in the vicinity of A. If the barrier is sufficiently high, contributions from regions more distant from the minimum are negligible.⁹ The boundary condition at B requires

$$0 = e^{-V_B/kT}\left[\text{const.} - \frac{j_0}{kT\,\Gamma}\int_A^B dx\, e^{V/kT}\right], \tag{8.3.19b}$$

so that

$$j_0 = \frac{kT\,\Gamma\left(\int_A dx\, e^{-V(x)/kT}\right)^{-1}}{\int_A^B dx\, e^{V(x)/kT}}. \tag{8.3.20}$$

For V(x) in the vicinity of A, we set $V_A(x) \approx \frac12 (2\pi\nu)^2 x^2$, and, without loss of generality, take the zero point of the energy scale at the point A. We then find

$$\int_A dx\, e^{-V_A/kT} = \int_{-\infty}^{\infty} dx\, e^{-\frac12 (2\pi\nu)^2 x^2/kT} = \frac{\sqrt{kT}}{\sqrt{2\pi}\,\nu}.$$

Here, the integration was extended beyond the neighborhood of A out to $[-\infty,\infty]$, which is permissible owing to the rapid decrease of the integrand. The main contribution to the integral in the denominator of (8.3.20) comes from the vicinity of the barrier, where we set $V(x) \approx \Delta - (2\pi\nu')^2 x^2/2$. Here, $\Delta$ is the height of the barrier and $\nu'$ characterizes the barrier's curvature:

$$\int_A^B dx\, e^{V/kT} \approx e^{\Delta/kT}\int_{-\infty}^{\infty} dx\, e^{-\frac{(2\pi\nu')^2 x^2}{2kT}} = e^{\Delta/kT}\,\frac{\sqrt{kT}}{\sqrt{2\pi}\,\nu'}.$$

This yields all together for the current density or the transition rate¹⁰

$$j_0 = 2\pi\nu\nu'\,\Gamma\, e^{-\Delta/kT}. \tag{8.3.21}$$

⁹ Inserting (8.3.17b) with (8.3.20) into (8.3.18), one obtains from the first term in the vicinity of point A just the equilibrium distribution, while the second term is negligible due to $\int_A^x dx\, e^{V/kT} \big/ \int_A^B dx\, e^{V/kT} \ll 1$.
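The quality of the Gaussian approximation leading from (8.3.20) to (8.3.21) can be checked numerically. The following is a minimal sketch under stated assumptions: the cubic model potential, the integration limits, and the values of kT, Γ and Δ are illustrative choices that do not come from the text.

```python
import math

def trapezoid(f, a, b, n=20000):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

# Hypothetical model potential (not from the text): minimum A at x = 0 with V = 0,
# barrier top at x = 1 with V = DELTA, and V = 0 again at x = 1.5 (used as point B).
KT = 1.0
GAMMA = 1.0
DELTA = 10.0 * KT

def V(x):
    return DELTA * (3.0 * x**2 - 2.0 * x**3)

def rate_from_integrals():
    """Current j0 from the quotient of integrals, Eq. (8.3.20)."""
    well = trapezoid(lambda x: math.exp(-V(x) / KT), -1.0, 0.5)   # vicinity of A
    barrier = trapezoid(lambda x: math.exp(V(x) / KT), 0.0, 1.5)  # from A to B
    return KT * GAMMA / (well * barrier)

def rate_kramers():
    """Harmonic approximation, Eq. (8.3.21)."""
    curvature = 6.0 * DELTA  # V''(0) = -V''(1), i.e. (2 pi nu)^2 = (2 pi nu')^2
    nu = math.sqrt(curvature) / (2.0 * math.pi)
    return 2.0 * math.pi * nu * nu * GAMMA * math.exp(-DELTA / KT)

j_exact = rate_from_integrals()
j_kramers = rate_kramers()
print(j_exact, j_kramers, j_exact / j_kramers)
```

For this potential the two rates agree to within a few percent at Δ = 10 kT; the remaining difference comes from the anharmonic terms that the Gaussian integrals neglect.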

We point out some important aspects of the thermally activated transition rate: the decisive factor in this result is the Arrhenius dependence $e^{-\Delta/kT}$, where $\Delta$ denotes the barrier height, i.e. the activation energy. We can rewrite the prefactor by making the replacements $(2\pi\nu)^2 = m\omega^2$, $(2\pi\nu')^2 = m\omega'^2$ and $\Gamma = \frac{1}{m\zeta}$ (Eq. (8.1.24)):

$$j_0 = \frac{\omega\omega'}{2\pi\zeta}\, e^{-\Delta/kT}. \tag{8.3.21'}$$

If we assume that $\omega' \approx \omega$, then the prefactor is proportional to the square of the vibration frequency characterized by the potential well.¹¹

¹⁰ H. A. Kramers, Physica 7, 284 (1940).
¹¹ ω is the frequency (attempt frequency) with which the particle arrives at the right side of the potential well, from where it has the possibility (with however a small probability $\sim e^{-\Delta/kT}$) of overcoming the barrier.

8.3.3 Critical Dynamics

We have already pointed out in the introduction to Brownian motion that the theory developed to describe it has a considerably wider significance. Instead of the motion of a massive particle in a fluid of stochastically colliding molecules, one can consider quite different situations in which a small number of relatively slowly varying collective variables are interacting with many strongly varying, rapid degrees of freedom. The latter lead to a damping of the collective degrees of freedom.

This situation occurs in the hydrodynamic region. Here, the collective degrees of freedom represent the densities of the conserved quantities. The typical time scales for these hydrodynamic degrees of freedom increase with decreasing q proportionally to $1/q$ or $1/q^2$, where q is the wavenumber. In comparison, in the range of small wavenumbers all the remaining degrees of freedom are very rapid and can be regarded as stochastic noise in the equations of motion of the conserved densities. This then leads to the typical form of the hydrodynamic equations with damping terms proportional to $q^2$ or, in real space, $\sim \nabla^2$. We emphasize that "hydrodynamics" is by no means limited to the domain of liquids or gases, but instead, in an extension of its original meaning, it includes the general dynamics of conserved quantities depending on the particular physical situation (dielectrics, ferromagnets, liquid crystals, etc.).

A further important field in which this type of separation of time scales occurs is the dynamics in the neighborhood of critical points. As we know from the sections on static critical phenomena, the correlations of the local order parameter become long-ranged. There is thus a fluctuating order within regions whose size is of the order of the correlation length. As these correlated regions grow, the characteristic time scale also increases. Therefore, the remaining degrees of freedom of the system can be regarded as rapidly varying. In ferromagnets, the order parameter is the magnetization. In its motions, the other degrees of freedom such as those of the electrons and lattice vibrations act as rapidly varying stochastic forces. In ferromagnets, the magnetic susceptibility behaves in the vicinity of the Curie point as

$$\chi \sim \frac{1}{T-T_c} \tag{8.3.22a}$$

and the correlation function of the magnetization as

$$G_{MM}(x) \sim \frac{e^{-|x|/\xi}}{|x|}. \tag{8.3.22b}$$

In the neighborhood of the critical point of the liquid-gas transition, the isothermal compressibility diverges as

$$\kappa_T \sim \frac{1}{T-T_c} \tag{8.3.22c}$$

and the density-density correlation function has the dependence

$$g_{\rho\rho}(x) \sim \frac{e^{-|x|/\xi}}{|x|}. \tag{8.3.22d}$$

In Eqns. (8.3.22b,d), $\xi$ denotes the correlation length, which behaves as $\xi \sim (T-T_c)^{-1/2}$ in the molecular field approximation, cf. Sects. 5.4 and 6.5.

A general model-independent approach to the theory of critical phenomena begins with a continuum description of the free energy, the Ginzburg–Landau expansion (see Sect. 7.4.1):

$$F[M] = \int d^dx \left[\frac{a'}{2}(T-T_c)M^2 + \frac{b}{4}M^4 + \frac{c}{2}(\nabla M)^2 - Mh\right], \tag{8.3.23}$$

where $e^{-F/kT}$ denotes the statistical weight of a configuration M(x). The most probable configuration is given by

$$\frac{\delta F}{\delta M} = 0 = a'(T-T_c)M - c\nabla^2 M + bM^3 - h. \tag{8.3.24}$$


It follows from this that the magnetization and the susceptibility in the limit $h \to 0$ are

$$M \sim (T_c - T)^{1/2}\,\Theta(T_c - T) \quad\text{and}\quad \chi \sim \frac{1}{T-T_c}.$$
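These two power laws can be checked numerically from (8.3.24) for a spatially uniform magnetization, where the gradient term drops out. The sketch below is illustrative only: the function names, the Newton iteration, and the parameter values a' = b = 1 are assumptions, and `t` stands for T − Tc.

```python
def solve_uniform_m(t, h, a=1.0, b=1.0, m_start=1.0, iterations=200):
    """Newton iteration for a*t*M + b*M**3 - h = 0, i.e. Eq. (8.3.24)
    for a uniform M; t stands for T - Tc. Starting from a positive
    value selects the non-negative branch of the magnetization."""
    m = m_start
    for _ in range(iterations):
        f = a * t * m + b * m**3 - h
        df = a * t + 3.0 * b * m**2
        m -= f / df
    return m

def susceptibility(t, a=1.0, b=1.0, dh=1e-6):
    """Finite-difference estimate of dM/dh at h -> 0."""
    return (solve_uniform_m(t, dh, a, b) - solve_uniform_m(t, 0.0, a, b)) / dh

m_below = solve_uniform_m(-0.04, 0.0)   # T below Tc: M**2 = a*(Tc - T)/b
chi_above = susceptibility(0.04)        # T above Tc: chi = 1/(a*(T - Tc))
chi_above_half = susceptibility(0.02)   # halving T - Tc doubles chi
print(m_below, chi_above, chi_above_half)
```

With these parameters the spontaneous magnetization at T − Tc = −0.04 comes out as 0.2, the square root of 0.04, and the susceptibility above Tc doubles when T − Tc is halved.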

Since the correlation length diverges on approaching the critical point, $\xi \to \infty$, the fluctuations also become slow. This suggests the following stochastic equation of motion for the magnetization¹²

$$\dot M(x,t) = -\lambda\,\frac{\delta F}{\delta M(x,t)} + r(x,t). \tag{8.3.25}$$

The first term in the equation of motion causes relaxation towards the minimum of the free-energy functional. This thermodynamic force becomes stronger as the gradient $\delta F/\delta M(x)$ increases. The coefficient $\lambda$ characterizes the relaxation rate analogously to $\Gamma$ in the Smoluchowski equation. Finally, r(x,t) is a stochastic force which is caused by the remaining degrees of freedom. Instead of a finite number of stochastic variables, we have here stochastic variables M(x,t) and r(x,t) which depend on a continuous index, the position x. Instead of M(x), we can also introduce its Fourier transform

$$M_k = \int d^dx\, e^{-ikx}\, M(x) \tag{8.3.26}$$

and likewise for r(x,t). Then the equation of motion (8.3.25) becomes

$$\dot M_k = -\lambda\,\frac{\partial F}{\partial M_{-k}} + r_k(t). \tag{8.3.25'}$$

Finally, we still have to specify the properties of the stochastic forces. Their average value is zero,

$$\langle r(x,t)\rangle = \langle r_k(t)\rangle = 0,$$

and furthermore they are correlated spatially and temporally only over short distances, which we can represent in idealized form by

$$\langle r_k(t)\, r_{k'}(t')\rangle = 2\lambda kT\,\delta_{k,-k'}\,\delta(t-t') \tag{8.3.27}$$

or

$$\langle r(x,t)\, r(x',t')\rangle = 2\lambda kT\,\delta(x-x')\,\delta(t-t'). \tag{8.3.27'}$$

For the mean square deviations of the force, we have postulated the Einstein relation, which guarantees that an equilibrium distribution is given by $e^{-\beta F[M]}$. We also assume that the probability density for the stochastic forces r(x,t) is a Gaussian distribution (cf. (8.1.26)). This has the result that the odd correlation functions for r(x,t) vanish and the even ones factor into products of (8.3.27′) (sum over all the pairwise contractions).

¹² Also called the TDGL = time-dependent Ginzburg–Landau model.

We will now investigate the equation of motion (8.3.25′) for $T > T_c$. In what follows, we use the Gaussian approximation, i.e. we neglect the anharmonic terms; then the equation of motion simplifies to

$$\dot M_k = -\lambda\left(a'(T-T_c) + ck^2\right)M_k + r_k. \tag{8.3.28}$$

Its solution is already familiar from the elementary theory of Brownian motion:

$$M_k(t) = e^{-\gamma_k t}M_k(0) + e^{-\gamma_k t}\int_0^t dt'\, r_k(t')\,e^{\gamma_k t'}, \tag{8.3.29}$$

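as is the resulting correlation function

$$\langle M_k(t)\,M_{k'}(t')\rangle = e^{-\gamma_k|t-t'|}\,\frac{\lambda kT}{\gamma_k}\,\delta_{k,-k'} + O\!\left(e^{-\gamma_k(t+t')}\right) \tag{8.3.30}$$

or, for times $t, t' \gg \gamma_k^{-1}$,

$$\langle M_k(t)\,M_{k'}(t')\rangle = \delta_{k,-k'}\,\frac{kT\,e^{-\gamma_k|t-t'|}}{a'(T-T_c)+ck^2}. \tag{8.3.31}$$

Here, we have introduced the relaxation rate

$$\gamma_k = \lambda\left(a'(T-T_c) + ck^2\right). \tag{8.3.32a}$$

In particular, for k = 0 we find

$$\gamma_0 \sim (T-T_c) \sim \xi^{-2}. \tag{8.3.32b}$$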
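The approach to the stationary value in (8.3.31) and the slowing down in (8.3.32a,b) can be illustrated without random numbers by propagating the variance of a single mode. The sketch below makes simplifying assumptions that are not in the text: the mode is treated as one real variable with noise strength 2λkT, all parameters are set to 1, and `t` stands for T − Tc.

```python
import math

def relaxation_rate(k, t, lam=1.0, a=1.0, c=1.0):
    """gamma_k = lam * (a*(T - Tc) + c*k**2), Eq. (8.3.32a); t stands for T - Tc."""
    return lam * (a * t + c * k**2)

def variance_after(time, k, t, kT=1.0, lam=1.0, a=1.0, c=1.0, var0=0.0, steps=1000):
    """Propagate <M_k^2> for one real mode obeying Eq. (8.3.28).
    Each step applies the exact variance update of a linear Langevin
    equation over the interval dt."""
    gamma = relaxation_rate(k, t, lam, a, c)
    dt = time / steps
    decay = math.exp(-2.0 * gamma * dt)
    var = var0
    for _ in range(steps):
        var = decay * var + (lam * kT / gamma) * (1.0 - decay)
    return var

# Stationary variance kT / (a*(T - Tc) + c*k**2), cf. Eq. (8.3.31) at t = t'.
var_late = variance_after(20.0, k=1.0, t=0.5)
# Critical slowing down of the k = 0 mode: halving T - Tc halves gamma_0.
rate_ratio = relaxation_rate(0.0, 0.01) / relaxation_rate(0.0, 0.02)
print(var_late, rate_ratio)
```

Close to Tc the same physical time covers fewer relaxation times 1/γ₀, so the k = 0 mode needs correspondingly longer to reach its stationary variance.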

As we suspected at the beginning, the relaxation rate decreases drastically on approaching the critical point. One denotes this situation as "critical slowing down". As we already know from Chap. 7, the interaction $bM^4$ between the critical fluctuations leads to a modification of the critical exponents, e.g. $\xi \to (T-T_c)^{-\nu}$. Likewise, in the framework of dynamic renormalization group theory it is seen that these interactions lead in the dynamics to

$$\gamma_0 \to \xi^{-z} \tag{8.3.33}$$

with a dynamic critical exponent z¹³ which differs from 2.

¹³ See e.g. F. Schwabl and U. C. Täuber, Encyclopedia of Applied Physics, Vol. 13, 343 (1995), VCH.


Remark: According to Eq. (8.3.25), the dynamics of the order parameter are relaxational. For isotropic ferromagnets, the magnetization is conserved and the coupled precessional motion of the magnetic moments leads to spin waves. In this case, the equations of motion are given by¹⁴

$$\dot{\mathbf M}(x,t) = -\lambda\,\mathbf M(x,t)\times\frac{\delta F}{\delta\mathbf M}(x,t) + \Gamma\nabla^2\frac{\delta F}{\delta\mathbf M}(x,t) + \mathbf r(x,t), \tag{8.3.34}$$

with

$$\langle \mathbf r(x,t)\rangle = 0, \tag{8.3.35}$$

$$\langle r_i(x,t)\,r_j(x',t')\rangle = -2\Gamma kT\,\nabla^2\delta^{(3)}(x-x')\,\delta(t-t')\,\delta_{ij}, \tag{8.3.36}$$

which leads to spin diffusion above the Curie temperature and to spin waves below it (cf. problem 8.9). The first term on the right-hand side of the equation of motion produces the precessional motion of the local magnetization M(x,t) around the local field $\delta F/\delta\mathbf M(x,t)$ at the point x. The second term gives rise to the damping. Since the magnetization is conserved, it is taken to be proportional to $\nabla^2$, i.e. in Fourier space it is proportional to $k^2$. These equations of motion are known as the Bloch equations or Landau–Lifshitz equations and, without the stochastic term, have been applied in solid-state physics since long before the advent of interest in critical dynamic phenomena. The stochastic force r(x,t) is due to the remaining, rapidly fluctuating degrees of freedom. The functional of the free energy is

$$F[\mathbf M(x,t)] = \frac12\int d^3x\,\Big[a'(T-T_c)\,\mathbf M^2(x,t) + \frac{b}{2}\,\mathbf M^4(x,t) + c\,(\nabla\mathbf M(x,t))^2 - \mathbf h\,\mathbf M(x,t)\Big]. \tag{8.3.37}$$

¹⁴ S. Ma and G. F. Mazenko, Phys. Rev. B11, 4077 (1975).

∗8.3.4 The Smoluchowski Equation and Supersymmetric Quantum Mechanics

8.3.4.1 The Eigenvalue Equation

In order to bring the Smoluchowski equation (8.2.14) ($V' \equiv \partial V/\partial x \equiv -F$)

$$\frac{\partial P}{\partial t} = \Gamma\,\frac{\partial}{\partial x}\left(kT\frac{\partial}{\partial x} + V'\right)P \tag{8.3.38}$$

into a form which contains only the second derivative with respect to x, we apply the Ansatz

$$P(x,t) = e^{-V(x)/2kT}\,\rho(x,t), \tag{8.3.39}$$

obtaining

$$\frac{\partial\rho}{\partial t} = kT\,\Gamma\left[\frac{\partial^2}{\partial x^2} + \frac{V''}{2kT} - \frac{V'^2}{4(kT)^2}\right]\rho. \tag{8.3.40}$$

This is a Schrödinger equation with an imaginary time,

$$i\,\frac{\partial\rho}{\partial(-i2kT\Gamma t)} = \left(-\frac12\frac{\partial^2}{\partial x^2} + V^0(x)\right)\rho, \tag{8.3.41}$$

with the potential

$$V^0(x) = \frac12\left[\frac{V'^2}{4(kT)^2} - \frac{V''}{2kT}\right]. \tag{8.3.42}$$

Following the separation of the variables

$$\rho(x,t) = e^{-2kT\Gamma E_n t}\,\varphi_n(x), \tag{8.3.43}$$

we obtain from Eq. (8.3.40) the eigenvalue equation

$$\frac12\,\varphi_n'' = \left(-E_n + V^0(x)\right)\varphi_n(x). \tag{8.3.44}$$

Formally, equation (8.3.44) is identical with a time-independent Schrödinger equation.¹⁵ In (8.3.43) and (8.3.44), we have numbered the eigenfunctions and eigenvalues which follow from (8.3.44) with the index n. The ground state of (8.3.44) is given by

$$\varphi_0 = N e^{-\frac{V}{2kT}}, \qquad E_0 = 0, \tag{8.3.45}$$

where N is a normalization factor. Inserting in (8.3.39), we find for P(x,t) the equilibrium distribution

$$P(x,t) = N e^{-V(x)/kT}. \tag{8.3.45'}$$

From (8.3.42), we can immediately see the connection with supersymmetric quantum mechanics. The supersymmetric partner¹⁶ to $V^0$ has the potential

$$V^1 = \frac12\left[\frac{V'^2}{4(kT)^2} + \frac{V''}{2kT}\right]. \tag{8.3.46}$$

¹⁵ N. G. van Kampen, J. Stat. Phys. 17, 71 (1977).
¹⁶ M. Bernstein and L. S. Brown, Phys. Rev. Lett. 52, 1933 (1984); F. Schwabl, QM I, Chap. 19, Springer 2005. The quantity $\Phi$ introduced there is connected to the ground state wavefunctions $\varphi_0$ and the potential V as follows: $\Phi = -\varphi_0'/\varphi_0 = V'/2kT$.
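The pair of potentials (8.3.42) and (8.3.46) can be evaluated for a concrete case. The sketch below uses the model potential V(x) = 2kT log(cosh x) of Problem 8.12 with kT = 1 (a choice made here for illustration) and checks two things: that the partner potential is constant, and that the ground state (8.3.45) satisfies the eigenvalue equation (8.3.44) with E₀ = 0. The function names and the finite-difference check are illustrative assumptions.

```python
import math

KT = 1.0  # thermal energy, set to 1 for the demonstration

def dV(x):
    """V'(x) for V(x) = 2 kT log(cosh x)."""
    return 2.0 * KT * math.tanh(x)

def d2V(x):
    """V''(x) for the same potential."""
    return 2.0 * KT / math.cosh(x) ** 2

def V0(x):
    """Potential of Eq. (8.3.42)."""
    return 0.5 * (dV(x) ** 2 / (4.0 * KT**2) - d2V(x) / (2.0 * KT))

def V1(x):
    """Supersymmetric partner potential, Eq. (8.3.46)."""
    return 0.5 * (dV(x) ** 2 / (4.0 * KT**2) + d2V(x) / (2.0 * KT))

def phi0(x):
    """Unnormalized ground state exp(-V/2kT) = 1/cosh(x), Eq. (8.3.45)."""
    return 1.0 / math.cosh(x)

def residual(x, h=1e-4):
    """Residual of (1/2) phi0'' = V0 * phi0 (E0 = 0), via central differences."""
    second = (phi0(x + h) - 2.0 * phi0(x) + phi0(x - h)) / h**2
    return 0.5 * second - V0(x) * phi0(x)

points = [-2.0, -0.7, 0.0, 0.3, 1.5]
print([V1(x) for x in points])
print([residual(x) for x in points])
```

For this potential V¹ equals 1/2 everywhere, so the partner problem is that of a free particle, which is the kind of simplification the text refers to.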


Fig. 8.6. The excitation spectra of the two Hamiltonians H 0 and H 1 , from QM I, pp. 353 and 361

The excitation spectra of the two Hamiltonians

$$H^{0,1} = -\frac12\frac{d^2}{dx^2} + V^{0,1}(x) \tag{8.3.47}$$

are related in the manner shown in Fig. 8.6. One can advantageously make use of this connection if the problem with $H^1$ is simpler to solve than that with $H^0$.

8.3.4.2 Relaxation towards Equilibrium

We can now solve the initial value problem for the Smoluchowski equation in general. Starting with an arbitrary normalized initial distribution P(x), we can calculate $\rho(x)$ and expand in the eigenfunctions of (8.3.44),

$$\rho(x) = e^{V(x)/2kT}P(x) = \sum_n c_n\varphi_n(x), \tag{8.3.48}$$

with the expansion coefficients

$$c_n = \int dx\, \varphi_n^*(x)\,e^{V(x)/2kT}P(x). \tag{8.3.49}$$

From (8.3.43), we find the time dependence

$$\rho(x,t) = \sum_n e^{-2kT\Gamma E_n t}\,c_n\varphi_n(x), \tag{8.3.50}$$

from which, with (8.3.39),

$$P(x,t) = e^{-V(x)/2kT}\sum_{n=0}^{\infty}c_n\, e^{-2kT\Gamma E_n t}\varphi_n(x) \tag{8.3.51}$$

follows. The normalized ground state has the form

$$\varphi_0 = \frac{e^{-V(x)/2kT}}{\left(\int dx\, e^{-V(x)/kT}\right)^{1/2}}. \tag{8.3.52}$$

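Therefore, the expansion coefficient $c_0$ is given by

$$c_0 = \int dx\, \varphi_0^*\, e^{V(x)/2kT}P(x) = \frac{\int dx\, P(x)}{\left(\int dx\, e^{-V(x)/kT}\right)^{1/2}} = \frac{1}{\left(\int dx\, e^{-V(x)/kT}\right)^{1/2}}. \tag{8.3.53}$$

This allows us to cast (8.3.51) in the form

$$P(x,t) = \frac{e^{-V(x)/kT}}{\int dx\, e^{-V(x)/kT}} + e^{-V(x)/2kT}\sum_{n=1}^{\infty}c_n\, e^{-2kT\Gamma E_n t}\varphi_n(x). \tag{8.3.54}$$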

With this, the initial-value problem for the Smoluchowski equation is solved in general. Since $E_n > 0$ for $n \ge 1$, it follows from this expansion that

$$\lim_{t\to\infty}P(x,t) = \frac{e^{-V(x)/kT}}{\int dx\, e^{-V(x)/kT}}, \tag{8.3.55}$$

which means that, starting from an arbitrary initial distribution, P(x,t) develops at long times towards the equilibrium distribution (8.3.45′) or (8.3.55).
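The relaxation statement (8.3.55) can also be checked by direct numerical integration of the Smoluchowski equation (8.3.38), independently of the eigenfunction expansion. The sketch below is an explicit finite-volume scheme under assumptions chosen here for illustration: a harmonic potential V(x) = x²/2, units with kT = Γ = 1, reflecting walls at x = ±5, and a sharply peaked initial distribution at x = 2. None of these choices come from the text.

```python
import math

def relax_smoluchowski(dV, x_min=-5.0, x_max=5.0, n=101, kT=1.0, gamma=1.0,
                       x_start=2.0, t_end=10.0, dt=0.002):
    """Explicit integration of dP/dt = -dj/dx with the current
    j = -gamma * (kT * dP/dx + V'(x) * P) and reflecting walls."""
    dx = (x_max - x_min) / (n - 1)
    xs = [x_min + i * dx for i in range(n)]
    # Start with all probability in the grid cell closest to x_start.
    p = [0.0] * n
    i0 = min(range(n), key=lambda i: abs(xs[i] - x_start))
    p[i0] = 1.0 / dx
    for _ in range(int(round(t_end / dt))):
        # j[i + 1] is the current between xs[i] and xs[i + 1];
        # j[0] and j[n] stay zero, so no probability leaves the interval.
        j = [0.0] * (n + 1)
        for i in range(n - 1):
            x_mid = 0.5 * (xs[i] + xs[i + 1])
            j[i + 1] = -gamma * (kT * (p[i + 1] - p[i]) / dx
                                 + dV(x_mid) * 0.5 * (p[i] + p[i + 1]))
        p = [p[i] - dt * (j[i + 1] - j[i]) / dx for i in range(n)]
    return xs, p

# Harmonic test potential V(x) = x**2 / 2, so that V'(x) = x.
xs, p_late = relax_smoluchowski(lambda x: x)
dx = xs[1] - xs[0]
weights = [math.exp(-0.5 * x * x) for x in xs]
z = sum(weights) * dx
p_eq = [w / z for w in weights]            # discrete version of Eq. (8.3.55)
max_dev = max(abs(a - b) for a, b in zip(p_late, p_eq))
norm = sum(p_late) * dx
print(max_dev, norm)
```

The time step is chosen small enough for the explicit scheme to remain stable on this grid; a finer grid would require a correspondingly smaller step.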

Literature

A. Einstein, Ann. d. Physik 17, 182 (1905); reprinted in Annalen der Physik 14, Supplementary Issue (2005).
R. Becker, Theorie der Wärme, 3. Aufl., Springer Verlag, Heidelberg 1985, Chap. 7; R. Becker, Theory of Heat, 2nd Ed., Springer, Berlin, Heidelberg, New York 1967.
H. Risken, The Fokker–Planck Equation, Springer Verlag, Heidelberg, 1984.
N. G. van Kampen, Stochastic Processes in Physics and Chemistry, North Holland, Amsterdam, 1981.

Problems for Chapter 8

8.1 Derive the generalized Fokker–Planck equation, (8.2.18).

8.2 A particle is moving with the step length l along the x-axis. Within each time step it hops to the right with the probability $p_+$ and to the left with the probability $p_-$ ($p_+ + p_- = 1$). How far is it from the starting point on the average after t time steps if $p_+ = p_- = 1/2$, or if $p_+ = 3/4$ and $p_- = 1/4$?

8.3 Diffusion and Heat Conductivity
(a) Solve the diffusion equation $\dot n = D\Delta n$ for d = 1, 2 and 3 dimensions with the initial condition $n(x, t=0) = N\delta^d(x)$. Here, n is the particle density, N the particle number, and D is the diffusion constant.
(b) Another form of the diffusion equation is the heat conduction equation

$$\Delta T = \frac{c\rho}{\kappa}\frac{\partial T}{\partial t},$$

where T is the temperature, $\kappa$ the coefficient of thermal conductivity, c the specific heat, and $\rho$ the density. Solve the following problem as an application: potatoes are stored at +5 °C in a broad trench which is covered with a loose layer of earth of thickness d. Right after they are covered, a cold period suddenly begins, with a steady temperature of −10 °C, and it lasts for two months. How thick does the earth layer have to be so that the potatoes will have cooled just to 0 °C at the end of the two months? Assume as an approximation that the same values hold for the earth and for the potatoes: $\kappa = 0.4\ \mathrm{W/(m\,K)}$, $c = 2000\ \mathrm{J/(kg\,K)}$, $\rho = 1000\ \mathrm{kg/m^3}$.

8.4 Consider the Langevin equation of an overdamped harmonic oscillator

$$\dot x(t) = -\Gamma x(t) + h(t) + r(t),$$

where h(t) is an external force and r(t) a stochastic force with the properties (8.1.25). Compute the correlation function

$$C(t,t') = \langle x(t)\,x(t')\rangle_{h=0},$$

the response function

$$\chi(t,t') = \frac{\delta\langle x(t)\rangle}{\delta h(t')},$$

and the Fourier transform of the response function.

8.5 Damped Oscillator
(a) Consider the damped harmonic oscillator

$$m\ddot x + m\zeta\dot x + m\omega_0^2 x = f(t)$$

with the stochastic force f(t) from Eq. (8.1.25). Calculate the correlation function and the dynamic susceptibility. Discuss in particular the position of the poles and the line shape. What changes relative to the limiting cases of the non-damped oscillator or the overdamped oscillator?
(b) Express the stationary solution $\langle x(t)\rangle$ under the action of a periodic external force $f_e(t) = f_0\cos\frac{2\pi}{T}t$ in terms of the dynamic susceptibility. Use it to compute the power dissipated, $\frac1T\int_0^T dt\, f_e(t)\,\langle\dot x(t)\rangle$.


8.6 Diverse physical systems can be described as a subsystem capable of oscillations that is coupled to a relaxing degree of freedom, whereby both systems are in contact with a heat bath (e.g. the propagation of sound waves in a medium in which chemical reactions are taking place, or the dynamics of phonons taking energy/heat diffusion into account). As a simple model, consider the following system of coupled equations:

$$\dot x = \frac{1}{m}\,p$$

$$\dot p = -m\omega_0^2 x - \Gamma p + b y + R(t)$$

$$\dot y = -\gamma y - \frac{b}{m}\,p + r(t).$$

Here, x and p describe the vibrational degrees of freedom (with the eigenfrequency $\omega_0$), and y is the relaxational degree of freedom. The subsystems are mutually linearly coupled with their coupling strength determined by the parameter b. The coupling to the heat bath is accomplished by the stochastic forces R and r for each subsystem, with the usual properties (vanishing of the average values and the Einstein relations), and the associated damping coefficients $\Gamma$ and $\gamma$.
(a) Calculate the dynamic susceptibility $\chi_x(\omega)$ for the vibrational degree of freedom.
(b) Discuss the expression obtained in the limiting case of $\gamma \to 0$, i.e. when the relaxation time of the relaxing system is very long.

8.7 An example of an application of the overdamped Langevin equation is an electrical circuit consisting of a capacitor of capacity C and a resistor of resistance R which is at the temperature T. The voltage drop $U_R$ over the resistor depends on the current I via $U_R = RI$, and the voltage $U_C$ over the capacitor is related to the capacitor charge Q via $U_C = Q/C$. On the average, the sum of the two voltages is zero, $U_R + U_C = 0$. In fact, the current results from the motion of many electrons, and collisions with the lattice ions and with phonons cause fluctuations which are modeled by a noise term $V_{th}$ in the voltage balance ($I = \dot Q$)

$$R\dot Q + \frac{1}{C}\,Q = V_{th}$$

or

$$\dot U_c + \frac{1}{RC}\,U_c = \frac{1}{RC}\,V_{th}.$$

(a) Assume the Einstein relation for the stochastic force and calculate the spectral distribution of the voltage fluctuations

$$\phi(\omega) = \int_{-\infty}^{\infty} dt\, e^{i\omega t}\,\langle U_c(t)\,U_c(0)\rangle.$$

(b) Compute

$$\langle U_c^2\rangle \equiv \langle U_c(t)\,U_c(t)\rangle \equiv \int_{-\infty}^{\infty} d\omega\, \phi(\omega)$$

and interpret the result,

$$\frac12\, C\,\langle U_c^2\rangle = \frac12\, kT.$$


8.8 In a generalization of problem 8.7, let the circuit now contain also a coil or inductor of self-inductance L with a voltage drop $U_L = L\dot I$. The equation of motion for the charge on the capacitor is

$$L\ddot Q + R\dot Q + \frac{1}{C}\,Q = V_{th}.$$

By again assuming the Einstein relation for the noise voltage $V_{th}$, calculate the spectral distribution for the current $\int_{-\infty}^{\infty} dt\, e^{i\omega t}\,\langle I(t)\,I(0)\rangle$.

8.9 Starting from the equations of motion for an isotropic ferromagnet (Eq. 8.3.34), investigate the ferromagnetic phase, in which

$$\mathbf M(x,t) = \hat{\mathbf e}_z M_0 + \delta\mathbf M(x,t)$$

holds.
(a) Linearize the equations of motion in $\delta\mathbf M(x,t)$, and determine the transverse and longitudinal excitations relative to the z-direction.
(b) Calculate the dynamic susceptibility

$$\chi_{ij}(k,\omega) = \int d^3x\, dt\, e^{-i(kx-\omega t)}\,\frac{\partial\langle M_i(x,t)\rangle}{\partial h_j(0,0)}$$

and the correlation function

$$G_{ij}(k,\omega) = \int d^3x\, dt\, e^{-i(kx-\omega t)}\,\langle\delta M_i(x,t)\,\delta M_j(0,0)\rangle.$$

8.10 Solve the Smoluchowski equation

$$\frac{\partial P(x,t)}{\partial t} = \Gamma\,\frac{\partial}{\partial x}\left(kT\frac{\partial}{\partial x} + \frac{\partial V(x)}{\partial x}\right)P(x,t)$$

for an harmonic potential and an inverted harmonic potential $V(x) = \pm\frac{m\omega^2}{2}x^2$, by solving the corresponding eigenvalue problem.

8.11 Justify the Ansatz of Eq. (8.3.39) and carry out the rearrangement to give Eq. (8.3.40).

8.12 Solve the Smoluchowski equation for the model potential $V(x) = 2kT\log(\cosh x)$ using supersymmetric quantum mechanics, by transforming as in Chapter 8.3.4 to a Schrödinger equation. (Literature: F. Schwabl, Quantum Mechanics, 3rd ed., Chap. 19 (Springer Verlag, Heidelberg, New York, corrected printing 2005).)

8.13 Stock-market prices as a stochastic process. Assume that the logarithm $l(t) = \log S(t)$ of the price S(t) of a stock obeys the Langevin equation (on a sufficiently rough time scale)

$$\frac{d}{dt}\,l(t) = r + \Gamma(t),$$

where r is a constant and $\Gamma$ is a Gaussian "random force" with $\langle\Gamma(t)\Gamma(t')\rangle = \sigma^2\delta(t-t')$.
(a) Explain this approach. Hints: What does the assumption that prices in the future cannot be predicted from the price trends in the past imply? Think first of a process which is discrete in time (e.g. the time dependence of the daily closing rates). Should the transition probability more correctly be a function of the price difference or of the price ratio?
(b) Write the Fokker–Planck equation for l, and based on it, the equation for S.
(c) What is the expectation value for the market price at the time t, when the stock is being traded at the price $S_0$ at time $t_0 = 0$? Hint: Solve the Fokker–Planck equation for $l = \log S$.

9. The Boltzmann Equation

9.1 Introduction

In the Langevin equation (Chap. 8), irreversibility was introduced phenomenologically through a damping term. Kinetic theories have the goal of explaining and quantitatively calculating transport processes and dissipative effects due to scattering of the atoms (or in a solid, of the quasiparticles). The object of these theories is the single-particle distribution function, whose time development is determined by the kinetic equation.

In this chapter, we will deal with a monatomic classical gas consisting of particles of mass m; we thus presume that the thermal wavelength $\lambda_T = 2\pi\hbar/\sqrt{2\pi mkT}$ and the volume per particle $v = n^{-1}$ obey the inequality $\lambda_T \ll n^{-1/3}$, i.e. the wavepackets are so strongly localized that the atoms can be treated classically. Further characteristic quantities which enter include the duration of a collision $\tau_c$ and the collision time $\tau$ (this is the mean time between two collisions of an atom; see (9.2.12)). We have $\tau_c \approx r_c/\bar v$ and $\tau \approx 1/(n r_c^2\bar v)$, where $r_c$ is the range of the potentials and $\bar v$ is the average velocity of the particles. In order to be able to consider independent two-particle collisions, we need the additional condition $\tau_c \ll \tau$, i.e. the duration of a collision is short in comparison to the collision time. This condition is fulfilled in the low-density limit, $r_c \ll n^{-1/3}$. Then collisions of more than two particles can be neglected. The kinetic equation which describes the case of a dilute gas considered here is the Boltzmann equation¹. The Boltzmann equation is one of the most fundamental equations of non-equilibrium statistical mechanics and is applied in areas far beyond the case of the dilute gas².

¹ Ludwig Boltzmann, Wien. Ber. 66, 275 (1872); Vorlesungen über Gastheorie, Leipzig, 1896; Lectures on Gas Theory, translated by S. Brush, University of California Press, Berkeley, 1964.
² See e.g. J. M. Ziman, Principles of the Theory of Solids, 2nd ed., Cambridge Univ. Press, Cambridge, 1972.
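The inequalities of the introduction can be made concrete with a rough numerical estimate. The sketch below is illustrative only: the choice of argon at 300 K and 1 atm, the ideal-gas estimate of the density, and the potential range of 0.34 nm are assumptions made here, not values from the text.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
HBAR = 1.054571817e-34    # reduced Planck constant, J s
AMU = 1.66053906660e-27   # atomic mass unit, kg

def gas_scales(mass, temperature, pressure, r_c):
    """Length and time scales entering the validity conditions of Sect. 9.1."""
    kT = K_B * temperature
    n = pressure / kT                      # ideal-gas estimate of the number density
    v_bar = math.sqrt(kT / mass)           # velocity scale sqrt(kT/m)
    return {
        "lambda_T": 2.0 * math.pi * HBAR / math.sqrt(2.0 * math.pi * mass * kT),
        "spacing": n ** (-1.0 / 3.0),      # mean interparticle distance n^(-1/3)
        "tau_c": r_c / v_bar,              # duration of a collision
        "tau": 1.0 / (n * r_c**2 * v_bar), # collision time
    }

# Assumed illustrative input: argon, 300 K, 1 atm, potential range 0.34 nm.
R_C = 3.4e-10
scales = gas_scales(mass=39.948 * AMU, temperature=300.0, pressure=101325.0, r_c=R_C)
for name, value in scales.items():
    print(name, value)
```

With these inputs the thermal wavelength is more than two orders of magnitude below the mean particle spacing, and the ratio τc/τ equals n·rc³, which is of order 10⁻³.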


In this chapter we will introduce the Boltzmann equation using the classical derivation of Boltzmann¹. Next, we discuss some fundamental questions relating to irreversibility based on the H theorem. As an application of the Boltzmann equation we then determine the hydrodynamic equations and their eigenmodes (sound, heat diffusion). The transport coefficients are derived systematically from the linearized Boltzmann equation using its eigenmodes and eigenfrequencies.

9.2 Derivation of the Boltzmann Equation

We presume that only one species of atoms is present. For these atoms, we seek the equation of motion of the single-particle distribution function.

Definition: The single-particle distribution function $f(\mathbf x,\mathbf v,t)$ is defined by

$f(\mathbf x,\mathbf v,t)\,d^3x\,d^3v$ = the number of particles which are found at time t in the volume element $d^3x$ around the point $\mathbf x$ and $d^3v$ around the velocity $\mathbf v$.

$$\int d^3x\, d^3v\, f(\mathbf x,\mathbf v,t) = N. \tag{9.2.1}$$

The single-particle distribution function $f(\mathbf x,\mathbf v,t)$ is related to the N-particle distribution function $\rho(\mathbf x_1,\mathbf v_1,\ldots,\mathbf x_N,\mathbf v_N,t)$ (Eq. (2.3.1)) through

$$f(\mathbf x_1,\mathbf v_1,t) = N\int d^3x_2\, d^3v_2 \ldots d^3x_N\, d^3v_N\, \rho(\mathbf x_1,\mathbf v_1,\ldots,\mathbf x_N,\mathbf v_N,t).$$

Remarks:
1. In the kinetic theory, one usually takes the velocity as variable instead of the momentum, $\mathbf v = \mathbf p/m$.
2. The 6-dimensional space generated by $\mathbf x$ and $\mathbf v$ is called µ space.
3. The volume elements $d^3x$ and $d^3v$ are supposed to be of small linear dimensions compared to the macroscopic scale or to the mean velocity $\bar v = \sqrt{kT/m}$, but large compared to the microscopic scale, so that many particles are to be found within each element. In a gas under standard conditions (T = 1 °C, P = 1 atm), the number of molecules per cm³ is $n = 3\times10^{19}$. In a cube of edge length $10^{-3}$ cm, i.e. a volume element of the size $d^3x = 10^{-9}\ \mathrm{cm}^3$, which for all experimental purposes can be considered to be pointlike, there are still $3\times10^{10}$ molecules. If we choose $d^3v \approx 10^{-6}\times\bar v^3$, then from the Maxwell distribution

$$f^0(\mathbf v) = n\left(\frac{m}{2\pi kT}\right)^{3/2} e^{-\frac{m\mathbf v^2}{2kT}},$$

in this element of µ space, there are $f^0\,d^3x\,d^3v \approx 10^4$ molecules.

Fig. 9.1. Deformation of a volume element in µ space during the time interval dt.

To derive the Boltzmann equation, we follow the motion of a volume element in µ space during the time interval [t, t + dt]; cf. Fig. 9.1. Since those particles with a higher velocity move more rapidly, the volume element is deformed in the course of time. However, the consideration of the sizes of the two parallelepipeds³ yields

$$d^3x'\,d^3v' = d^3x\,d^3v. \tag{9.2.2}$$
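The equality (9.2.2) can be illustrated in one space dimension by computing the Jacobian determinant of the streaming map numerically. This is a sketch under assumptions made here: a harmonic force F(x) = −x, unit mass, and a finite-difference Jacobian. It shows that the determinant differs from 1 only in second order in dt, which is the sense in which the volume element is preserved for an infinitesimal time step.

```python
def step(x, v, dt, force, mass=1.0):
    """One streaming step (x, v) -> (x + v dt, v + F(x) dt / m) in one dimension."""
    return x + v * dt, v + force(x) * dt / mass

def jacobian_determinant(x, v, dt, force, mass=1.0, eps=1e-6):
    """Determinant of d(x', v')/d(x, v), estimated by central differences."""
    xp, vp = step(x + eps, v, dt, force, mass)
    xm, vm = step(x - eps, v, dt, force, mass)
    dxdx, dvdx = (xp - xm) / (2 * eps), (vp - vm) / (2 * eps)
    xp, vp = step(x, v + eps, dt, force, mass)
    xm, vm = step(x, v - eps, dt, force, mass)
    dxdv, dvdv = (xp - xm) / (2 * eps), (vp - vm) / (2 * eps)
    return dxdx * dvdv - dxdv * dvdx

def harmonic(x):
    """Illustrative force F(x) = -x (an assumption for the demonstration)."""
    return -x

det_small = jacobian_determinant(0.3, 0.7, 1e-3, harmonic)
det_large = jacobian_determinant(0.3, 0.7, 1e-2, harmonic)
print(det_small - 1.0, det_large - 1.0)
```

Increasing dt by a factor of 10 increases the deviation of the determinant from 1 by a factor of about 100, as expected for a term of order dt².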

The number of particles at the time t in $d^3x\,d^3v$ is $f(\mathbf x,\mathbf v,t)\,d^3x\,d^3v$, and the number of particles in the volume element which develops after the time interval dt is $f(\mathbf x+\mathbf v\,dt, \mathbf v+\frac1m\mathbf F\,dt, t+dt)\,d^3x'\,d^3v'$. If the gas atoms were collision-free, these two numbers would be the same. A change in these particle numbers can only occur through collisions. We thus obtain

$$\left[f\!\left(\mathbf x+\mathbf v\,dt, \mathbf v+\tfrac1m\mathbf F\,dt, t+dt\right) - f(\mathbf x,\mathbf v,t)\right]d^3x\,d^3v = \left(\frac{\partial f}{\partial t}\right)_{\rm coll} dt\,d^3x\,d^3v, \tag{9.2.3}$$

i.e. the change in the particle number is equal to its change due to collisions. The expansion of this balance equation yields

$$\left[\frac{\partial}{\partial t} + \mathbf v\nabla_x + \frac1m\mathbf F(\mathbf x)\nabla_v\right]f(\mathbf x,\mathbf v,t) = \left(\frac{\partial f}{\partial t}\right)_{\rm coll}. \tag{9.2.4}$$

The left side of this equation is termed the flow term⁴. The collision term $\left(\frac{\partial f}{\partial t}\right)_{\rm coll}$ can be represented as the difference of gain and loss processes:

$$\left(\frac{\partial f}{\partial t}\right)_{\rm coll} = g - l. \tag{9.2.5}$$

³ The result obtained here from geometric considerations can also be derived by using Liouville's theorem (L. D. Landau and E. M. Lifshitz, Course of Theoretical Physics, Vol. I: Mechanics, Pergamon Press, Oxford 1960, Eq. (4.6.5)).
⁴ In Remark (i), p. 441, the flow term is derived in a different way.

Here, $g\,d^3x\,d^3v\,dt$ is the number of particles which are scattered during the time interval dt into the volume $d^3x\,d^3v$ by collisions, and $l\,d^3x\,d^3v\,dt$ is the number which are scattered out, i.e. the number of collisions in the volume element $d^3x$ in which one of the two collision partners had the velocity $\mathbf v$ before the collision. We assume here that the volume element $d^3v$ is so small in velocity space that every collision leads out of this volume element. The following expression for the collision term is Boltzmann's celebrated Stosszahlansatz (assumption regarding the number of collisions):

$$\left(\frac{\partial f}{\partial t}\right)_{\rm coll} = \int d^3v_2\,d^3v_3\,d^3v_4\, W(\mathbf v,\mathbf v_2;\mathbf v_3,\mathbf v_4)\left[f(\mathbf x,\mathbf v_3,t)f(\mathbf x,\mathbf v_4,t) - f(\mathbf x,\mathbf v,t)f(\mathbf x,\mathbf v_2,t)\right]. \tag{9.2.6}$$

Fig. 9.2. Gain and loss processes, g and l

Here, $W(\mathbf v,\mathbf v_2;\mathbf v_3,\mathbf v_4)$ refers to the transition probability $\mathbf v,\mathbf v_2 \to \mathbf v_3,\mathbf v_4$, i.e. the probability that in a collision two particles with the velocities $\mathbf v$ and $\mathbf v_2$ will have the velocities $\mathbf v_3$ and $\mathbf v_4$ afterwards. The number of collisions which lead out of the volume element considered is proportional to the number of particles with the velocity $\mathbf v$ and the number of particles with velocity $\mathbf v_2$, and proportional to $W(\mathbf v,\mathbf v_2;\mathbf v_3,\mathbf v_4)$; a sum is carried out over all values of $\mathbf v_2$ and of the final velocities $\mathbf v_3$ and $\mathbf v_4$. The number of collisions in which an additional particle is in the volume element $d^3v$ after the collision is given by the number of particles with the velocities $\mathbf v_3$ and $\mathbf v_4$ whose collision yields a particle with the velocity $\mathbf v$. Here, the transition probability $W(\mathbf v_3,\mathbf v_4;\mathbf v,\mathbf v_2)$ has been expressed with the help of (9.2.8e).

The Stosszahlansatz (9.2.6), together with the balance equation (9.2.4), yields the Boltzmann equation

$$\left[\frac{\partial}{\partial t} + \mathbf v\nabla_x + \frac1m\mathbf F(\mathbf x)\nabla_v\right]f(\mathbf x,\mathbf v,t) = \int d^3v_2\,d^3v_3\,d^3v_4\, W(\mathbf v,\mathbf v_2;\mathbf v_3,\mathbf v_4)\left[f(\mathbf x,\mathbf v_3,t)f(\mathbf x,\mathbf v_4,t) - f(\mathbf x,\mathbf v,t)f(\mathbf x,\mathbf v_2,t)\right]. \tag{9.2.7}$$

It is a nonlinear integro-differential equation. The transition probability W has the following symmetry properties:

• Invariance under particle exchange:

$$W(\mathbf v,\mathbf v_2;\mathbf v_3,\mathbf v_4) = W(\mathbf v_2,\mathbf v;\mathbf v_4,\mathbf v_3). \tag{9.2.8a}$$


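• Rotational and reflection invariance: with an orthogonal matrix $D$ we have

$$W(D\mathbf{v},D\mathbf{v}_2;D\mathbf{v}_3,D\mathbf{v}_4) = W(\mathbf{v},\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4) . \tag{9.2.8b}$$

This relation also contains inversion symmetry:

$$W(-\mathbf{v},-\mathbf{v}_2;-\mathbf{v}_3,-\mathbf{v}_4) = W(\mathbf{v},\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4) . \tag{9.2.8c}$$

• Time-reversal invariance:

$$W(\mathbf{v},\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4) = W(-\mathbf{v}_3,-\mathbf{v}_4;-\mathbf{v},-\mathbf{v}_2) . \tag{9.2.8d}$$

The combination of inversion and time reversal yields the relation which we have already used in (9.2.6) for $\left(\frac{\partial f}{\partial t}\right)_{\rm coll}$:

$$W(\mathbf{v}_3,\mathbf{v}_4;\mathbf{v},\mathbf{v}_2) = W(\mathbf{v},\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4) . \tag{9.2.8e}$$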

From the conservation of momentum and energy, it follows that

$$W(\mathbf{v}_1,\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4) = \sigma(\mathbf{v}_1,\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4)\,\delta^{(3)}(\mathbf{p}_1+\mathbf{p}_2-\mathbf{p}_3-\mathbf{p}_4)\;\delta\!\left(\frac{p_1^2}{2m}+\frac{p_2^2}{2m}-\frac{p_3^2}{2m}-\frac{p_4^2}{2m}\right) , \tag{9.2.8f}$$

as one can see explicitly from the microscopic calculation of the two-particle collision in Eq. (9.5.21). The form of the scattering cross-section $\sigma$ depends on the interaction potential between the particles. For all the general, fundamental results of the Boltzmann equation, the exact form of $\sigma$ is not important. As an explicit example, we calculate $\sigma$ for the interaction potential of hard spheres (Eq. (9.5.15)) and for a potential which falls off algebraically (problem 9.15, Eq. (9.5.29)). To simplify the notation, in the following we shall frequently use the abbreviations

$$f_1 \equiv f(\mathbf{x},\mathbf{v}_1,t) \ \text{with}\ \mathbf{v}_1 = \mathbf{v}, \quad f_2 \equiv f(\mathbf{x},\mathbf{v}_2,t), \quad f_3 \equiv f(\mathbf{x},\mathbf{v}_3,t), \quad\text{and}\quad f_4 \equiv f(\mathbf{x},\mathbf{v}_4,t) . \tag{9.2.9}$$

Remarks:

(i) The flow term in the Boltzmann equation can also be derived by setting up an equation of continuity for the fictitious case of collision-free, non-interacting gas atoms. To do this, we introduce the six-dimensional velocity vector

$$\mathbf{w} = (\mathbf{v}, \dot{\mathbf{v}}) = \left(\dot{\mathbf{x}}, \frac{\mathbf{F}}{m}\right) \tag{9.2.10}$$

and the current density $\mathbf{w} f(\mathbf{x},\mathbf{v},t)$. For a collision-free gas, $f$ fulfills the equation of continuity

$$\frac{\partial f}{\partial t} + \operatorname{div}\,\mathbf{w} f = 0 . \tag{9.2.11}$$

Using Hamilton's equations of motion, Eq. (9.2.11) takes on the form

$$\left[\frac{\partial}{\partial t} + \mathbf{v}\nabla_x + \frac{1}{m}\mathbf{F}(\mathbf{x})\nabla_v\right] f(\mathbf{x},\mathbf{v},t) = 0 \tag{9.2.11$'$}$$

of the flow term in Eqns. (9.2.4) and (9.2.7).

(ii) With a collision term of the form (9.2.6), the presence of correlations between two particles has been neglected. It is assumed that at each instant the numbers of particles with velocities $\mathbf{v}_3$ and $\mathbf{v}_4$, or $\mathbf{v}$ and $\mathbf{v}_2$, are uncorrelated, an assumption which is also referred to as molecular chaos. A statistical element is introduced here. As a justification, one can say that in a gas of low density, a binary collision between two molecules which had already interacted either directly or indirectly through a common set of molecules is extremely improbable. In fact, molecules which collide come from quite different places within the gas and previously underwent collisions with completely different molecules, and are thus quite uncorrelated. The assumption of molecular chaos is required only for the particles before a collision. After a collision, the two particles are correlated (they move apart in such a manner that if all motions were reversed, they would again collide); however, this does not enter into the equation.

It is possible to derive the Boltzmann equation approximately from the Liouville equation. To this end, one derives from the latter the equations of motion for the single-, two-, etc. particle distribution functions. The structure of these equations, which is also called the BBGKY (Bogoliubov, Born, Green, Kirkwood, Yvon) hierarchy, is such that the equation of motion for the $r$-particle distribution function ($r = 1, 2, \ldots$) also contains the $(r+1)$-particle distribution function$^5$. In particular, the equation of motion for the single-particle distribution function $f(\mathbf{x},\mathbf{v},t)$ has the form of the left side of the Boltzmann equation. The right side, however, contains $f_2$, the two-particle distribution function, and thus includes correlations between the particles. Only by an approximate treatment, i.e. by truncating the equation of motion for $f_2$ itself, does one obtain an expression which is identical with the collision term of the Boltzmann equation$^6$.

It should be mentioned that terms beyond those in the Boltzmann equation lead to phenomena which do not exhibit the usual exponential decay in their relaxation behavior, but instead show a much slower, algebraic behavior; these time dependences are called "long time tails". Considered microscopically, they result from so-called ring collisions; see the reference by J. A. McLennan at the end of this chapter. Quantitatively, these effects are in reality immeasurably small; up to now, they have been observed only in computer experiments. In this sense, they share the fate of the deviations from exponential decay of excited quantum levels which occur in quantum mechanics.

(iii) To calculate the collision time $\tau$, we imagine a cylinder whose length is equal to the distance which a particle with thermal velocity travels in unit time, and whose basal area is equal to the total scattering cross-section. An atom with a thermal velocity passes through this cylinder in a unit time and collides with all the other atoms within the cylinder. The number of atoms within the cylinder, and thus the number of collisions of an atom per second, is $\sigma_{\rm tot}\bar{v}n$, and it follows that the mean collision time is

$$\tau = \frac{1}{\sigma_{\rm tot}\,\bar{v}\,n} . \tag{9.2.12}$$

$^5$ The $r$-particle distribution function is obtained from the $N$-particle distribution function by means of $f_r(\mathbf{x}_1,\mathbf{v}_1,\ldots,\mathbf{x}_r,\mathbf{v}_r,t) \equiv \frac{N!}{(N-r)!}\int d^3x_{r+1}\,d^3v_{r+1}\cdots d^3x_N\,d^3v_N\;\rho(\mathbf{x}_1,\mathbf{v}_1,\ldots,\mathbf{x}_N,\mathbf{v}_N,t)$. The combinatorial factor results from the fact that it is not important which of the particles is at the $\mu$-space positions $\mathbf{x}_1,\mathbf{v}_1,\ldots$.

$^6$ See references at the end of this chapter, e.g. K. Huang, S. Harris.


The mean free path $l$ is defined as the distance which an atom typically travels between two successive collisions; it is given by

$$l \equiv \bar{v}\tau = \frac{1}{\sigma_{\rm tot}\,n} . \tag{9.2.13}$$

(iv) Estimates of the lengths and times which play a role in setting up the Boltzmann equation: the range $r_c$ of the potential must be so short that collisions occur between only those molecules which are within the same volume element $d^3x$: $r_c \ll dx$. This inequality is obeyed for the numerical example $r_c \approx 10^{-8}$ cm, $dx = 10^{-3}$ cm. With $\bar{v} \approx 10^5$ cm/sec, we obtain for the time during which the particle is within $d^3x$ the value

$$\tau_{d^3x} \approx \frac{10^{-3}\,\text{cm}}{10^{5}\,\text{cm/sec}} \approx 10^{-8}\ \text{sec} .$$

The duration of a collision is

$$\tau_c \approx \frac{10^{-8}\,\text{cm}}{10^{5}\,\text{cm/sec}} \approx 10^{-13}\ \text{sec} ,$$

and the collision time is

$$\tau \approx (r_c^2\, n\, \bar{v})^{-1} \approx \left(10^{-16}\,\text{cm}^2 \times 3\times 10^{19}\,\text{cm}^{-3} \times 10^{5}\,\text{cm sec}^{-1}\right)^{-1} \approx 3\times 10^{-9}\ \text{sec} .$$
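The estimates of remarks (iii) and (iv) can be reproduced with a few lines of arithmetic. The following is only a sketch: it takes the numerical values quoted in the text and assumes, as the text does for the order of magnitude, that the total cross-section is $\sigma_{\rm tot} \approx r_c^2$; the variable names are chosen here for illustration.

```python
# Order-of-magnitude estimates from remarks (iii) and (iv), in CGS units.
r_c = 1e-8      # range of the potential in cm (value quoted in the text)
dx = 1e-3       # linear size of the volume element in cm (text)
v_bar = 1e5     # thermal velocity in cm/s (text)
n = 3e19        # particle-number density in cm^-3 (text)

sigma_tot = r_c ** 2                   # assumption: sigma_tot ~ r_c^2
tau = 1.0 / (sigma_tot * v_bar * n)    # collision time, Eq. (9.2.12)
mean_free_path = v_bar * tau           # mean free path, Eq. (9.2.13)
tau_c = r_c / v_bar                    # duration of a single collision
tau_dx = dx / v_bar                    # time spent inside the volume element

print(f"collision time        tau    = {tau:.1e} s")
print(f"mean free path        l      = {mean_free_path:.1e} cm")
print(f"collision duration    tau_c  = {tau_c:.1e} s")
print(f"time within d^3x      tau_dx = {tau_dx:.1e} s")
```

With these inputs the separation of scales $\tau_c \ll \tau$ and $r_c \ll l$ is directly visible, which is the condition under which collisions may be treated as local, instantaneous binary events.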

9.3 Consequences of the Boltzmann Equation

9.3.1 The H-Theorem$^7$ and Irreversibility

The goal of this section is to show that the Boltzmann equation exhibits irreversible behavior, and that the distribution function tends towards the Maxwell distribution. To do this, Boltzmann introduced the quantity $H$, which is related to the negative of the entropy:

$$H(\mathbf{x},t) = \int d^3v\; f(\mathbf{x},\mathbf{v},t)\,\log f(\mathbf{x},\mathbf{v},t) . \tag{9.3.1}$$

For its time derivative, one obtains from the Boltzmann equation (9.2.7)

$$\begin{aligned}
\dot{H}(\mathbf{x},t) &= \int d^3v\,(1+\log f)\,\dot{f} \\
&= -\int d^3v\,(1+\log f)\left(\mathbf{v}\nabla_x + \frac{1}{m}\mathbf{F}\nabla_v\right) f - I \\
&= -\nabla_x \int d^3v\,(f\log f)\,\mathbf{v} - I .
\end{aligned} \tag{9.3.2}$$

The second term in the large brackets in the second line is proportional to $\int d^3v\,\nabla_v (f\log f)$ and vanishes, since there are no particles with infinite velocities, i.e. $f \to 0$ for $v \to \infty$.

$^7$ Occasionally, the rumor makes the rounds that according to Boltzmann, this should actually be called the Eta-Theorem. In fact, Boltzmann himself (1872) used E (for entropy), and only later (S. H. Burbury, 1890) was the Roman letter H adopted (D. Flamm, private communication, and S. G. Brush, Kinetic Theory, Vol. 2, p. 6, Pergamon Press, Oxford, 1966).


The contribution of the collision term

$$I = \int d^3v_1\,d^3v_2\,d^3v_3\,d^3v_4\; W(\mathbf{v}_1,\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4)\,(f_1 f_2 - f_3 f_4)\,(1+\log f_1) \tag{9.3.3}$$

is found by making use of the invariance of $W$ with respect to the exchanges $1,3 \leftrightarrow 2,4$ and $1,2 \leftrightarrow 3,4$ to be

$$I = \frac{1}{4}\int d^3v_1\,d^3v_2\,d^3v_3\,d^3v_4\; W(\mathbf{v}_1,\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4)\,(f_1 f_2 - f_3 f_4)\,\log\frac{f_1 f_2}{f_3 f_4} . \tag{9.3.4}$$

The rearrangement which leads from (9.3.3) to (9.3.4) is a special case of the general identity

$$\int d^3v_1\,d^3v_2\,d^3v_3\,d^3v_4\; W(\mathbf{v}_1,\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4)\,(f_1 f_2 - f_3 f_4)\,\varphi_1 = \frac{1}{4}\int d^3v_1\,d^3v_2\,d^3v_3\,d^3v_4\; W(\mathbf{v}_1,\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4)\,(f_1 f_2 - f_3 f_4)\,(\varphi_1+\varphi_2-\varphi_3-\varphi_4) , \tag{9.3.5}$$

which follows from the symmetry relations (9.2.8), and where $\varphi_i = \varphi(\mathbf{x},\mathbf{v}_i,t)$ (problem 9.1). From the inequality $(x-y)\log\frac{x}{y} \ge 0$, it follows that

$$I \ge 0 . \tag{9.3.6}$$
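The step from (9.3.3) to (9.3.4) and the sign of $I$ can be checked numerically on a toy model. The sketch below is an illustration only and is not taken from the text: the velocity integrals are replaced by sums over a handful of hypothetical discrete states, the conservation laws in (9.2.8f) are ignored, and the rates `W` are random numbers that are merely symmetrized so as to obey (9.2.8a) and (9.2.8e).

```python
import itertools
import math
import random

random.seed(0)
n_states = 4  # hypothetical number of discrete velocity states

# Arbitrary positive occupation values f_1 ... f_n.
f = [random.uniform(0.1, 2.0) for _ in range(n_states)]

# Random non-negative base rates for every index quadruple.
base = {idx: random.random()
        for idx in itertools.product(range(n_states), repeat=4)}

def W(a, b, c, d):
    """Rate symmetrized under (a,b;c,d) -> (b,a;d,c), cf. (9.2.8a),
    and under (a,b;c,d) -> (c,d;a,b), cf. (9.2.8e)."""
    return (base[a, b, c, d] + base[b, a, d, c]
            + base[c, d, a, b] + base[d, c, b, a])

I_933 = 0.0  # discrete analogue of Eq. (9.3.3)
I_934 = 0.0  # discrete analogue of Eq. (9.3.4)
for a, b, c, d in itertools.product(range(n_states), repeat=4):
    diff = f[a] * f[b] - f[c] * f[d]
    I_933 += W(a, b, c, d) * diff * (1.0 + math.log(f[a]))
    I_934 += 0.25 * W(a, b, c, d) * diff * math.log(f[a] * f[b] / (f[c] * f[d]))

print("I from (9.3.3):", I_933)
print("I from (9.3.4):", I_934)
```

Both sums agree to rounding error, and the second form makes the sign manifest, since every term is of the form $(x-y)\log(x/y)$ multiplied by a non-negative rate.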

The time derivative of $H$, Eq. (9.3.2), can be written in the form

$$\dot{H}(\mathbf{x},t) = -\nabla_x\,\mathbf{j}_H(\mathbf{x},t) - I , \tag{9.3.7}$$

where

$$\mathbf{j}_H = \int d^3v\; f\log f\;\mathbf{v} \tag{9.3.8}$$

is the associated current density. The first term on the right-hand side of (9.3.7) gives the change in $H$ due to the entropy flow and the second gives its change due to entropy production.

Discussion:

a) If no external forces are present, $\mathbf{F}(\mathbf{x}) = 0$, then the simplified situation may occur that $f(\mathbf{x},\mathbf{v},t) = f(\mathbf{v},t)$ is independent of $\mathbf{x}$. Since the Boltzmann equation then contains no $\mathbf{x}$-dependence, $f$ remains independent of position for all times and it follows from (9.3.7), since $\nabla_x\,\mathbf{j}_H(\mathbf{x},t) = 0$, that

$$\dot{H} = -I \le 0 . \tag{9.3.9}$$


The quantity $H$ decreases and tends towards a minimum, which is finite, since the function $f\log f$ has a lower bound, and the integral over $\mathbf{v}$ exists.$^8$ At the minimum, the equals sign holds in (9.3.9). In Sect. 9.3.3, we show that at the minimum, $f$ becomes the Maxwell distribution

$$f^0(\mathbf{v}) = n\left(\frac{m}{2\pi kT}\right)^{3/2} e^{-\frac{m\mathbf{v}^2}{2kT}} . \tag{9.3.10}$$

b) When $\mathbf{F}(\mathbf{x}) \neq 0$, and we are dealing with a closed system of volume $V$, then

$$\int_V d^3x\;\nabla_x\,\mathbf{j}_H(\mathbf{x},t) = \int_{O(V)} d\mathbf{O}\;\mathbf{j}_H(\mathbf{x},t) = 0$$

holds. The flux of $H$ through the surface of this volume vanishes if the surface is an ideal reflector; then for each contribution $\mathbf{v}\,d\mathbf{O}$ there is a corresponding contribution $-\mathbf{v}\,d\mathbf{O}$, and it follows that

$$\frac{d}{dt}H_{\rm tot} \equiv \frac{d}{dt}\int_V d^3x\; H(\mathbf{x},t) = -\int_V d^3x\; I \le 0 . \tag{9.3.11}$$

$H_{\rm tot}$ decreases; we have irreversibility. The fact that irreversibility follows from an equation derived from Newtonian mechanics, which itself is time-reversal invariant, was met at first with skepticism. However, the Stosszahlansatz contains a probabilistic element, as we will demonstrate in detail following Eq. (9.3.14).

As already mentioned, $H$ is closely connected with the entropy. The calculation of $H$ for the equilibrium distribution $f^0(\mathbf{v})$ for an ideal gas yields (see problem 9.3)

$$H = n\left[\log\left(n\left(\frac{m}{2\pi kT}\right)^{3/2}\right) - \frac{3}{2}\right] .$$

The total entropy $S$ of the ideal gas (Eq. (2.7.27)) is thus

$$S = -VkH - kN\left(3\log\frac{2\pi\hbar}{m} - 1\right) . \tag{9.3.12a}$$

Here, $\hbar$ is Planck's quantum of action. Expressed locally, the relation between the entropy per unit volume, $H$, and the particle number density $n$ is

$$S(\mathbf{x},t) = -kH(\mathbf{x},t) - k\left(3\log\frac{2\pi\hbar}{m} - 1\right) n(\mathbf{x},t) . \tag{9.3.12b}$$

$^8$ One can readily convince oneself that $H(t)$ cannot decrease without limit. Due to $\int d^3v\, f(\mathbf{x},\mathbf{v},t) < \infty$, $f(\mathbf{x},\mathbf{v},t)$ is bounded everywhere and a divergence of $H(t)$ could come only from the range of integration $v \to \infty$. For $v \to \infty$, $f \to 0$ must hold and as a result, $\log f \to -\infty$. Comparison of $H(t) = \int d^3v\, f\log f$ with $\int d^3v\, v^2 f(\mathbf{x},\mathbf{v},t) < \infty$ shows that a divergence requires $|\log f| > v^2$. Then, however, $f < e^{-v^2}$, and $H$ remains finite.
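As a cross-check of (9.3.12a), the equilibrium value of $H$ can be compared with the ideal-gas entropy. The following sketch assumes the form of the ideal-gas entropy quoted in Eq. (9.3.45), with $h = 2\pi\hbar$ and $N = nV$; arguments of logarithms are treated formally, as in the text.

```latex
\begin{align*}
S &= kN\left[\tfrac{5}{2} + \tfrac{3}{2}\log(2\pi m kT) - \log n - 3\log h\right],\\
-VkH &= -kN\left[\log n + \tfrac{3}{2}\log\frac{m}{2\pi kT} - \tfrac{3}{2}\right],\\
S + VkH &= kN\left[1 + \tfrac{3}{2}\log(2\pi m kT) + \tfrac{3}{2}\log\frac{m}{2\pi kT} - 3\log h\right]\\
        &= kN\left[1 + 3\log m - 3\log h\right]
         = -kN\left(3\log\frac{2\pi\hbar}{m} - 1\right).
\end{align*}
```

The temperature and density dependence thus resides entirely in $-VkH$; the remaining term depends only on $m$ and $\hbar$ and is proportional to the particle number.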


The associated current densities are

$$\mathbf{j}_S(\mathbf{x},t) = -k\,\mathbf{j}_H(\mathbf{x},t) - k\left(3\log\frac{2\pi\hbar}{m} - 1\right)\mathbf{j}(\mathbf{x},t) \tag{9.3.12c}$$

and fulfill

$$\dot{S}(\mathbf{x},t) = -\nabla\,\mathbf{j}_S(\mathbf{x},t) + kI . \tag{9.3.12d}$$

Therefore, $kI$ has the meaning of the local entropy production.

∗9.3.2 Behavior of the Boltzmann Equation under Time Reversal

In a classical time-reversal transformation $T$ (also called motion reversal), the momenta (velocities) of the particles are reversed ($\mathbf{v} \to -\mathbf{v}$)$^9$. Consider a system which, beginning with an initial state at the positions $\mathbf{x}_n(0)$ and the velocities $\mathbf{v}_n(0)$, evolves for a time $t$ to the state $\{\mathbf{x}_n(t), \mathbf{v}_n(t)\}$, and then at time $t_1$ experiences a motion-reversal transformation $\{\mathbf{x}_n(t_1), \mathbf{v}_n(t_1)\} \to \{\mathbf{x}_n(t_1), -\mathbf{v}_n(t_1)\}$. If the system is invariant with respect to time reversal, the further motion for a time $t_1$ will then lead back to the motion-reversed initial state $\{\mathbf{x}_n(0), -\mathbf{v}_n(0)\}$. The solution of the equations of motion in the second time period ($t > t_1$), denoted here by primes, is given in terms of the solution in the first period by

$$\mathbf{x}'_n(t) = \mathbf{x}_n(2t_1 - t) , \qquad \mathbf{v}'_n(t) = -\mathbf{v}_n(2t_1 - t) . \tag{9.3.13}$$

Here, we have assumed that no external magnetic field is present. Apart from a translation by $2t_1$, the replacement $t \to -t$, $\mathbf{v} \to -\mathbf{v}$ is thus made. Under this transformation, the Boltzmann equation (9.2.7) becomes

$$\left[\frac{\partial}{\partial t} + \mathbf{v}\nabla_x + \frac{1}{m}\mathbf{F}(\mathbf{x})\nabla_v\right] f(\mathbf{x},-\mathbf{v},-t) = -I\,[f(\mathbf{x},-\mathbf{v},-t)] . \tag{9.3.14}$$

The notation of the collision term should indicate that all distribution functions have the time-reversed arguments. The Boltzmann equation is therefore not time-reversal invariant; $f(\mathbf{x},-\mathbf{v},-t)$ is not a solution of the Boltzmann equation, but instead of an equation which has a negative sign on its right-hand side ($-I\,[f(\mathbf{x},-\mathbf{v},-t)]$).

The fact that an equation which was derived from Newtonian mechanics, which is time-reversal invariant, is itself not time-reversal invariant and exhibits irreversible behavior (Eq. (9.3.11)) may initially appear surprising. Historically, it was a source of controversy. In fact, the Stosszahlansatz contains a probabilistic element which goes beyond Newtonian mechanics. Even if one assumes uncorrelated particle numbers, the numbers of particles with the velocities $\mathbf{v}$ and $\mathbf{v}_2$ will fluctuate: they will sometimes be larger and sometimes smaller than would be expected from the single-particle distribution functions $f_1$ and $f_2$. The most probable value of the number of collisions is $f_1 \cdot f_2$, and the time-averaged value of this number will in fact be $f_1 \cdot f_2$. The Boltzmann equation thus yields the typical evolution of typical configurations of the particle distribution. Configurations with small statistical weights, in which particles go from a (superficially) probable configuration to a less probable one (with lower entropy), which is possible in Newtonian mechanics, are not described by the Boltzmann equation. We will consider these questions in more detail in the next chapter (Sect. 10.7), independently of the Boltzmann equation.

9.3.3 Collision Invariants and the Local Maxwell Distribution

9.3.3.1 Conserved Quantities

The following conserved densities can be calculated from the single-particle distribution function: the particle-number density is given by

$$n(\mathbf{x},t) \equiv \int d^3v\; f . \tag{9.3.15a}$$

The momentum density, which is also equal to the product of the mass and the current density, is given by

$$m\,\mathbf{j}(\mathbf{x},t) \equiv m\,n(\mathbf{x},t)\,\mathbf{u}(\mathbf{x},t) \equiv m\int d^3v\;\mathbf{v} f . \tag{9.3.15b}$$

Equation (9.3.15b) also defines the average local velocity $\mathbf{u}(\mathbf{x},t)$. Finally, we define the energy density, which is composed of the kinetic energy of the local convective flow at the velocity $\mathbf{u}(\mathbf{x},t)$, i.e. $n(\mathbf{x},t)\,m\,\mathbf{u}(\mathbf{x},t)^2/2$, together with the average kinetic energy in the local rest system$^{10}$, $n(\mathbf{x},t)\,e(\mathbf{x},t)$:

$$n(\mathbf{x},t)\left[\frac{m\,\mathbf{u}(\mathbf{x},t)^2}{2} + e(\mathbf{x},t)\right] \equiv \int d^3v\;\frac{m\mathbf{v}^2}{2}\, f = \int d^3v\;\frac{m}{2}\left(\mathbf{u}^2 + \boldsymbol{\phi}^2\right) f . \tag{9.3.15c}$$

Here, the relative velocity $\boldsymbol{\phi} = \mathbf{v} - \mathbf{u}$ has been introduced, and $\int d^3v\;\boldsymbol{\phi} f = 0$, which follows from Eq. (9.3.15b), has been used. For $e(\mathbf{x},t)$, the internal energy per particle in the local rest system (which is moving at the velocity $\mathbf{u}(\mathbf{x},t)$), it follows from (9.3.15c) that

$$n(\mathbf{x},t)\,e(\mathbf{x},t) = \frac{m}{2}\int d^3v\,(\mathbf{v} - \mathbf{u}(\mathbf{x},t))^2 f . \tag{9.3.15c$'$}$$

$^9$ See e.g. QM II, Sect. 11.4.1.

$^{10}$ We note that for a dilute gas, the potential energy is negligible relative to the kinetic energy, so that the internal energy per particle $e(\mathbf{x},t) = \bar{\epsilon}(\mathbf{x},t)$ is equal to the average kinetic energy.
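The decomposition of the energy density in (9.3.15c) rests on a single intermediate step, which can be written out as follows; this is only a worked restatement using the definitions (9.3.15a), (9.3.15b) and $\boldsymbol{\phi} = \mathbf{v} - \mathbf{u}$.

```latex
\begin{align*}
\int d^3v\,\boldsymbol{\phi}\, f
  &= \int d^3v\,\mathbf{v} f - \mathbf{u}\int d^3v\, f
   = n\mathbf{u} - \mathbf{u}\,n = 0,\\
\int d^3v\,\frac{m\mathbf{v}^2}{2}\, f
  &= \frac{m}{2}\int d^3v\,\bigl(\mathbf{u}^2 + 2\,\mathbf{u}\cdot\boldsymbol{\phi} + \boldsymbol{\phi}^2\bigr) f
   = n\,\frac{m\mathbf{u}^2}{2} + \frac{m}{2}\int d^3v\,\boldsymbol{\phi}^2 f
   = n\left[\frac{m\mathbf{u}^2}{2} + e\right].
\end{align*}
```

The cross term drops out because $\mathbf{u}$ does not depend on $\mathbf{v}$ and can be taken outside the velocity integral.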


9.3.3.2 Collisional Invariants

The collision integral $I$ of Eq. (9.3.3) and the collision term in the Boltzmann equation vanish if the distribution function $f$ fulfills the relation

$$f_1 f_2 - f_3 f_4 = 0 \tag{9.3.16}$$

for all possible collisions (restricted by the conservation laws contained in (9.2.8f)), i.e. if

$$\log f_1 + \log f_2 = \log f_3 + \log f_4 \tag{9.3.17}$$

holds. Note that all the distribution functions $f_i$ have the same $\mathbf{x}$-argument. Due to conservation of momentum, energy, and particle number, each of the five so-called collisional invariants

$$\chi^i = m v_i , \quad i = 1, 2, 3 \tag{9.3.18a}$$

$$\chi^4 = \epsilon_v \equiv \frac{m\mathbf{v}^2}{2} \tag{9.3.18b}$$

$$\chi^5 = 1 \tag{9.3.18c}$$

obeys the relation (9.3.17). There are no other collisional invariants apart from these five$^{11}$. Thus the logarithm of the most general distribution function for which the collision term vanishes is a linear combination of the collisional invariants with position-dependent prefactors:

$$\log f(\mathbf{x},\mathbf{v},t) = \alpha(\mathbf{x},t) + \beta(\mathbf{x},t)\left(\mathbf{u}(\mathbf{x},t)\cdot m\mathbf{v} - \frac{m}{2}\mathbf{v}^2\right) , \tag{9.3.19}$$

or

$$f(\mathbf{x},\mathbf{v},t) = n(\mathbf{x},t)\left(\frac{m}{2\pi kT(\mathbf{x},t)}\right)^{3/2}\exp\left[-\frac{m}{2kT(\mathbf{x},t)}\,(\mathbf{v} - \mathbf{u}(\mathbf{x},t))^2\right] . \tag{9.3.19$'$}$$

Here, the quantities $T(\mathbf{x},t) = (k\beta(\mathbf{x},t))^{-1}$, $n(\mathbf{x},t) = \left(\frac{2\pi}{m\beta(\mathbf{x},t)}\right)^{3/2}\exp\left[\alpha(\mathbf{x},t) + \beta(\mathbf{x},t)\,m\,\mathbf{u}^2(\mathbf{x},t)/2\right]$ and $\mathbf{u}(\mathbf{x},t)$ represent the local temperature, the local particle-number density, and the local velocity. One refers to $f(\mathbf{x},\mathbf{v},t)$ as the local Maxwell distribution or the local equilibrium distribution function, since it is identical locally to the Maxwell distribution, (9.3.10) or (2.6.13). If we insert (9.3.19) into the expressions (9.3.15a–c) for the conserved quantities, we can see that the quantities $n(\mathbf{x},t)$, $\mathbf{u}(\mathbf{x},t)$, and $T(\mathbf{x},t)$ which occur on the right-hand side of (9.3.19$'$) refer to the local density, velocity, and temperature, respectively, with the last quantity related to the mean kinetic energy via

$$e(\mathbf{x},t) = \frac{3}{2}\,kT(\mathbf{x},t) ,$$

i.e. by the caloric equation of state of an ideal gas.

The local equilibrium distribution function $f(\mathbf{x},\mathbf{v},t)$ is in general not a solution of the Boltzmann equation, since for it, only the collision term but not the flow term vanishes$^{12}$. The local Maxwell distribution is in general a solution of the Boltzmann equation only when the coefficients are constant, i.e. in global equilibrium. Together with the results from Sect. 9.3.1, it follows that a gas with an arbitrary inhomogeneous initial distribution $f(\mathbf{x},\mathbf{v},0)$ will finally relax into a Maxwell distribution (9.3.10) with a constant temperature and density. Their values are determined by the initial conditions.

9.3.4 Conservation Laws

With the aid of the collisional invariants, we can derive equations of continuity for the conserved quantities from the Boltzmann equation. We first relate the conserved densities (9.3.15a–c) to the collisional invariants (9.3.18a–c). The particle-number density, the momentum density, and the energy density can be represented in the following form:

$$n(\mathbf{x},t) \equiv \int d^3v\;\chi^5 f , \tag{9.3.20}$$

$$m\,j_i(\mathbf{x},t) \equiv m\,n(\mathbf{x},t)\,u_i(\mathbf{x},t) = \int d^3v\;\chi^i f , \tag{9.3.21}$$

and

$$n(\mathbf{x},t)\left[\frac{m\,\mathbf{u}(\mathbf{x},t)^2}{2} + e(\mathbf{x},t)\right] = \int d^3v\;\chi^4 f . \tag{9.3.22}$$

Next, we want to derive the equations of motion for these quantities from the Boltzmann equation (9.2.7) by multiplying the latter by $\chi^\alpha(\mathbf{v})$ and integrating over $\mathbf{v}$. Using the general identity (9.3.5), according to which the collision term drops out for each collisional invariant, we find

$$\int d^3v\;\chi^\alpha(\mathbf{v})\left[\frac{\partial}{\partial t} + \mathbf{v}\nabla_x + \frac{1}{m}\mathbf{F}(\mathbf{x})\nabla_v\right] f(\mathbf{x},\mathbf{v},t) = 0 . \tag{9.3.23}$$

By inserting $\chi^5$, $\chi^{1,2,3}$, and $\chi^4$ in that order, we obtain from (9.3.23) the following three conservation laws:

$^{11}$ H. Grad, Comm. Pure Appl. Math. 2, 331 (1949).

$^{12}$ There are special local Maxwell distributions for which the flow term likewise vanishes, but they have no physical relevance. See G. E. Uhlenbeck and G. W. Ford, Lectures in Statistical Mechanics, American Mathematical Society, Providence, 1963, p. 86; S. Harris, An Introduction to the Theory of the Boltzmann Equation, Holt Rinehart and Winston, New York, 1971, p. 73; and problem 9.16.


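Conservation of Particle Number:

$$\frac{\partial}{\partial t}\,n + \nabla\,\mathbf{j} = 0 . \tag{9.3.24}$$

Conservation of Momentum:

$$m\,\frac{\partial}{\partial t}\,j_i + \nabla_{x_j}\int d^3v\; m\,v_j v_i\, f - F_i(\mathbf{x})\,n(\mathbf{x}) = 0 . \tag{9.3.25}$$

For the third term, an integration by parts was used. If we again employ the substitution $\mathbf{v} = \mathbf{u} + \boldsymbol{\phi}$ in (9.3.25), we obtain

$$m\,\frac{\partial}{\partial t}\,j_i + \frac{\partial}{\partial x_j}\left(m\,n\,u_i u_j + P_{ji}\right) = n F_i , \tag{9.3.25$'$}$$

where we have introduced the pressure tensor

$$P_{ji} = P_{ij} = m\int d^3v\;\phi_i \phi_j\, f . \tag{9.3.26}$$

Conservation of Energy:

Finally, setting $\chi^4 = \frac{m\mathbf{v}^2}{2}$ in (9.3.23), we obtain

$$\frac{\partial}{\partial t}\int d^3v\;\frac{m}{2}\mathbf{v}^2 f + \nabla_{x_i}\int d^3v\,(u_i + \phi_i)\,\frac{m}{2}\left(\mathbf{u}^2 + 2u_j\phi_j + \boldsymbol{\phi}^2\right) f - \mathbf{j}\cdot\mathbf{F} = 0 , \tag{9.3.27}$$

where an integration by parts was used for the last term. Applying (9.3.22) and (9.3.26), we obtain the equation of continuity for the energy density

$$\frac{\partial}{\partial t}\left[n\left(\frac{m}{2}\mathbf{u}^2 + e\right)\right] + \nabla_i\left[n u_i\left(\frac{m}{2}\mathbf{u}^2 + e\right) + u_j P_{ji} + q_i\right] = \mathbf{j}\cdot\mathbf{F} . \tag{9.3.28}$$

Here, along with the internal energy density $e$ defined in (9.3.15c$'$), we have also introduced the heat current density

$$\mathbf{q} = \int d^3v\;\boldsymbol{\phi}\,\frac{m}{2}\,\boldsymbol{\phi}^2 f . \tag{9.3.29}$$

Remarks:

(i) (9.3.25$'$) and (9.3.28) in the absence of external forces ($\mathbf{F} = 0$) take on the usual form of equations of continuity, like (9.3.24).

(ii) In the momentum density, according to Eq. (9.3.25$'$), the tensorial current density is composed of a convective part and the pressure tensor $P_{ij}$, which gives the microscopic momentum current in relation to the coordinate system moving at the average velocity $\mathbf{u}$.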


(iii) The energy current density in Eq. (9.3.28) contains a macroscopic convection current, the work which is performed by the pressure, and the heat current $\mathbf{q}$ (= mean energy flux in the system which is moving with the liquid).

(iv) The conservation laws do not form a complete system of equations as long as the current densities are unknown. In the hydrodynamic limit, it is possible to express the current densities in terms of the conserved quantities.

The conservation laws for momentum and energy can also be written as equations for $\mathbf{u}$ and $e$. To this end, we employ the rearrangement

$$\frac{\partial}{\partial t}\,j_i + \nabla_j(n u_j u_i) = n\,\frac{\partial}{\partial t}\,u_i + u_i\,\frac{\partial}{\partial t}\,n + u_i\nabla_j n u_j + n u_j\nabla_j u_i = n\left(\frac{\partial}{\partial t} + u_j\nabla_j\right) u_i \tag{9.3.30}$$

using (9.3.21) and the conservation law for the particle-number density (9.3.24), which yields for (9.3.25$'$)

$$m\,n\left(\frac{\partial}{\partial t} + u_j\nabla_j\right) u_i = -\nabla_j P_{ji} + n F_i . \tag{9.3.31}$$

From this, taking the hydrodynamic limit, we obtain the Navier–Stokes equations. Likewise, starting from Eq. (9.3.28), we can show that

$$n\left(\frac{\partial}{\partial t} + u_j\nabla_j\right) e + \nabla\,\mathbf{q} = -P_{ij}\nabla_i u_j . \tag{9.3.32}$$

9.3.5 Conservation Laws and Hydrodynamic Equations for the Local Maxwell Distribution

9.3.5.1 Local Equilibrium and Hydrodynamics

In this section, we want to collect and explain some concepts which play a role in nonequilibrium theory. The term local equilibrium describes the situation in which the thermodynamic quantities of the system such as density, temperature, pressure, etc. can vary spatially and with time, but in each volume element the thermodynamic relations between the values which apply locally there are obeyed. The resulting dynamics are quite generally termed hydrodynamics in condensed-matter physics, in analogy to the dynamic equations which are valid in this limit for the flow of gases and liquids. The conditions for local equilibrium are

$$\omega\tau \ll 1 \quad\text{and}\quad kl \ll 1 , \tag{9.3.33}$$

where $\omega$ is the frequency of the time-dependent variations and $k$ their wavenumber, $\tau$ is the collision time and $l$ the mean free path. The first condition guarantees that the variations with time are sufficiently slow that the system has time to reach equilibrium locally through collisions of its atoms. The second condition presumes that the particles move along a distance $l$ without changing their momenta and energies. The local values of momentum and energy must therefore in fact be constant over a distance $l$.

Beginning with an arbitrary initial distribution function $f(\mathbf{x},\mathbf{v},0)$, according to the Boltzmann equation, the following relaxation processes occur: the collision term causes the distribution function to approach a local Maxwell distribution within the characteristic time $\tau$. The flow term causes an equalization in space, which requires a longer time. These two approaches towards equilibrium, in velocity space and in configuration space, come to an end only when global equilibrium has been reached. If the system is subject only to perturbations which vary slowly in space and time, it will be in local equilibrium after the time $\tau$. This temporally and spatially slowly varying distribution function will differ from the local Maxwellian function (9.3.19$'$), which does not obey the Boltzmann equation.

9.3.5.2 Hydrodynamic Equations without Dissipation

In order to obtain explicit expressions for the current densities $\mathbf{q}$ and $P_{ij}$, these quantities must be calculated for a distribution function $f(\mathbf{x},\mathbf{v},t)$ which at least approximately obeys the Boltzmann equation. In this section, we will employ the local Maxwell distribution as an approximation. In Sect. 9.4, the Boltzmann equation will be solved systematically in a linear approximation.
Following the preceding considerations concerning the different relaxation behavior in configuration space and in velocity space, we can expect that in local equilibrium, the actual distribution function will not be very different from the local Maxwellian distribution. If we use the latter as an approximation, we will be neglecting dissipation. Using the local Maxwell distribution, Eq. (9.3.19$'$),

$$f = n(\mathbf{x},t)\left(\frac{m}{2\pi kT(\mathbf{x},t)}\right)^{3/2}\exp\left[-\frac{m\,(\mathbf{v} - \mathbf{u}(\mathbf{x},t))^2}{2kT(\mathbf{x},t)}\right] , \tag{9.3.34}$$

with position- and time-dependent density $n$, temperature $T$, and flow velocity $\mathbf{u}$, we find from (9.3.15a), (9.3.15b), and (9.3.15c$'$)

$$\mathbf{j} = n\mathbf{u} \tag{9.3.35}$$

$$n e = \frac{3}{2}\,nkT \tag{9.3.36}$$

$$P_{ij} \equiv \int d^3v\; m\,\phi_i \phi_j\, f = \delta_{ij}\,nkT \equiv \delta_{ij}\,P , \tag{9.3.37}$$

where the local pressure $P$ was introduced; from (9.3.37), it is given by

$$P = nkT . \tag{9.3.38}$$

The equations (9.3.38) and (9.3.36) express the local thermal and caloric equations of state of the ideal gas. The pressure tensor $P_{ij}$ contains no dissipative contribution which would correspond to the viscosity of the fluid, as seen from Eq. (9.3.37). The heat current density (9.3.29) vanishes ($\mathbf{q} = 0$) for the local Maxwell distribution. With these results, we obtain for the equations of continuity (9.3.24), (9.3.25), and (9.3.32)

$$\frac{\partial}{\partial t}\,n = -\nabla\, n\mathbf{u} \tag{9.3.39}$$

$$m\,n\left(\frac{\partial}{\partial t} + \mathbf{u}\nabla\right)\mathbf{u} = -\nabla P + n\mathbf{F} \tag{9.3.40}$$

$$n\left(\frac{\partial}{\partial t} + \mathbf{u}\nabla\right) e = -P\,\nabla\mathbf{u} . \tag{9.3.41}$$

Here, (9.3.40) is Euler's equation, well-known in hydrodynamics$^{13}$. The equations of motion (9.3.39)–(9.3.41) together with the local thermodynamic relations (9.3.36) and (9.3.38) represent a complete system of equations for $n$, $\mathbf{u}$, and $e$.

9.3.5.3 Propagation of Sound in Gases

As an application, we consider the propagation of sound. In this process, the gas undergoes small oscillations of its density $n$, its pressure $P$, its internal energy $e$, and its temperature $T$ around their equilibrium values and around $\mathbf{u} = 0$. In the following, we shall follow the convention that thermodynamic quantities for which no position or time dependence is given are taken to have their equilibrium values, that is we insert into Eqns. (9.3.39)–(9.3.41)

$$n(\mathbf{x},t) = n + \delta n(\mathbf{x},t), \quad P(\mathbf{x},t) = P + \delta P(\mathbf{x},t), \quad e(\mathbf{x},t) = e + \delta e(\mathbf{x},t), \quad T(\mathbf{x},t) = T + \delta T(\mathbf{x},t) \tag{9.3.42}$$

and expand with respect to the small deviations indicated by $\delta$:

$$\frac{\partial}{\partial t}\,\delta n = -n\,\nabla\mathbf{u} \tag{9.3.43a}$$

$$m\,n\,\frac{\partial}{\partial t}\,\mathbf{u} = -\nabla\,\delta P \tag{9.3.43b}$$

$$n\,\frac{\partial}{\partial t}\,\delta e = -P\,\nabla\mathbf{u} . \tag{9.3.43c}$$

$^{13}$ Euler's equation describes nondissipative fluid flow; see L. D. Landau and E. M. Lifshitz, Course of Theoretical Physics, Vol. IV: Hydrodynamics, Pergamon Press, Oxford 1960, p. 4.


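The flow velocity $\mathbf{u}(\mathbf{x},t) \equiv \delta\mathbf{u}(\mathbf{x},t)$ is small. Insertion of Eq. (9.3.36) and (9.3.38) into (9.3.43c) leads us to

$$\frac{3}{2}\,\frac{\partial}{\partial t}\,\delta T = -T\,\nabla\mathbf{u} ,$$

which, together with (9.3.43a), yields

$$\frac{\partial}{\partial t}\left[\frac{\delta n}{n} - \frac{3}{2}\,\frac{\delta T}{T}\right] = 0 . \tag{9.3.44}$$

Comparison with the entropy of an ideal gas,

$$S = kN\left[\frac{5}{2} + \log\frac{(2\pi m kT)^{3/2}}{n h^3}\right] , \tag{9.3.45}$$

shows that the time independence of $S/N$ (i.e. of the entropy per particle or per unit mass) follows from (9.3.44). By applying $\partial/\partial t$ to (9.3.43a) and $\nabla$ to (9.3.43b) and eliminating the term containing $\mathbf{u}$, we obtain

$$\frac{\partial^2 \delta n}{\partial t^2} = m^{-1}\,\nabla^2\,\delta P . \tag{9.3.46}$$

It follows from Eq. (9.3.38) that $\delta P = nk\,\delta T + \delta n\,kT$, and, together with (9.3.44), we obtain $\frac{\partial}{\partial t}\,\delta P = \frac{5}{3}\,kT\,\frac{\partial}{\partial t}\,\delta n$. With this, the equation of motion (9.3.46) can be brought into the form

$$\frac{\partial^2 \delta P}{\partial t^2} = \frac{5kT}{3m}\,\nabla^2\,\delta P . \tag{9.3.47}$$

The sound waves (pressure waves) which are described by the wave equation (9.3.47) have the form

$$\delta P \propto e^{i(\mathbf{k}\mathbf{x} \pm c_s |\mathbf{k}| t)} \tag{9.3.48}$$

with the adiabatic sound velocity

$$c_s = \sqrt{\frac{1}{m\,n\,\kappa_S}} = \sqrt{\frac{5kT}{3m}} . \tag{9.3.49}$$

Here, $\kappa_S$ is the adiabatic compressibility (Eq. (3.2.3b)), which according to Eq. (3.2.28) is given by

$$\kappa_S = \frac{3}{5P} = \frac{3V}{5NkT} \tag{9.3.50}$$

for an ideal gas.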
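To get a feeling for the magnitude of (9.3.49), the adiabatic sound velocity can be evaluated for a monatomic gas. The choice of argon, the temperature of 300 K and the density are assumptions made for this sketch only; the numerical constants are the SI and CODATA values known to the editor, not values taken from the text.

```python
import math

k_B = 1.380649e-23        # Boltzmann constant in J/K (exact SI value)
amu = 1.66053906660e-27   # atomic mass unit in kg (CODATA 2018)

m = 39.948 * amu          # mass of an argon atom (assumed example gas)
T = 300.0                 # assumed temperature in K
n = 2.5e25                # assumed number density in m^-3; it cancels in c_s

# Adiabatic compressibility of the ideal gas, Eq. (9.3.50), with P = n k T.
kappa_S = 3.0 / (5.0 * n * k_B * T)

# The two equivalent expressions for the sound velocity, Eq. (9.3.49).
c_s_from_kappa = 1.0 / math.sqrt(m * n * kappa_S)
c_s = math.sqrt(5.0 * k_B * T / (3.0 * m))

print(f"c_s = {c_s:.1f} m/s (from kT/m)")
print(f"c_s = {c_s_from_kappa:.1f} m/s (from kappa_S)")
```

Both expressions give a value of a few hundred metres per second, and the density drops out, as expected for an ideal gas where $c_s$ depends only on $T$ and $m$.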


Notes: The result that the entropy per particle $S/N$ or the entropy per unit mass $s$ for a sound wave is time-independent remains valid not only for an ideal gas but in general. If one takes the second derivative with respect to time of the following thermodynamic relation which is valid for local equilibrium$^{14}$

$$\delta n = \left(\frac{\partial n}{\partial P}\right)_{S/N}\delta P + \left(\frac{\partial n}{\partial (S/N)}\right)_P \delta\!\left(\frac{S}{N}\right) , \tag{9.3.51}$$

obtaining

$$\frac{\partial^2 \delta n}{\partial t^2} = \left(\frac{\partial n}{\partial P}\right)_{S/N}\frac{\partial^2 P}{\partial t^2} + \left(\frac{\partial n}{\partial (S/N)}\right)_P \underbrace{\frac{\partial^2 (S/N)}{\partial t^2}}_{=0} ,$$

then one obtains together with (9.3.43a) and (9.3.43b) the result

$$\frac{\partial^2 P(\mathbf{x},t)}{\partial t^2} = m^{-1}\left(\frac{\partial P}{\partial n}\right)_{S/N}\nabla^2 P(\mathbf{x},t) , \tag{9.3.52}$$

which again contains the adiabatic sound velocity

$$c_s^2 = m^{-1}\left(\frac{\partial P}{\partial n}\right)_{S/N} = m^{-1}\left(\frac{\partial P}{\partial (N/V)}\right)_S = m^{-1} N^{-1}(-V^2)\left(\frac{\partial P}{\partial V}\right)_S = \frac{1}{m\,n\,\kappa_S} . \tag{9.3.53}$$

Following the third equals sign, the particle number $N$ was taken to be fixed.

For local Maxwell distributions, the collision term vanishes; there is no damping. Between the regions of different local equilibria, reversible oscillation processes take place. Deviations of the actual local equilibrium distribution functions $f(\mathbf{x},\mathbf{v},t)$ from the local Maxwell distribution $f^{l}(\mathbf{x},\mathbf{v},t)$ lead as a result of the collision term to local, irreversible relaxation effects and, together with the flow term, to diffusion-like equalization processes which finally result in global equilibrium.

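∗9.4 The Linearized Boltzmann Equation

9.4.1 Linearization

In this section, we want to investigate systematically the solutions of the Boltzmann equation in the limit of small deviations from equilibrium. The Boltzmann equation can be linearized and from its linearized form, the hydrodynamic equations can be derived. These are equations of motion for the conserved quantities, whose region of validity is at long wavelengths and low frequencies. It will occasionally be expedient to use the variables $(\mathbf{k},\omega)$ (wavenumber and frequency) instead of $(\mathbf{x},t)$. We will also take an external potential, which vanishes for early times, into account:

$$\lim_{t\to-\infty} V(\mathbf{x},t) = 0 . \tag{9.4.1}$$

Then the distribution function is presumed to have the property

$$\lim_{t\to-\infty} f(\mathbf{x},\mathbf{v},t) = f^0(\mathbf{v}) \equiv n\left(\frac{m}{2\pi kT}\right)^{3/2} e^{-\frac{m\mathbf{v}^2}{2kT}} , \tag{9.4.2}$$

where $f^0$ is the global spatially uniform Maxwellian equilibrium distribution$^{15}$. For small deviations from global equilibrium, we can write $f(\mathbf{x},\mathbf{v},t)$ in the form

$$f(\mathbf{x},\mathbf{v},t) = f^0(\mathbf{v})\left[1 + \frac{1}{kT}\,\nu(\mathbf{x},\mathbf{v},t)\right] \equiv f^0 + \delta f \tag{9.4.3}$$

and linearize the Boltzmann equation in $\delta f$ or $\nu$. The linearization of the collision term (9.2.6) yields

$$\begin{aligned}
\left(\frac{\partial f}{\partial t}\right)_{\rm coll} &= -\int d^3v_2\,d^3v_3\,d^3v_4\; W\left(f_1^0 f_2^0 - f_3^0 f_4^0 + f_1^0\,\delta f_2 + f_2^0\,\delta f_1 - f_3^0\,\delta f_4 - f_4^0\,\delta f_3\right) \\
&= -\frac{1}{kT}\int d^3v_2\,d^3v_3\,d^3v_4\; W(\mathbf{v},\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4)\, f^0(\mathbf{v}_1)\, f^0(\mathbf{v}_2)\,(\nu_1 + \nu_2 - \nu_3 - \nu_4) ,
\end{aligned} \tag{9.4.4}$$

since $f_3^0 f_4^0 = f_1^0 f_2^0$ owing to energy conservation, which is contained in $W(\mathbf{v},\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4)$. We also use the notation $\mathbf{v}_1 \equiv \mathbf{v}$, $f_1^0 = f^0(\mathbf{v})$ etc. The flow term has the form

$$\left[\frac{\partial}{\partial t} + \mathbf{v}\nabla_x + \frac{1}{m}\mathbf{F}(\mathbf{x},t)\nabla_v\right]\left(f^0 + \frac{f^0}{kT}\,\nu\right) = \frac{f^0(\mathbf{v})}{kT}\left[\left(\frac{\partial}{\partial t} + \mathbf{v}\nabla_x\right)\nu(\mathbf{x},\mathbf{v},t) + \mathbf{v}\cdot\nabla V(\mathbf{x},t)\right] . \tag{9.4.5}$$

All together, the linearized Boltzmann equation is given by:

$$\left(\frac{\partial}{\partial t} + \mathbf{v}\nabla_x\right)\nu(\mathbf{x},\mathbf{v},t) + \mathbf{v}\,(\nabla V(\mathbf{x},t)) = -L\nu \tag{9.4.6}$$

with the linear collision operator $L$:

$$L\nu = \frac{kT}{f^0(\mathbf{v})}\int d^3v_2\,d^3v_3\,d^3v_4\; W'(\mathbf{v},\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4)\,(\nu + \nu_2 - \nu_3 - \nu_4) \tag{9.4.7}$$

and

$$W'(\mathbf{v},\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4) = \frac{1}{kT}\left(f^0(\mathbf{v})\,f^0(\mathbf{v}_2)\,f^0(\mathbf{v}_3)\,f^0(\mathbf{v}_4)\right)^{1/2} W(\mathbf{v},\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4) , \tag{9.4.8}$$

where conservation of energy, contained in $W$, has been utilized.

$^{14}$ Within time and space derivatives, $\delta n(\mathbf{x},t)$, etc. can be replaced by $n(\mathbf{x},t)$ etc.

$^{15}$ We write here the index which denotes an equilibrium distribution as an upper index, since later the notation $f_i^0 \equiv f^0(\mathbf{v}_i)$ will also be employed.

9.4.2 The Scalar Product

For our subsequent investigations, we introduce the scalar product of two functions $\psi(\mathbf{v})$ and $\chi(\mathbf{v})$,

$$\langle\psi|\chi\rangle = \int d^3v\;\frac{f^0(\mathbf{v})}{kT}\,\psi(\mathbf{v})\,\chi(\mathbf{v}) ; \tag{9.4.9}$$

it possesses the usual properties. The collisional invariants are special cases:

$$\langle\chi^5|\chi^5\rangle \equiv \langle 1|1\rangle = \int d^3v\;\frac{f^0(\mathbf{v})}{kT} = \frac{n}{kT} , \tag{9.4.10a}$$

$$\langle\chi^4|\chi^5\rangle \equiv \langle\epsilon|1\rangle = \int d^3v\;\frac{m\mathbf{v}^2}{2}\,\frac{f^0(\mathbf{v})}{kT} = \frac{ne}{kT} = \frac{3}{2}\,n \quad\text{with}\quad \epsilon \equiv \frac{m\mathbf{v}^2}{2} , \tag{9.4.10b}$$

and

$$\langle\chi^4|\chi^4\rangle \equiv \langle\epsilon|\epsilon\rangle = \int d^3v\left(\frac{m\mathbf{v}^2}{2}\right)^2\frac{f^0(\mathbf{v})}{kT} = \frac{15}{4}\,nkT . \tag{9.4.10c}$$

The collision operator $L$ introduced in (9.4.7) is a linear operator, and obeys the relation

$$\langle\chi|L\nu\rangle = \frac{1}{4}\int d^3v_1\,d^3v_2\,d^3v_3\,d^3v_4\; W'(\mathbf{v}_1,\mathbf{v}_2;\mathbf{v}_3,\mathbf{v}_4)\,(\nu_1 + \nu_2 - \nu_3 - \nu_4)(\chi_1 + \chi_2 - \chi_3 - \chi_4) . \tag{9.4.11}$$

It follows from this that $L$ is self-adjoint and positive semidefinite,

$$\langle\chi|L\nu\rangle = \langle L\chi|\nu\rangle , \tag{9.4.12}$$

$$\langle\nu|L\nu\rangle \ge 0 . \tag{9.4.13}$$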


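9.4.3 Eigenfunctions of L and the Expansion of the Solutions of the Boltzmann Equation

The eigenfunctions of $L$ are denoted as $\chi^\lambda$:
\[ L\chi^\lambda = \omega_\lambda \chi^\lambda , \qquad \omega_\lambda \ge 0 . \tag{9.4.14} \]
The collisional invariants $\chi^1, \chi^2, \chi^3, \chi^4, \chi^5$ are eigenfunctions belonging to the eigenvalue 0. It will prove expedient to use orthonormalized eigenfunctions:
\[ \langle\hat\chi^\lambda|\hat\chi^{\lambda'}\rangle = \delta^{\lambda\lambda'} . \tag{9.4.15} \]
For the collisional invariants, this means the introduction of
\[ \hat\chi^i \equiv \hat\chi^{u_i} = \frac{v_i}{\sqrt{\langle v_i|v_i\rangle}} = \frac{v_i}{\sqrt{n/m}} , \quad i = 1, 2, 3 ; \qquad \langle v_i|v_i\rangle = \frac{1}{3}\int d^3v\, \mathbf{v}^2 f^0(\mathbf{v})/kT \ \ \text{(here not summed over } i\text{)} ; \tag{9.4.16a} \]
\[ \hat\chi^5 \equiv \hat\chi^n = \frac{1}{\sqrt{\langle 1|1\rangle}} = \frac{1}{\sqrt{n/kT}} ; \quad\text{and} \tag{9.4.16b} \]
\[ \hat\chi^4 \equiv \hat\chi^T = \frac{\epsilon\,\langle 1|1\rangle - 1\,\langle 1|\epsilon\rangle}{\sqrt{\langle 1|1\rangle\bigl(\langle 1|1\rangle\langle\epsilon|\epsilon\rangle - \langle 1|\epsilon\rangle^2\bigr)}} = \frac{\epsilon - \frac{3}{2}kT}{\sqrt{\frac{3}{2}\,nkT}} . \tag{9.4.16c} \]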
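The normalizations in (9.4.10a–c) and (9.4.16b–c) can be checked numerically. The following is only a minimal sketch, not part of the original text: it assumes functions that depend on the speed $v$ alone, so that the scalar product (9.4.9) reduces to a radial integral, and it uses a hand-written Simpson rule. The helper names (`simpson`, `scalar_product`, `sp`) and the test values of $n$, $m$, $kT$ are arbitrary choices made for illustration.

```python
import math

def simpson(func, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def scalar_product(psi, chi, n=1.0, m=1.0, kT=1.0):
    """<psi|chi> of Eq. (9.4.9) for functions psi(v), chi(v) of the speed v only."""
    norm = n * (m / (2 * math.pi * kT)) ** 1.5
    def integrand(v):
        f0 = norm * math.exp(-m * v * v / (2 * kT))
        return 4 * math.pi * v * v * psi(v) * chi(v) * f0 / kT
    v_max = 12 * math.sqrt(kT / m)   # the Gaussian tail beyond this is negligible
    return simpson(integrand, 0.0, v_max)

n, m, kT = 2.0, 1.5, 0.7             # arbitrary illustrative values
one = lambda v: 1.0
eps = lambda v: 0.5 * m * v * v
chi_n = lambda v: 1.0 / math.sqrt(n / kT)                       # Eq. (9.4.16b)
chi_T = lambda v: (eps(v) - 1.5 * kT) / math.sqrt(1.5 * n * kT)  # Eq. (9.4.16c)

sp = lambda a, b: scalar_product(a, b, n, m, kT)
print(sp(one, one), n / kT)            # Eq. (9.4.10a)
print(sp(eps, one), 1.5 * n)           # Eq. (9.4.10b)
print(sp(eps, eps), 3.75 * n * kT)     # Eq. (9.4.10c)
print(sp(chi_n, chi_n), sp(chi_T, chi_T), sp(chi_n, chi_T))  # orthonormality
```

Each printed pair should agree to within the quadrature error, and the last line should be close to 1, 1, 0.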

The eigenfunctions $\chi^\lambda$ with $\omega_\lambda > 0$ are orthogonal to the functions (9.4.16a–c) and in the case of degeneracy are orthonormalized among themselves. An arbitrary solution of the linearized Boltzmann equation can be represented as a superposition of the eigenfunctions of $L$ with position- and time-dependent prefactors$^{16}$
\[ \nu(\mathbf{x},\mathbf{v},t) = a^5(\mathbf{x},t)\,\hat\chi^n + a^4(\mathbf{x},t)\,\hat\chi^T + a^i(\mathbf{x},t)\,\hat\chi^{u_i} + \sum_{\lambda=6}^{\infty} a^\lambda(\mathbf{x},t)\,\hat\chi^\lambda . \tag{9.4.17} \]
Here, the notation indicates the particle-number density $n(\mathbf{x},t)$, the temperature $T(\mathbf{x},t)$, and the flow velocity $u_i(\mathbf{x},t)$:
\[ \hat T(\mathbf{x},t) \equiv a^4(\mathbf{x},t) = \langle\hat\chi^T|\nu\rangle = \int d^3v\, \frac{f^0}{kT}\,\nu\,\hat\chi^T \equiv \int d^3v\, \delta f(\mathbf{x},\mathbf{v},t)\,\hat\chi^T = \frac{\delta e - \frac{3}{2}kT\,\delta n}{\sqrt{\frac{3}{2}\,nkT}} = \sqrt{\frac{3n}{2kT}}\; k\,\delta T(\mathbf{x},t) . \tag{9.4.18a} \]

16 Here we assume that the eigenfunctions $\chi^\lambda$ form a complete basis. For the explicitly known eigenfunctions of the Maxwell potential (repulsive $r^{-4}$ potential), this can be shown directly. For repulsive $r^{-n}$ potentials, completeness was proved by Y. Pao, Comm. Pure Appl. Math. 27, 407 (1974).


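The identification of $\delta T(\mathbf{x},t)$ with local fluctuations of the temperature, apart from the normalization factor, can be justified by considering the local internal energy
\[ e + \delta e = \frac{3}{2}(n + \delta n)\,k\,(T + \delta T) , \]
from which, neglecting second-order quantities, it follows that
\[ \delta e = \frac{3}{2}\,nk\,\delta T + \frac{3}{2}\,kT\,\delta n \quad\Rightarrow\quad \delta T = \frac{\delta e - \frac{3}{2}\,\delta n\,kT}{\frac{3}{2}\,nk} . \tag{9.4.19} \]
Similarly, we obtain
\[ \hat n(\mathbf{x},t) \equiv a^5(\mathbf{x},t) = \langle\hat\chi^n|\nu\rangle = \int d^3v\, \delta f(\mathbf{x},\mathbf{v},t)\,\frac{1}{\sqrt{n/kT}} = \frac{\delta n}{\sqrt{n/kT}} , \tag{9.4.18b} \]
and
\[ \hat u_i(\mathbf{x},t) \equiv a^i(\mathbf{x},t) = \langle\hat\chi^{u_i}|\nu\rangle = \int d^3v\, \delta f(\mathbf{x},\mathbf{v},t)\,\frac{v_i}{\sqrt{n/m}} = \int d^3v\, (f^0 + \delta f)\,\frac{v_i}{\sqrt{n/m}} = \frac{n u_i(\mathbf{x},t)}{\sqrt{n/m}} , \quad i = 1, 2, 3 . \tag{9.4.18c} \]
These expressions show the relations to the density and momentum fluctuations. We now insert the expansion (9.4.17) into the linearized Boltzmann equation (9.4.6):
\[ \left[\frac{\partial}{\partial t} + \mathbf{v}\nabla\right]\nu(\mathbf{x},\mathbf{v},t) = -\sum_{\lambda'=6}^{\infty} a^{\lambda'}(\mathbf{x},t)\,\omega_{\lambda'}\,\hat\chi^{\lambda'}(\mathbf{v}) - \mathbf{v}\cdot\nabla V(\mathbf{x},t) . \tag{9.4.20} \]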

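Only terms with $\lambda' \ge 6$ contribute to the sum, since the collisional invariants have the eigenvalue 0. Multiplying this equation by $\hat\chi^\lambda f^0(\mathbf{v})/kT$ and integrating over $\mathbf{v}$, we obtain, using the orthonormalization of $\hat\chi^\lambda$ from Eq. (9.4.15),
\[ \frac{\partial}{\partial t}a^\lambda(\mathbf{x},t) + \nabla\cdot\sum_{\lambda'=1}^{\infty} \langle\hat\chi^\lambda|\mathbf{v}\,\hat\chi^{\lambda'}\rangle\, a^{\lambda'}(\mathbf{x},t) = -\omega_\lambda a^\lambda(\mathbf{x},t) - \langle\hat\chi^\lambda|\mathbf{v}\rangle\cdot\nabla V(\mathbf{x},t) . \tag{9.4.21} \]
Fourier transformation
\[ a^\lambda(\mathbf{x},t) = \int \frac{d^3k}{(2\pi)^3}\int\frac{d\omega}{2\pi}\; e^{i(\mathbf{k}\cdot\mathbf{x}-\omega t)}\, a^\lambda(\mathbf{k},\omega) \tag{9.4.22} \]
yields
\[ (\omega + i\omega_\lambda)\,a^\lambda(\mathbf{k},\omega) - \mathbf{k}\cdot\sum_{\lambda'=1}^{\infty} \langle\hat\chi^\lambda|\mathbf{v}\,\hat\chi^{\lambda'}\rangle\, a^{\lambda'}(\mathbf{k},\omega) - \mathbf{k}\cdot\langle\hat\chi^\lambda|\mathbf{v}\rangle\, V(\mathbf{k},\omega) = 0 . \tag{9.4.23} \]
Which quantities couple to each other depends on the scalar products $\langle\hat\chi^\lambda|\mathbf{v}\,\hat\chi^{\lambda'}\rangle$, whereby the symmetry of the $\hat\chi^\lambda$ clearly plays a role. Since $\omega_\lambda = 0$ for the modes $\lambda = 1$ to 5, i.e. momentum, energy, and particle-number density, the structure of the conservation laws for these quantities in (9.4.23) can already be recognized at this stage. The term containing the external force obviously couples only to $\hat\chi^i \equiv \hat\chi^{u_i}$ for reasons of symmetry:
\[ \langle\hat\chi^i|v_j\rangle = \frac{\langle v_i|v_j\rangle}{\sqrt{n/m}} = \sqrt{n/m}\;\delta_{ij} . \tag{9.4.24} \]
For the modes with $\lambda \le 5$,
\[ \omega\, a^\lambda(\mathbf{k},\omega) - \mathbf{k}\cdot\sum_{\lambda'=1}^{\infty} \langle\hat\chi^\lambda|\mathbf{v}\,\hat\chi^{\lambda'}\rangle\, a^{\lambda'}(\mathbf{k},\omega) - \mathbf{k}\cdot\langle\hat\chi^\lambda|\mathbf{v}\rangle\, V(\mathbf{k},\omega) = 0 \tag{9.4.25} \]
holds, and for the non-conserved degrees of freedom$^{17}$ $\lambda \ge 6$, we have
\[ a^\lambda(\mathbf{k},\omega) = \frac{k_i}{\omega + i\omega_\lambda}\left[\sum_{\lambda'=1}^{5} \langle\hat\chi^\lambda|v_i\hat\chi^{\lambda'}\rangle\, a^{\lambda'}(\mathbf{k},\omega) + \sum_{\lambda'=6}^{\infty} \langle\hat\chi^\lambda|v_i\hat\chi^{\lambda'}\rangle\, a^{\lambda'}(\mathbf{k},\omega) + \langle\hat\chi^\lambda|v_i\rangle\, V(\mathbf{k},\omega)\right] . \tag{9.4.26} \]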

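This difference, which results from the different time scales, forms the basis for the elimination of the non-conserved degrees of freedom.

17 Here, the Einstein summation convention is employed: repeated indices $i, j, l, r$ are to be summed over from 1 to 3.

9.4.4 The Hydrodynamic Limit

For low frequencies ($\omega \ll \omega_\lambda$) and ($vk \ll \omega_\lambda$), $a^\lambda(\mathbf{k},\omega)$ with $\lambda \ge 6$ is of higher order in these quantities than are the conserved quantities $\lambda = 1, \ldots, 5$. Therefore, in leading order we can write for (9.4.26)
\[ a^\lambda(\mathbf{k},\omega) = -\frac{i k_i}{\omega_\lambda}\left[\sum_{\lambda'=1}^{5} \langle\hat\chi^\lambda|v_i\hat\chi^{\lambda'}\rangle\, a^{\lambda'}(\mathbf{k},\omega) + \langle\hat\chi^\lambda|v_i\rangle\, V(\mathbf{k},\omega)\right] . \tag{9.4.27} \]
Inserting this into (9.4.25) for the conserved (also called the hydrodynamic) variables, we find
\[ \omega\, a^\lambda(\mathbf{k},\omega) - k_i\sum_{\lambda'=1}^{5} \langle\hat\chi^\lambda|v_i\hat\chi^{\lambda'}\rangle\, a^{\lambda'}(\mathbf{k},\omega) + i k_i k_j \sum_{\lambda'=1}^{5}\sum_{\mu=6}^{\infty} \langle\hat\chi^\lambda|v_i\hat\chi^{\mu}\rangle\,\frac{1}{\omega_\mu}\,\langle\hat\chi^\mu|v_j\hat\chi^{\lambda'}\rangle\, a^{\lambda'}(\mathbf{k},\omega) \]
\[ {} - k_i\,\langle\hat\chi^\lambda|v_i\rangle\, V(\mathbf{k},\omega) - k_i\sum_{\lambda'=6}^{\infty} \langle\hat\chi^\lambda|v_i\hat\chi^{\lambda'}\rangle \left(\frac{-i k_j}{\omega_{\lambda'}}\right) \langle\hat\chi^{\lambda'}|v_j\rangle\, V(\mathbf{k},\omega) = 0 ; \tag{9.4.28} \]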

this is a closed system of hydrodynamic equations of motion. The second term in these equations leads to motions which propagate like sound waves, the third term to damping of these oscillations. The latter results formally from the elimination of the infinite number of non-conserved variables, which was possible due to the separation of the time scales of the hydrodynamic variables (typical frequency $ck$, $Dk^2$) from that of the non-conserved variables (typical frequency $\omega_\mu \propto \tau^{-1}$). The structure which is visible in Eq. (9.4.28) is of a very general nature and can be derived from the Boltzmann equations for other physical systems, such as phonons and electrons or magnons in solids.

Now we want to further evaluate Eq. (9.4.28) for a dilute gas without the effect of an external potential. We first compute the scalar products in the second term (see Eqns. (9.4.16a–c)):
\[ \langle\hat\chi^n|v_i\hat\chi^j\rangle = \int d^3v\, \frac{f^0(\mathbf{v})}{kT}\, \frac{v_i v_j}{\sqrt{n^2/(kT\,m)}} = \delta_{ij}\sqrt{\frac{kT}{m}} , \tag{9.4.29a} \]
\[ \langle\hat\chi^T|v_i\hat\chi^j\rangle = \int d^3v\, \frac{f^0(\mathbf{v})}{kT}\, \frac{v_i v_j\left(\frac{m\mathbf{v}^2}{2} - \frac{3}{2}kT\right)}{\sqrt{\frac{3}{2}\,nkT\;\frac{n}{m}}} = \delta_{ij}\sqrt{\frac{2kT}{3m}} . \tag{9.4.29b} \]
These scalar products and $\langle\hat\chi^j|v_i\hat\chi^{n,T}\rangle = \langle\hat\chi^{n,T}|v_i\hat\chi^j\rangle$ are the only finite scalar products which result from the flow term in the equation of motion.

We now proceed to analyze the equations of motion for the particle-number density, the energy density, and the velocity. In the equation of motion for the particle-number density, $\lambda \equiv 5$ in (9.4.28), there is a coupling to $a^i(\mathbf{k},\omega)$ due to the second term. As noted above, all the other scalar products vanish. The third term vanishes completely, since $\langle\hat\chi^n|v_i\hat\chi^\mu\rangle \propto \langle v_i|\hat\chi^\mu\rangle = 0$ for $\mu \ge 6$ owing to the orthonormalization. We thus find
\[ \omega\,\hat n(\mathbf{k},\omega) - k_i\sqrt{\frac{kT}{m}}\;\hat u^i(\mathbf{k},\omega) = 0 , \tag{9.4.30} \]
or, due to (9.4.18),
\[ \omega\,\delta n(\mathbf{k},\omega) - k_i\, n u_i(\mathbf{k},\omega) = 0 , \tag{9.4.30$'$} \]
or in real space
\[ \frac{\partial}{\partial t}n(\mathbf{x},t) + \nabla\cdot n\mathbf{u}(\mathbf{x},t) = 0 . \tag{9.4.30$''$} \]

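This equation of motion is identical with the equation of continuity for the density, (9.3.24), except that here, $n(\mathbf{x},t)$ in the gradient term is replaced by $n$ because of the linearization.

The equation of motion for the local temperature, making use of (9.4.28), (9.4.18a), and (9.4.29b), can be cast in the form
\[ \omega\sqrt{\frac{3n}{2kT}}\;k\,\delta T(\mathbf{k},\omega) - k_i\sqrt{\frac{2kT}{3m}}\;\frac{n u_i(\mathbf{k},\omega)}{\sqrt{n/m}} + i k_i k_j\sum_{\lambda'=1}^{5}\sum_{\mu=6}^{\infty} \langle\hat\chi^4|v_i\hat\chi^\mu\rangle\,\frac{1}{\omega_\mu}\,\langle\hat\chi^\mu|v_j\hat\chi^{\lambda'}\rangle\, a^{\lambda'}(\mathbf{k},\omega) = 0 . \tag{9.4.31} \]
In the sum over $\lambda'$, the term $\lambda' = 5$ makes no contribution, since $\langle\hat\chi^\mu|v_j\hat\chi^5\rangle \propto \langle\hat\chi^\mu|v_j\rangle = 0$. Due to the fact that $\hat\chi^4$ transforms as a scalar, $\hat\chi^\mu$ must transform like $v_i$, so that due to the second factor, $\hat\chi^{\lambda'} = \hat\chi^i$ also makes no contribution, leaving only $\hat\chi^{\lambda'} = \hat\chi^4$. Finally, only the following expression remains from the third term of Eq. (9.4.31):
\[ i k_i k_j\sum_{\mu=6}^{\infty} \langle\hat\chi^4|v_i\hat\chi^\mu\rangle\,\frac{1}{\omega_\mu}\,\langle\hat\chi^\mu|v_j\hat\chi^4\rangle\, a^4(\mathbf{k},\omega) \approx i k_i k_j\,\tau\sum_{\mu=6}^{\infty} \langle\hat\chi^4|v_i\hat\chi^\mu\rangle\langle\hat\chi^\mu|v_j\hat\chi^4\rangle\, a^4(\mathbf{k},\omega) \]
\[ = i k_i k_j\,\tau\left(\langle\hat\chi^4|v_i v_j\hat\chi^4\rangle - \sum_{\lambda=1}^{5} \langle\hat\chi^4|v_i\hat\chi^\lambda\rangle\langle\hat\chi^\lambda|v_j\hat\chi^4\rangle\right) a^4(\mathbf{k},\omega) \]
\[ = i k_i k_j\,\tau\left(\langle\hat\chi^4|v_i v_j\hat\chi^4\rangle - \langle\hat\chi^4|v_i\hat\chi^i\rangle\langle\hat\chi^i|v_j\hat\chi^4\rangle\right) a^4(\mathbf{k},\omega) . \tag{9.4.32} \]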

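In this expression, all the $\omega_\mu^{-1}$ were replaced by the collision time, $\omega_\mu^{-1} = \tau$, and we have employed the completeness relation for the eigenfunctions of $L$ as well as the symmetry properties. We now have
\[ \langle\hat\chi^4|v_i\hat\chi^i\rangle = \sqrt{\frac{2kT}{3m}} , \tag{9.4.33a} \]
where here, we do not sum over $i$, and
\[ \langle\hat\chi^4|v_i v_j\hat\chi^4\rangle = \delta_{ij}\,\frac{1}{3}\int d^3v\, f^0(\mathbf{v})\,\mathbf{v}^2\; \frac{\left(\frac{m\mathbf{v}^2}{2}\right)^2 - \frac{m\mathbf{v}^2}{2}\,3kT + \left(\frac{3}{2}kT\right)^2}{\frac{3}{2}\,n(kT)^2} = \delta_{ij}\,\frac{7kT}{3m} . \tag{9.4.33b} \]
Thus the third term in Eq. (9.4.31) becomes $i k^2 D\sqrt{3n/2kT}\;k\,\delta T$, with the coefficient
\[ D \equiv \frac{5}{3}\,\frac{kT\tau}{m} = \frac{\kappa}{m c_v} , \tag{9.4.34} \]
where
\[ c_v = \frac{3}{2}\,nk \tag{9.4.35} \]
is the specific heat at constant volume, and
\[ \kappa = \frac{5}{2}\,nk^2 T\tau \tag{9.4.36} \]
refers to the heat conductivity. All together, using (9.4.32)–(9.4.34), we obtain for the equation of motion (9.4.31) of the local temperature
\[ \omega\, a^4(\mathbf{k},\omega) - k_i\sqrt{\frac{2kT}{3m}}\; a^i(\mathbf{k},\omega) + i k^2 D\, a^4(\mathbf{k},\omega) = 0 , \tag{9.4.37} \]
or
\[ \omega\,\delta T - \frac{2T}{3n}\,\mathbf{k}\cdot n\mathbf{u} + i k^2 D\,\delta T = 0 , \tag{9.4.37$'$} \]
or in real space,
\[ \frac{\partial}{\partial t}T(\mathbf{x},t) + \frac{2T}{3n}\,\nabla\cdot n\mathbf{u}(\mathbf{x},t) - D\nabla^2 T(\mathbf{x},t) = 0 . \tag{9.4.37$''$} \]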
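The numerical factors $7/3$ in (9.4.33b) and $5/3$ in (9.4.34) can be reproduced with exact rational arithmetic. The sketch below is an illustration added here, not the author's calculation; it relies only on the standard Maxwell moments $\langle\epsilon^k\rangle = (kT)^k\prod_{j<k}(j+\tfrac{3}{2})$ and on the angular average $\langle v_i v_j g(v)\rangle = \tfrac{1}{3}\delta_{ij}\langle \mathbf{v}^2 g(v)\rangle$. The function and variable names are hypothetical.

```python
from fractions import Fraction

def energy_moment(k):
    """<eps^k> / (kT)^k for a Maxwell distribution: product of (j + 3/2), j < k."""
    result = Fraction(1)
    for j in range(k):
        result *= Fraction(2 * j + 3, 2)
    return result

# <eps (eps - 3kT/2)^2> in units of (kT)^3
second_moment = (energy_moment(3) - 3 * energy_moment(2)
                 + Fraction(9, 4) * energy_moment(1))

# Coefficient of delta_ij in <chi4|v_i v_j chi4>, in units of kT/m:
# (1/3) * (2/m) * <eps (eps - 3kT/2)^2> / ((3/2) (kT)^2)
chi4_vv_chi4 = Fraction(1, 3) * 2 * second_moment / Fraction(3, 2)

# <chi4|v_i chi^i>^2 in units of kT/m, from Eq. (9.4.33a)
chi4_v_chii_sq = Fraction(2, 3)

# D = d_coeff * kT * tau / m, Eq. (9.4.34)
d_coeff = chi4_vv_chi4 - chi4_v_chii_sq

# kappa / (m c_v) in the same units, from Eqs. (9.4.35) and (9.4.36)
kappa_over_m_cv = Fraction(5, 2) / Fraction(3, 2)

print(chi4_vv_chi4, d_coeff, kappa_over_m_cv)   # prints: 7/3 5/3 5/3
```

The agreement of the last two numbers is the statement $D = \kappa/(m c_v)$.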

Connection with phenomenological considerations: The time variation of the quantity of heat $\delta Q$ is
\[ \delta\dot Q = -\nabla\cdot\mathbf{j}_Q \tag{9.4.38a} \]
with the heat current density $\mathbf{j}_Q$. In local equilibrium, the thermodynamic relation
\[ \delta Q = c_P\,\delta T \tag{9.4.38b} \]
holds. Here, the specific heat at constant pressure appears, because heat diffusion is isobaric owing to $c_s k \gg D_s k^2$ in the limit of small wavenumbers, with the velocity of sound $c_s$ and the thermal diffusion constant $D_s$. The heat current flows in the direction of decreasing temperature, which implies
\[ \mathbf{j}_Q = -\frac{\kappa}{m}\,\nabla T \tag{9.4.38c} \]
with the thermal conductivity $\kappa$. Overall, we thus obtain
\[ \frac{d}{dt}T = \frac{\kappa}{m c_P}\,\nabla^2 T , \tag{9.4.38d} \]
a diffusion equation for the temperature.


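Finally, we determine the equation of motion of the momentum density, i.e. for $a^j$, $j = 1, 2, 3$. For the reversible terms (the first and second terms in Eq. (9.4.28)), we find by employing (9.4.18b–c) and $\langle\hat\chi^j|v_i\hat\chi^j\rangle = 0$ the result
\[ \omega\, a^j(\mathbf{k},\omega) - k_i\Bigl(\langle\hat\chi^j|v_i\hat\chi^5\rangle\, a^5(\mathbf{k},\omega) + \langle\hat\chi^j|v_i\hat\chi^4\rangle\, a^4(\mathbf{k},\omega)\Bigr) \]
\[ = \sqrt{\frac{m}{n}}\left[\omega\, n u_j(\mathbf{k},\omega) - k_j\,\frac{kT}{m}\,\delta n(\mathbf{k},\omega) - k_j\,\frac{n}{m}\,k\,\delta T(\mathbf{k},\omega)\right] = \sqrt{\frac{m}{n}}\left[\omega\, n u_j(\mathbf{k},\omega) - \frac{1}{m}\,k_j\,\delta P(\mathbf{k},\omega)\right] , \tag{9.4.39} \]
where, from $P(\mathbf{x},t) = n(\mathbf{x},t)\,kT(\mathbf{x},t) = \bigl(n + \delta n(\mathbf{x},t)\bigr)\,k\,\bigl(T + \delta T(\mathbf{x},t)\bigr)$, it follows that $\delta P = nk\,\delta T + kT\,\delta n$, which was used above. For the damping term in the equation of motion of the momentum density, we obtain from (9.4.28), using the approximation $\omega_\mu = 1/\tau$, the result:
\[ i k_i k_l \sum_{\lambda'=1}^{5}\sum_{\mu=6}^{\infty} \langle\hat\chi^j|v_i\hat\chi^\mu\rangle\,\frac{1}{\omega_\mu}\,\langle\hat\chi^\mu|v_l\hat\chi^{\lambda'}\rangle\, a^{\lambda'}(\mathbf{k},\omega) = i k_i k_l \sum_{\mu=6}^{\infty} \Bigl\langle\frac{v_j}{\sqrt{n/m}}\Bigm| v_i\hat\chi^\mu\Bigr\rangle\,\frac{1}{\omega_\mu}\,\Bigl\langle\hat\chi^\mu\Bigm| v_l\frac{v_r}{\sqrt{n/m}}\Bigr\rangle\, a^r(\mathbf{k},\omega) \]
\[ \approx i k_i k_l\,\tau\left\{\Bigl\langle\frac{v_j}{\sqrt{n/m}}\Bigm| v_i v_l\Bigm|\frac{v_r}{\sqrt{n/m}}\Bigr\rangle - \sum_{\lambda=1}^{5} \Bigl\langle\frac{v_j}{\sqrt{n/m}}\Bigm| v_i\hat\chi^\lambda\Bigr\rangle\Bigl\langle\hat\chi^\lambda\Bigm| v_l\frac{v_r}{\sqrt{n/m}}\Bigr\rangle\right\} a^r(\mathbf{k},\omega) . \tag{9.4.40} \]
In the second line, we have used the fact that the sum over $\lambda'$ reduces to $r = 1, 2, 3$. For the first term in the curved brackets we obtain:
\[ \Bigl\langle\frac{v_j}{\sqrt{n/m}}\Bigm| v_i v_l\Bigm|\frac{v_r}{\sqrt{n/m}}\Bigr\rangle = \frac{m}{nkT}\int d^3v\, f^0(\mathbf{v})\, v_j v_i v_l v_r = \frac{kT}{m}\,(\delta_{ji}\delta_{lr} + \delta_{jl}\delta_{ir} + \delta_{jr}\delta_{il}) . \]
For the second term in the curved brackets in (9.4.40), we need the results of problem 9.12, leading to $\delta_{ij}\delta_{lr}\,\frac{5kT}{3m}$. As a result, the overall damping term (9.4.40) is given by
\[ i k_i k_l \sum_{\lambda'=1}^{5}\sum_{\mu=6}^{\infty} \langle\hat\chi^j|v_i\hat\chi^\mu\rangle\,\frac{1}{\omega_\mu}\,\langle\hat\chi^\mu|v_l\hat\chi^{\lambda'}\rangle\, a^{\lambda'}(\mathbf{k},\omega) = i k_i k_l\,\tau\,\frac{kT}{m}\left(\delta_{ji}\delta_{lr} + \delta_{jl}\delta_{ir} + \delta_{jr}\delta_{il} - \frac{5}{3}\,\delta_{ij}\delta_{lr}\right) a^r(\mathbf{k},\omega) \]
\[ = i\left(-\frac{2}{3}\,k_j k_l u_l(\mathbf{k},\omega) + k_i k_j u_i(\mathbf{k},\omega) + k_i k_i u_j(\mathbf{k},\omega)\right)\tau kT\sqrt{\frac{n}{m}} = i\left(\frac{1}{3}\,k_j\,\mathbf{k}\cdot\mathbf{u}(\mathbf{k},\omega) + k^2 u_j(\mathbf{k},\omega)\right)\tau kT\sqrt{\frac{n}{m}} . \tag{9.4.40$'$} \]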
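The index contraction that leads to the last line of (9.4.40′) is easy to get wrong by hand. As a hedged check, the sketch below contracts the tensor $\delta_{ji}\delta_{lr} + \delta_{jl}\delta_{ir} + \delta_{jr}\delta_{il} - \tfrac{5}{3}\delta_{ij}\delta_{lr}$ with $k_i k_l a^r$ by brute force and compares it with the closed form $\tfrac{1}{3}k_j(\mathbf{k}\cdot\mathbf{a}) + k^2 a_j$. The common prefactor $i\tau kT/m$ is left out, and the vectors are random test data, not physical values.

```python
import random

def delta(a, b):
    """Kronecker delta."""
    return 1.0 if a == b else 0.0

def damping_contraction(k, a):
    """k_i k_l (d_ji d_lr + d_jl d_ir + d_jr d_il - 5/3 d_ij d_lr) a_r for j = 0, 1, 2."""
    out = []
    for j in range(3):
        total = 0.0
        for i in range(3):
            for l in range(3):
                for r in range(3):
                    tensor = (delta(j, i) * delta(l, r)
                              + delta(j, l) * delta(i, r)
                              + delta(j, r) * delta(i, l)
                              - 5.0 / 3.0 * delta(i, j) * delta(l, r))
                    total += k[i] * k[l] * tensor * a[r]
        out.append(total)
    return out

def closed_form(k, a):
    """(1/3) k_j (k . a) + k^2 a_j, the bracket in the last line of (9.4.40')."""
    k_dot_a = sum(ki * ai for ki, ai in zip(k, a))
    k_sq = sum(ki * ki for ki in k)
    return [k[j] * k_dot_a / 3.0 + k_sq * a[j] for j in range(3)]

rng = random.Random(0)                      # fixed seed for reproducibility
k_vec = [rng.uniform(-1, 1) for _ in range(3)]
a_vec = [rng.uniform(-1, 1) for _ in range(3)]
lhs = damping_contraction(k_vec, a_vec)
rhs = closed_form(k_vec, a_vec)
print(max(abs(x - y) for x, y in zip(lhs, rhs)))   # round-off level
```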

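Defining the shear viscosity as
\[ \eta \equiv n\tau kT , \tag{9.4.41} \]
we find with (9.4.39) and (9.4.40′) the following equivalent forms of the equation of motion for the momentum density:
\[ \omega\, n u_j(\mathbf{k},\omega) - \frac{1}{m}\,k_j\,\delta P(\mathbf{k},\omega) + i\,\frac{\eta}{m}\left[\frac{1}{3}\,k_j\,\mathbf{k}\cdot\mathbf{u}(\mathbf{k},\omega) + \mathbf{k}^2 u_j(\mathbf{k},\omega)\right] = 0 , \tag{9.4.42} \]
or, in terms of space and time,
\[ \frac{\partial}{\partial t}\, m n u_j(\mathbf{x},t) + \nabla_j P(\mathbf{x},t) - \eta\left[\frac{1}{3}\,\nabla_j\,\nabla\cdot\mathbf{u}(\mathbf{x},t) + \nabla^2 u_j(\mathbf{x},t)\right] = 0 \tag{9.4.42$'$} \]
or
\[ \frac{\partial}{\partial t}\, m n u_j(\mathbf{x},t) + P_{jk,k}(\mathbf{x},t) = 0 \tag{9.4.42$''$} \]
with the pressure tensor ($P_{jk,k} \equiv \nabla_k P_{jk}$, etc.)
\[ P_{jk}(\mathbf{x},t) = \delta_{jk}P(\mathbf{x},t) - \eta\left(u_{j,k}(\mathbf{x},t) + u_{k,j}(\mathbf{x},t) - \frac{2}{3}\,\delta_{jk}\,u_{l,l}(\mathbf{x},t)\right) . \tag{9.4.43} \]
We can compare this result with the general pressure tensor of hydrodynamics:
\[ P_{jk}(\mathbf{x},t) = \delta_{jk}P(\mathbf{x},t) - \eta\left(u_{j,k}(\mathbf{x},t) + u_{k,j}(\mathbf{x},t) - \frac{2}{3}\,\delta_{jk}\,u_{l,l}(\mathbf{x},t)\right) - \zeta\,\delta_{jk}\,u_{l,l}(\mathbf{x},t) . \tag{9.4.44} \]
Here, $\zeta$ is the bulk viscosity, also called the compressional viscosity. Comparing (9.4.43) with (9.4.44) shows that the bulk viscosity vanishes according to the Boltzmann equation for simple monatomic gases. The expression (9.4.41) for the viscosity can also be written in the following form (see Eqns. (9.2.12) and (9.2.13)):
\[ \eta = \tau n kT = \tau n\,\frac{m v_{\rm th}^2}{3} = \frac{1}{3}\,n m v_{\rm th}\, l = \frac{m v_{\rm th}}{3\sigma_{\rm tot}} , \tag{9.4.45} \]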

where $v_{\rm th} = \sqrt{3kT/m}$ is the thermal velocity from the Maxwell distribution; i.e. the viscosity is independent of the density.

It is instructive to write the hydrodynamic equations in terms of the normalized functions $\hat n = \delta n/\sqrt{n/kT}$, etc. instead of the usual quantities $n(\mathbf{x},t)$, $T(\mathbf{x},t)$, $u_i(\mathbf{x},t)$. From Eqns. (9.4.30), (9.4.37), and (9.4.42) it follows that
\[ \dot{\hat n}(\mathbf{x},t) = -c_n\nabla_i\hat u^i(\mathbf{x},t) \tag{9.4.46a} \]
\[ \dot{\hat T}(\mathbf{x},t) = -c_T\nabla_i\hat u^i(\mathbf{x},t) + D\nabla^2\hat T(\mathbf{x},t) \tag{9.4.46b} \]
\[ \dot{\hat u}^i(\mathbf{x},t) = -c_n\nabla_i\hat n - c_T\nabla_i\hat T + \frac{\eta}{mn}\,\nabla^2\hat u^i + \frac{\eta}{3mn}\,\nabla_i(\nabla\cdot\hat{\mathbf{u}}) \tag{9.4.46c} \]
with the coefficients $c_n = \sqrt{kT/m}$, $c_T = \sqrt{2kT/3m}$, and $D$ and $\eta$ from Eqns. (9.4.34) and (9.4.41). Note that with the orthonormalized quantities, the coupling of the degrees of freedom in the equations of motion is symmetric.

9.4.5 Solutions of the Hydrodynamic Equations

The periodic solutions of (9.4.46a–c), which can be found using the ansatz $\hat n(\mathbf{x},t) \propto \hat u_i(\mathbf{x},t) \propto \hat T(\mathbf{x},t) \propto e^{i(\mathbf{k}\cdot\mathbf{x}-\omega t)}$, are particularly interesting. The acoustic resonances which follow from the resulting secular determinant and the thermal diffusion modes have the frequencies
\[ \omega = \pm c_s k - \frac{i}{2}\,D_s k^2 \tag{9.4.47a} \]
\[ \omega = -i D_T k^2 \tag{9.4.47b} \]
with the sound velocity $c_s$, the acoustic attenuation constant $D_s$, and the heat diffusion constant (thermal diffusivity) $D_T$:
\[ c_s = \sqrt{c_n^2 + c_T^2} = \sqrt{\frac{5}{3}\,\frac{kT}{m}} \equiv \frac{1}{\sqrt{mn\kappa_s}} \tag{9.4.48a} \]
\[ D_s = \frac{4\eta}{3mn} + \frac{\kappa}{mn}\left(\frac{1}{c_v} - \frac{1}{c_P}\right) \tag{9.4.48b} \]
\[ D_T = D\,\frac{c_v}{c_P} = \frac{\kappa}{m c_P} . \tag{9.4.48c} \]
In this case, the specific heat at constant pressure enters; for an ideal gas, it is given by
\[ c_P = \frac{5}{2}\,nk . \tag{9.4.49} \]
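The mode frequencies (9.4.47a,b) can be compared with the exact roots of the secular equation of (9.4.46a–c). The sketch below is an illustration under stated assumptions: the velocity is taken longitudinal (parallel to $\mathbf{k}$), units are chosen with $kT/m = \tau = 1$, and the damping constants are written in terms of $D$ as $D_s = 4\eta/(3mn) + D(1 - c_v/c_P)$ and $D_T = D\,c_v/c_P$ with $c_v/c_P = 3/5$, which is what first-order perturbation theory in $k$ gives for this system. Eliminating $\hat n$ and $\hat T$ leads to the cubic $\omega(\omega + iDk^2)(\omega + i\Gamma k^2) - c_T^2k^2\,\omega - c_n^2k^2(\omega + iDk^2) = 0$ with $\Gamma = 4\eta/(3mn)$, which is solved here by Newton iteration. All function and variable names are hypothetical.

```python
import math

def hydrodynamic_modes(k, kT_over_m=1.0, tau=1.0):
    """Return (predicted, roots): Eqs. (9.4.47a,b) and Newton roots of the secular cubic."""
    c_n = math.sqrt(kT_over_m)
    c_T = math.sqrt(2.0 * kT_over_m / 3.0)
    D = 5.0 / 3.0 * kT_over_m * tau          # Eq. (9.4.34)
    gamma = 4.0 / 3.0 * kT_over_m * tau      # 4*eta/(3 m n) with eta = n tau kT
    A, B = D * k * k, gamma * k * k
    a2, b2 = (c_n * k) ** 2, (c_T * k) ** 2

    def f(w):                                # secular polynomial
        return w * (w + 1j * A) * (w + 1j * B) - b2 * w - a2 * (w + 1j * A)

    def df(w):                               # its derivative
        return ((w + 1j * A) * (w + 1j * B) + w * (w + 1j * B)
                + w * (w + 1j * A) - b2 - a2)

    c_s = math.sqrt(c_n ** 2 + c_T ** 2)     # Eq. (9.4.48a)
    D_s = gamma + D * (1.0 - 3.0 / 5.0)      # assumed form, c_v/c_P = 3/5
    D_T = D * 3.0 / 5.0                      # Eq. (9.4.48c)
    predicted = [c_s * k - 0.5j * D_s * k * k,
                 -c_s * k - 0.5j * D_s * k * k,
                 -1j * D_T * k * k]
    roots = []
    for w in predicted:                      # Newton iteration from each estimate
        for _ in range(50):
            w = w - f(w) / df(w)
        roots.append(w)
    return predicted, roots

predicted, roots = hydrodynamic_modes(0.01)
for p, r in zip(predicted, roots):
    print(p, r, abs(p - r))
```

For small $k$ the differences are of order $k^3$, consistent with (9.4.47a,b) being the leading terms of an expansion in the wavenumber.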


The two transverse components of the momentum density undergo a purely diffusive shearing motion:
\[ D_\eta = \frac{\eta k^2}{mn} . \tag{9.4.50} \]
The resonances (9.4.47a,b) express themselves for example in the density-density correlation function, $S_{nn}(\mathbf{k},\omega)$. The calculation of dynamic susceptibilities and correlation functions (problem 9.11) starting from equations of motion with damping terms is described in QM II, Sect. 4.7. The coupled system of hydrodynamic equations of motion for the density, the temperature, and the longitudinal momentum density yields the density-density correlation function:
\[ S_{nn}(\mathbf{k},\omega) = 2kT\,n\left(\frac{\partial n}{\partial P}\right)_T \left\{ \frac{\frac{c_v}{c_P}\,(c_s k)^2 D_s k^2 + \left(1 - \frac{c_v}{c_P}\right)(\omega^2 - c_s^2k^2)\,D_T k^2}{(\omega^2 - c_s^2k^2)^2 + (\omega D_s k^2)^2} + \frac{\left(1 - \frac{c_v}{c_P}\right) D_T k^2}{\omega^2 + (D_T k^2)^2} \right\} . \tag{9.4.51} \]
The density-density correlation function for fixed $\mathbf{k}$ is shown schematically as a function of $\omega$ in Fig. 9.3.

Fig. 9.3. The density-density correlation function for fixed $\mathbf{k}$ as a function of $\omega$

The positions of the resonances are determined by the real parts and their widths by the imaginary parts of the frequencies (9.4.47a, b). In addition to the two resonances representing longitudinal acoustic phonons at $\pm c_s k$, one finds a resonance at $\omega = 0$ related to heat diffusion. The area below the curve shown in Fig. 9.3, which determines the overall intensity in inelastic scattering experiments, is proportional to the isothermal compressibility $\left(\frac{\partial n}{\partial P}\right)_T$. The relative strength of the diffusion compared to the two acoustic resonances is given by the ratio of the specific heats, $\frac{c_P - c_V}{c_V}$. This ratio is also called the Landau–Placzek ratio, and the diffusive resonance in $S_{nn}(\mathbf{k},\omega)$ is the Landau–Placzek peak.
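The statements about the total area and the Landau–Placzek ratio can be illustrated by integrating the curly bracket of (9.4.51) numerically. This is a rough sketch with assumed parameter values: units with $kT/m = \tau = 1$, so that $c_s = \sqrt{5/3}$, $D_T = 1$ and $c_v/c_P = 3/5$ for the monatomic ideal gas; the value $D_s = 2$ follows from taking $D_s = 4\eta/(3mn) + D(1 - c_v/c_P)$, which is an assumption about how to read (9.4.48b). The cutoff `w_max` truncates the slowly decaying Lorentzian tails, so the results are only approximate.

```python
import math

def snn_terms(w, k, c_s, D_s, D_T, r):
    """Return (acoustic part, diffusive part) of the curly bracket in Eq. (9.4.51).

    r stands for the ratio c_v / c_P.
    """
    w0_sq = (c_s * k) ** 2
    g_s, g_T = D_s * k * k, D_T * k * k
    acoustic = ((r * w0_sq * g_s + (1.0 - r) * (w * w - w0_sq) * g_T)
                / ((w * w - w0_sq) ** 2 + (w * g_s) ** 2))
    diffusive = (1.0 - r) * g_T / (w * w + g_T ** 2)
    return acoustic, diffusive

def peak_areas(k=0.1, w_max=50.0, n=100000):
    """Simpson integration of both parts over [-w_max, w_max] (n must be even)."""
    c_s, D_s, D_T, r = math.sqrt(5.0 / 3.0), 2.0, 1.0, 3.0 / 5.0
    h = 2.0 * w_max / n
    area_a = area_d = 0.0
    for i in range(n + 1):
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        a, d = snn_terms(-w_max + i * h, k, c_s, D_s, D_T, r)
        area_a += weight * a
        area_d += weight * d
    return area_a * h / 3.0, area_d * h / 3.0

area_acoustic, area_diffusive = peak_areas()
print((area_acoustic + area_diffusive) / math.pi)   # close to 1
print(area_diffusive / area_acoustic)               # close to (c_P - c_v)/c_v = 2/3
```

The total area of the bracket is $\pi$ independently of the damping constants, which is why the integrated intensity measures $(\partial n/\partial P)_T$ alone.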


Since the specific heat at constant pressure diverges as $(T - T_c)^{-\gamma}$, while that at constant volume diverges only as $(T - T_c)^{-\alpha}$ (p. 256, p. 255), this ratio becomes increasingly large on approaching $T_c$. The expression (9.4.51), valid in the limit of small $k$ (scattering in the forward direction), exhibits the phenomenon of critical opalescence, as a result of $(\partial n/\partial P)_T \propto (T - T_c)^{-\gamma}$.

∗ 9.5 Supplementary Remarks

9.5.1 Relaxation-Time Approximation

The general evaluation of the eigenvalues and eigenfunctions of the linear collision operator is complicated. On the other hand, since not all the eigenfunctions contribute to a particular diffusion process and certainly the ones with the largest weight are those whose eigenvalues $\omega_\lambda$ are especially small, we can as an approximation attempt to characterize the collision term through only one characteristic frequency,
\[ \left[\frac{\partial}{\partial t} + \mathbf{v}\nabla\right] f(\mathbf{x},\mathbf{v},t) = -\frac{1}{\tau}\bigl(f(\mathbf{x},\mathbf{v},t) - f^{\ell}(\mathbf{x},\mathbf{v},t)\bigr) . \tag{9.5.1} \]
This approximation is called the conserved relaxation time approximation, since the right-hand side represents the difference between the distribution function and a local Maxwell distribution $f^\ell$. This takes into account the fact that the collision term vanishes when the distribution function is equal to the local Maxwell distribution. The local quantities $n(\mathbf{x},t)$, $u_i(\mathbf{x},t)$ and $e(\mathbf{x},t)$ which occur in $f^\ell(\mathbf{x},\mathbf{v},t)$ can be calculated from $f(\mathbf{x},\mathbf{v},t)$ using Eqns. (9.3.15a), (9.3.15b), and (9.3.15c′). Our goal is now to calculate $f$ or $f - f^\ell$. We write
\[ \left[\frac{\partial}{\partial t} + \mathbf{v}\nabla\right](f - f^\ell) + \left[\frac{\partial}{\partial t} + \mathbf{v}\nabla\right] f^\ell = -\frac{1}{\tau}\,(f - f^\ell) . \tag{9.5.2} \]
In the hydrodynamic region, $\omega\tau \ll 1$, $vk\tau \ll 1$, we can neglect the first term on the left-hand side of (9.5.2) compared to the term on the right side, obtaining $f - f^\ell = -\tau\left[\frac{\partial}{\partial t} + \mathbf{v}\nabla\right] f^\ell$. Therefore, the distribution function has the form
\[ f = f^\ell - \tau\left[\frac{\partial}{\partial t} + \mathbf{v}\nabla\right] f^\ell , \tag{9.5.3} \]
and, using this result, one can again calculate the current densities, in an extension of Sect. 9.3.5.2. In zeroth order, we obtain the expressions found in (9.3.35) and (9.3.36) for the reversible parts of the pressure tensor and the remaining current densities. The second term gives additional contributions to the pressure tensor, and also yields a finite heat current. Since $f^\ell$ depends on $\mathbf{x}$ and $t$ only through the three functions $n(\mathbf{x},t)$, $T(\mathbf{x},t)$, and $\mathbf{u}(\mathbf{x},t)$, the second term depends on these and their derivatives. The time derivatives of $f^\ell$ or $n$, $T$, and $\mathbf{u}$ can be replaced by the zeroth-order equations of motion. The corrections therefore are of the form $\nabla n(\mathbf{x},t)$, $\nabla T(\mathbf{x},t)$, and $\nabla u_i(\mathbf{x},t)$. Along with the derivatives of $P_{ij}$ and $\mathbf{q}$ which already occur in the equations of motion, the additional terms in the equations are of the type $\tau\nabla^2 T(\mathbf{x},t)$ etc. (See problem 9.13.)

9.5.2 Calculation of $W(\mathbf{v}_1, \mathbf{v}_2; \mathbf{v}_1', \mathbf{v}_2')$

The general results of the Boltzmann equation did not depend on the precise form of the collision probability, but instead only the general relations (9.2.8a–f) were required. For completeness, we give the relation between $W(\mathbf{v}_1, \mathbf{v}_2; \mathbf{v}_1', \mathbf{v}_2')$ and the scattering cross-section for two particles$^{18}$. It is assumed that the two colliding particles interact via a central potential $w(\mathbf{x}_1 - \mathbf{x}_2)$. We treat the scattering process
\[ \mathbf{v}_1, \mathbf{v}_2 \Rightarrow \mathbf{v}_1', \mathbf{v}_2' , \]
in which particles 1 and 2, with velocities $\mathbf{v}_1$ and $\mathbf{v}_2$ before the collision, are left with the velocities $\mathbf{v}_1'$ and $\mathbf{v}_2'$ following the collision (see Fig. 9.4). The conservation laws for momentum and energy apply; owing to the equality of the two masses, they are given by
\[ \mathbf{v}_1 + \mathbf{v}_2 = \mathbf{v}_1' + \mathbf{v}_2' \tag{9.5.4a} \]
\[ \mathbf{v}_1^2 + \mathbf{v}_2^2 = \mathbf{v}_1'^2 + \mathbf{v}_2'^2 . \tag{9.5.4b} \]

Fig. 9.4. The collision of two particles

18 The theory of scattering in classical mechanics is given for example in L. D. Landau and E. M. Lifshitz, Course of Theoretical Physics, Vol. I: Mechanics, 3rd Ed. (Butterworth–Heinemann, London 1976), or H. Goldstein, Classical Mechanics, 2nd Ed. (Addison–Wesley, New York 1980).


It is expedient to introduce the center-of-mass and relative velocities; before the collision, they are
\[ \mathbf{V} = \frac{1}{2}(\mathbf{v}_1 + \mathbf{v}_2) , \qquad \mathbf{u} = \mathbf{v}_1 - \mathbf{v}_2 , \tag{9.5.5a} \]
and after the collision,
\[ \mathbf{V}' = \frac{1}{2}(\mathbf{v}_1' + \mathbf{v}_2') , \qquad \mathbf{u}' = \mathbf{v}_1' - \mathbf{v}_2' . \tag{9.5.5b} \]
Expressed in terms of these velocities, the two conservation laws have the form
\[ \mathbf{V} = \mathbf{V}' \tag{9.5.6a} \]
and
\[ |\mathbf{u}| = |\mathbf{u}'| . \tag{9.5.6b} \]
In order to recognize the validity of (9.5.6b), one need only subtract the square of (9.5.4a) from two times Eq. (9.5.4b). The center-of-mass velocity does not change as a result of the collision, and the (asymptotic) relative velocity does not change its magnitude, but it is rotated in space. For the velocity transformations to the center-of-mass frame before and after the collision given in (9.5.5a) and (9.5.5b), the volume elements in velocity space obey the relations
\[ d^3v_1\, d^3v_2 = d^3V\, d^3u = d^3V'\, d^3u' = d^3v_1'\, d^3v_2' \tag{9.5.7} \]
due to the fact that the Jacobians have unit value.

The scattering cross-section can be most simply computed in the center-of-mass frame. As is known from classical mechanics,$^{18}$ the relative coordinate $\mathbf{x}$ obeys an equation of motion in which the mass takes the form of a reduced mass $\mu$ (here $\mu = \frac{1}{2}m$) and the potential enters as a central potential $w(\mathbf{x})$. Hence, one obtains the scattering cross-section in the center-of-mass frame from the scattering of a fictitious particle of mass $\mu$ by the potential $w(\mathbf{x})$. We first write down the velocities of the two particles in the center-of-mass frame before and after the collision:
\[ \mathbf{v}_{1s} = \mathbf{v}_1 - \mathbf{V} = \frac{1}{2}\,\mathbf{u} , \quad \mathbf{v}_{2s} = -\frac{1}{2}\,\mathbf{u} , \quad \mathbf{v}_{1s}' = \frac{1}{2}\,\mathbf{u}' , \quad \mathbf{v}_{2s}' = -\frac{1}{2}\,\mathbf{u}' . \tag{9.5.8} \]
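The content of (9.5.5)–(9.5.8) is that an elastic collision of two equal masses leaves $\mathbf{V}$ unchanged and merely rotates $\mathbf{u}$. The following sketch illustrates this with random velocities; it makes no attempt to relate the rotation to an actual potential. The angles `theta` and `phi` are measured with respect to fixed laboratory axes rather than with respect to $\mathbf{u}$ (a simplification made here for brevity), and the function name `collide` is hypothetical.

```python
import math
import random

def collide(v1, v2, theta, phi):
    """Elastic collision of equal masses: keep V, replace u by u' with |u'| = |u|."""
    V = [(a + b) / 2 for a, b in zip(v1, v2)]          # Eq. (9.5.5a)
    u = [a - b for a, b in zip(v1, v2)]
    u_mag = math.sqrt(sum(c * c for c in u))
    # New direction of the relative velocity, given by polar angles in the lab axes
    u_new = [u_mag * math.sin(theta) * math.cos(phi),
             u_mag * math.sin(theta) * math.sin(phi),
             u_mag * math.cos(theta)]
    v1_new = [Vc + uc / 2 for Vc, uc in zip(V, u_new)]  # from Eq. (9.5.8)
    v2_new = [Vc - uc / 2 for Vc, uc in zip(V, u_new)]
    return v1_new, v2_new

rng = random.Random(1)                                   # fixed seed
v1 = [rng.gauss(0, 1) for _ in range(3)]
v2 = [rng.gauss(0, 1) for _ in range(3)]
w1, w2 = collide(v1, v2, rng.uniform(0, math.pi), rng.uniform(0, 2 * math.pi))

momentum_error = max(abs(a + b - c - d) for a, b, c, d in zip(v1, v2, w1, w2))
energy_before = sum(c * c for c in v1) + sum(c * c for c in v2)
energy_after = sum(c * c for c in w1) + sum(c * c for c in w2)
print(momentum_error, abs(energy_before - energy_after))  # both at round-off level
```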

We now recall some concepts from scattering theory. The equivalent potential scattering problem is represented in Fig. 9.5, and we can use it to define the scattering cross-section. The orbital plane of the particle is determined by the asymptotic incoming velocity $\mathbf{u}$ and position of the scattering center O. This follows from the conservation of angular momentum in the central potential. The z-axis of the coordinate system drawn in Fig. 9.5 passes through the scattering center O and is taken to be parallel to $\mathbf{u}$. The orbit of the incoming particle is determined by the collision parameter $s$ and the angle $\varphi$. In Fig. 9.5, the orbital plane which is defined by the angle $\varphi$ lies in the plane of the page.

Fig. 9.5. Scattering by a fixed potential, with collision parameter $s$ and scattering center O. The particles which impinge on the surface element $s\,ds\,d\varphi$ are deflected into the solid angle element $d\Omega$

We consider a uniform beam of particles arriving at various distances $s$ from the axis with the asymptotic incoming velocity $\mathbf{u}$. The intensity $I$ of this beam is defined as the number of particles which impinge per second on one cm$^2$ of the perpendicular surface shown. Letting $n$ be the number of particles per cm$^3$, then $I = n|\mathbf{u}|$. The particles which impinge upon the surface element defined by the collision parameters $s$ and $s + ds$ and the differential element of angle $d\varphi$ are deflected into the solid-angle element $d\Omega$. The number of particles arriving in $d\Omega$ per unit time is denoted by $dN(\Omega)$. The differential scattering cross-section $\sigma(\Omega, u)$, which of course also depends upon $u$, is defined by $dN(\Omega) = I\sigma(\Omega, u)\,d\Omega$, or
\[ \sigma(\Omega, u) = I^{-1}\,\frac{dN(\Omega)}{d\Omega} . \tag{9.5.9} \]
Owing to the cylinder symmetry of the beam around the z-axis, $\sigma(\Omega, u) = \sigma(\vartheta, u)$ is independent of $\varphi$. The scattering cross-section in the center-of-mass system is obtained by making the replacement $u = |\mathbf{v}_1 - \mathbf{v}_2|$. The collision parameter $s$ uniquely determines the orbital curve, and therefore the scattering angle:
\[ dN(\Omega) = I s\,d\varphi\,(-ds) . \tag{9.5.10} \]
From this it follows using $d\Omega = \sin\vartheta\, d\vartheta\, d\varphi$ that
\[ \sigma(\Omega, u) = -\frac{1}{\sin\vartheta}\, s\,\frac{ds}{d\vartheta} = -\frac{1}{\sin\vartheta}\,\frac{1}{2}\,\frac{ds^2}{d\vartheta} . \tag{9.5.11} \]
From $\vartheta(s)$ or $s(\vartheta)$, we obtain the scattering cross-section. The scattering angle $\vartheta$ and the asymptotic angle $\varphi_a$ are related by (cf. Fig. 9.6)
\[ \vartheta = \pi - 2\varphi_a \quad\text{or}\quad \varphi_a = \frac{1}{2}(\pi - \vartheta) . \tag{9.5.12} \]


Fig. 9.6. The scattering angle (deﬂection angle) ϑ and the asymptotic angle ϕa

In classical mechanics, the conservation laws for energy and angular momentum give
\[ \varphi_a = \int_{r_{\min}}^{\infty} dr\, \frac{l/r^2}{\sqrt{2\mu\bigl(E - w(r)\bigr) - \dfrac{l^2}{r^2}}} = \int_{r_{\min}}^{\infty} dr\, \frac{s/r^2}{\sqrt{1 - \dfrac{s^2}{r^2} - \dfrac{2w(r)}{\mu u^2}}} ; \tag{9.5.13} \]
here, we use
\[ l = \mu s u \tag{9.5.14a} \]
to denote the angular momentum and
\[ E = \frac{\mu}{2}\,u^2 \tag{9.5.14b} \]
for the energy, expressed in terms of the asymptotic velocity. The distance $r_{\min}$ of closest approach to the scattering center is determined from the condition ($\dot r = 0$):
\[ w(r_{\min}) + \frac{l^2}{2\mu r_{\min}^2} = E . \tag{9.5.14c} \]
As an example, we consider the scattering of two hard spheres of radius $R$. In this case, we have
\[ s = 2R\sin\varphi_a = 2R\sin\left(\frac{\pi}{2} - \frac{\vartheta}{2}\right) = 2R\cos\frac{\vartheta}{2} , \]
from which, using (9.5.11), we find
\[ \sigma(\vartheta, u) = R^2 . \tag{9.5.15} \]

In this case, the scattering cross-section is independent of the deﬂection angle and of u, which is otherwise not the case, as is known for example from Rutherford scattering18 . After this excursion into classical mechanics, we are in a position to calculate the transition probability W (v, v2 ; v3 , v4 ) for the loss and gain processes in Eqns. (9.2.5) and (9.2.6). To calculate the loss rate, we recall the following assumptions: (i) The forces are assumed to be short-ranged, so that only particles within the same volume element d3 x1 will scatter each other.


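(ii) When particle 1 is scattered, it leaves the velocity element $d^3v_1$.

To calculate the loss rate $l$, we pick out a molecule in $d^3x$ which has the velocity $\mathbf{v}_1$ and take it to be the scattering center on which molecule 2 with velocity $\mathbf{v}_2$ in the velocity element $d^3v_2$ impinges. The flux of such particles is $f(\mathbf{x},\mathbf{v}_2,t)\,|\mathbf{v}_2 - \mathbf{v}_1|\,d^3v_2$. The number of particles which impinge on the surface element $(-s\,ds)\,d\varphi$ per unit time is
\[ f(\mathbf{x},\mathbf{v}_2,t)\,|\mathbf{v}_2 - \mathbf{v}_1|\,d^3v_2\,(-s\,ds)\,d\varphi = f(\mathbf{x},\mathbf{v}_2,t)\,|\mathbf{v}_2 - \mathbf{v}_1|\,d^3v_2\;\sigma(\Omega, |\mathbf{v}_1 - \mathbf{v}_2|)\,d\Omega . \]
In order to obtain the number of collisions which the particles within $d^3x\,d^3v_1$ experience in the time interval $dt$, we have to multiply this result by $f(\mathbf{x},\mathbf{v}_1,t)\,d^3x\,d^3v_1\,dt$ and then integrate over $\mathbf{v}_2$ and all deflection angles $d\Omega$:
\[ l\, d^3x\, d^3v_1\, dt = \int d^3v_2\int d\Omega\; f(\mathbf{x},\mathbf{v}_1,t)\, f(\mathbf{x},\mathbf{v}_2,t)\,|\mathbf{v}_2 - \mathbf{v}_1|\;\sigma\bigl(\Omega, |\mathbf{v}_1 - \mathbf{v}_2|\bigr)\, d^3x\, d^3v_1\, dt . \tag{9.5.16} \]
To calculate the gain rate $g$, we consider scattering processes in which a molecule of given velocity $\mathbf{v}_1'$ is scattered into a state with velocity $\mathbf{v}_1$ by a collision with some other molecule:
\[ g\, d^3x\, d^3v_1\, dt = \int d\Omega\int d^3v_1'\int d^3v_2'\; |\mathbf{v}_1' - \mathbf{v}_2'|\;\sigma\bigl(\Omega, |\mathbf{v}_1' - \mathbf{v}_2'|\bigr)\, f(\mathbf{x},\mathbf{v}_1',t)\, f(\mathbf{x},\mathbf{v}_2',t)\, d^3x\, dt . \tag{9.5.17} \]
The limits of the velocity integrals are chosen so that the velocity $\mathbf{v}_1$ lies within the element $d^3v_1$. Using (9.5.7), we obtain for the right side of (9.5.17)
\[ d^3v_1\int d^3v_2\int d\Omega\; |\mathbf{v}_1 - \mathbf{v}_2|\;\sigma\bigl(\Omega, |\mathbf{v}_1 - \mathbf{v}_2|\bigr)\, f(\mathbf{x},\mathbf{v}_1',t)\, f(\mathbf{x},\mathbf{v}_2',t)\, d^3x\, dt , \]
i.e.
\[ g = \int d^3v_2\int d\Omega\; |\mathbf{v}_1 - \mathbf{v}_2|\;\sigma\bigl(\Omega, |\mathbf{v}_1 - \mathbf{v}_2|\bigr)\, f(\mathbf{x},\mathbf{v}_1',t)\, f(\mathbf{x},\mathbf{v}_2',t) . \tag{9.5.18} \]
Here, we have also taken account of the fact that the scattering cross-section for the scattering of $\mathbf{v}_1', \mathbf{v}_2' \to \mathbf{v}_1, \mathbf{v}_2$ is equal to that for $\mathbf{v}_1, \mathbf{v}_2 \to \mathbf{v}_1', \mathbf{v}_2'$, since the two events can be transformed into one another by a reflection in space and time. As a result, we find for the total collision term:
\[ \left(\frac{\partial f}{\partial t}\right)_{\rm coll} = g - l = \int d^3v_2\int d\Omega\; |\mathbf{v}_2 - \mathbf{v}_1|\;\sigma\bigl(\Omega, |\mathbf{v}_2 - \mathbf{v}_1|\bigr)\,\bigl(f_1' f_2' - f_1 f_2\bigr) . \tag{9.5.19} \]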


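The deflection angle $\vartheta$ can be expressed as follows in terms of the asymptotic relative velocities$^{18}$:
\[ \vartheta = \arccos\frac{(\mathbf{v}_1 - \mathbf{v}_2)\cdot(\mathbf{v}_1' - \mathbf{v}_2')}{|\mathbf{v}_1 - \mathbf{v}_2|\,|\mathbf{v}_1' - \mathbf{v}_2'|} . \]
The integral $\int d\Omega$ refers to an integration over the direction of $\mathbf{u}'$. With the rearrangements
\[ \mathbf{u}'^2 - \mathbf{u}^2 = \mathbf{v}_1'^2 - 2\mathbf{v}_1'\cdot\mathbf{v}_2' + \mathbf{v}_2'^2 - \mathbf{v}_1^2 + 2\mathbf{v}_1\cdot\mathbf{v}_2 - \mathbf{v}_2^2 \]
\[ = -4\mathbf{V}'^2 + 2\mathbf{v}_1'^2 + 2\mathbf{v}_2'^2 + 4\mathbf{V}^2 - 2\mathbf{v}_1^2 - 2\mathbf{v}_2^2 = 2\bigl(\mathbf{v}_1'^2 + \mathbf{v}_2'^2 - \mathbf{v}_1^2 - \mathbf{v}_2^2\bigr) \]
and
\[ \int d\Omega\, |\mathbf{v}_2 - \mathbf{v}_1| = \int d\Omega\, u = \int du'\int d\Omega\; \delta(u' - u)\,u' = \int du'\, u'^2\int d\Omega\; \delta\!\left(\frac{u'^2}{2} - \frac{u^2}{2}\right) \]
\[ = \int d^3u'\; \delta\!\left(\frac{u'^2}{2} - \frac{u^2}{2}\right)\int d^3V'\, \delta^{(3)}(\mathbf{V}' - \mathbf{V}) = 4\int d^3v_1'\, d^3v_2'\; \delta\!\left(\frac{\mathbf{v}_1'^2 + \mathbf{v}_2'^2}{2} - \frac{\mathbf{v}_1^2 + \mathbf{v}_2^2}{2}\right)\delta^{(3)}(\mathbf{v}_1' + \mathbf{v}_2' - \mathbf{v}_1 - \mathbf{v}_2) , \]
which also imply the conservation laws, we obtain
\[ g - l = \int d^3v_2\int d^3v_1'\int d^3v_2'\; W(\mathbf{v}_1, \mathbf{v}_2; \mathbf{v}_1', \mathbf{v}_2')\,\bigl(f_1' f_2' - f_1 f_2\bigr) . \tag{9.5.20} \]
In this expression, we use
\[ W(\mathbf{v}_1, \mathbf{v}_2; \mathbf{v}_1', \mathbf{v}_2') = 4\sigma(\Omega, |\mathbf{v}_2 - \mathbf{v}_1|)\;\delta\!\left(\frac{\mathbf{v}_1'^2 + \mathbf{v}_2'^2}{2} - \frac{\mathbf{v}_1^2 + \mathbf{v}_2^2}{2}\right)\delta^{(3)}(\mathbf{v}_1' + \mathbf{v}_2' - \mathbf{v}_1 - \mathbf{v}_2) . \tag{9.5.21} \]
Comparison with Eq. (9.2.8f) yields
\[ \sigma(\mathbf{v}_1, \mathbf{v}_2; \mathbf{v}_1', \mathbf{v}_2') = 4m^4\,\sigma(\Omega, |\mathbf{v}_2 - \mathbf{v}_1|) . \tag{9.5.22} \]
From the loss term in (9.5.19), we can read off the total scattering rate for particles of velocity $\mathbf{v}_1$:
\[ \frac{1}{\tau(\mathbf{x},\mathbf{v},t)} = \int d^3v_2\int d\Omega\; |\mathbf{v}_2 - \mathbf{v}_1|\;\sigma\bigl(\Omega, |\mathbf{v}_2 - \mathbf{v}_1|\bigr)\, f(\mathbf{x},\mathbf{v}_2,t) . \tag{9.5.23} \]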


The expression for $\tau^{-1}$ corresponds to the estimate in Eq. (9.2.12), which was derived by elementary considerations: $\tau^{-1} = n v_{\rm th}\sigma_{\rm tot}$, with
\[ \sigma_{\rm tot} = \int d\Omega\; \sigma\bigl(\Omega, |\mathbf{v}_2 - \mathbf{v}_1|\bigr) = 2\pi\int_0^{r_{\max}} ds\, s . \tag{9.5.24} \]
$r_{\max}$ is the distance from the scattering center for which the scattering angle goes to zero, i.e. for which no more scattering occurs. In the case of hard spheres, from Eq. (9.5.15) we have
\[ \sigma_{\rm tot} = 4\pi R^2 . \tag{9.5.25} \]
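The hard-sphere results (9.5.15) and (9.5.25) follow from (9.5.11) with $s(\vartheta) = 2R\cos(\vartheta/2)$. As a small numerical illustration (a sketch, with a finite-difference derivative and a midpoint rule chosen here purely for simplicity; the function names are hypothetical):

```python
import math

def impact_parameter(theta, R=1.0):
    """s(theta) = 2 R cos(theta / 2) for two hard spheres of radius R."""
    return 2.0 * R * math.cos(theta / 2.0)

def differential_cross_section(theta, R=1.0, h=1e-5):
    """Eq. (9.5.11): sigma = -(1/sin theta) (1/2) d(s^2)/d(theta), central difference."""
    s_sq = lambda t: impact_parameter(t, R) ** 2
    derivative = (s_sq(theta + h) - s_sq(theta - h)) / (2.0 * h)
    return -derivative / (2.0 * math.sin(theta))

def total_cross_section(R=1.0, n=2000):
    """Integrate sigma over the solid angle, midpoint rule in theta (Eq. (9.5.24))."""
    d_theta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        total += differential_cross_section(theta, R) * math.sin(theta) * d_theta
    return 2.0 * math.pi * total

print(differential_cross_section(1.0, R=2.0))    # close to R^2 = 4
print(total_cross_section(R=2.0), 16 * math.pi)  # close to 4 pi R^2
```

The differential cross-section comes out independent of $\vartheta$, as stated after (9.5.15).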

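For potentials with infinite range, $r_{\max}$ diverges. In this case, the collision term has the form
\[ \left(\frac{\partial f}{\partial t}\right)_{\rm coll} = \int d^3v_2\int_0^{\infty} ds\, s\int_0^{2\pi} d\varphi\;\bigl(f_1' f_2' - f_1 f_2\bigr)\,|\mathbf{v}_1 - \mathbf{v}_2| . \tag{9.5.26} \]
Although the individual contributions to the collision term diverge, the overall term remains finite:
\[ \lim_{r_{\max}\to\infty}\int_0^{r_{\max}} ds\, s\,\bigl(f_1' f_2' - f_1 f_2\bigr) = \text{finite} , \]
since for $s \to \infty$, the deflection angle tends to 0, and $\mathbf{v}_1' - \mathbf{v}_1 \to 0$ and $\mathbf{v}_2' - \mathbf{v}_2 \to 0$, so that $(f_1' f_2' - f_1 f_2) \to 0$.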

Literature

P. Résibois and M. De Leener, Classical Kinetic Theory of Fluids (John Wiley, New York, 1977).
K. Huang, Statistical Mechanics, 2nd Ed. (John Wiley, New York, 1987).
L. Boltzmann, Vorlesungen über Gastheorie, Vol. 1: Theorie der Gase mit einatomigen Molekülen, deren Dimensionen gegen die mittlere Weglänge verschwinden (Barth, Leipzig, 1896); or Lectures on Gas Theory, transl. by S. Brush, University of California Press, Berkeley 1964.
R. L. Liboff, Introduction to the Theory of Kinetic Equations, Robert E. Krieger Publishing Co., Huntington, New York 1975.
S. Harris, An Introduction to the Theory of the Boltzmann Equation, Holt, Rinehart and Winston, New York 1971.
J. A. McLennan, Introduction to Non-Equilibrium Statistical Mechanics, Prentice-Hall, Inc., London 1988.
K. H. Michel and F. Schwabl, Hydrodynamic Modes in a Gas of Magnons, Phys. Kondens. Materie 11, 144 (1970).


Problems for Chapter 9

9.1 Symmetry Relations. Demonstrate the validity of the identity (9.3.5) used to prove the H theorem:
\[ \int d^3v_1\int d^3v_2\int d^3v_3\int d^3v_4\; W(\mathbf{v}_1, \mathbf{v}_2; \mathbf{v}_3, \mathbf{v}_4)\,(f_1 f_2 - f_3 f_4)\,\varphi_1 \]
\[ = \frac{1}{4}\int d^3v_1\int d^3v_2\int d^3v_3\int d^3v_4\; W(\mathbf{v}_1, \mathbf{v}_2; \mathbf{v}_3, \mathbf{v}_4)\,(f_1 f_2 - f_3 f_4)\,(\varphi_1 + \varphi_2 - \varphi_3 - \varphi_4) . \tag{9.5.27} \]

9.2 The Flow Term in the Boltzmann Equation. Carry out the intermediate steps which lead from the equation of continuity (9.2.11) for the single-particle distribution function in $\mu$-space to Eq. (9.2.11′).

9.3 The Relation between H and S. Calculate the quantity
\[ H(\mathbf{x},t) = \int d^3v\; f(\mathbf{x},\mathbf{v},t)\,\log f(\mathbf{x},\mathbf{v},t) \]
for the case that $f(\mathbf{x},\mathbf{v},t)$ is the Maxwell distribution.

9.4 Show that in the absence of an external force, the equation of continuity (9.3.28) can be brought into the form (9.3.32)
\[ n(\partial_t + u_j\partial_j)\,e + \partial_j q_j = -P_{ij}\,\partial_i u_j . \]

9.5 The Local Maxwell Distribution. Confirm the statements made following Eq. (9.3.19′) by inserting the local Maxwell distribution (9.3.19′) into (9.3.15a)–(9.3.15c).

9.6 The Distribution of Collision Times. Consider a spherical particle of radius $r$, which is passing with velocity $v$ through a cloud of similar particles with a particle density $n$. The particles deflect each other only when they come into direct contact. Determine the probability distribution for the event in which the particle experiences its first collision after a time $t$. How long is the mean time between two collisions?

9.7 Equilibrium Expectation Values. Confirm the results (G.1c) and (G.1g) for
\[ \int d^3v \left(\frac{m\mathbf{v}^2}{2}\right)^{s} f^0(\mathbf{v}) \quad\text{and}\quad \int d^3v\; v_k v_i v_j v_l\, f^0(\mathbf{v}) . \]

9.8 Calculate the scalar products used in Sect. 9.4.2: 1|1, |1, |, vi |vj , χ ˆ5 |χ ˆ4 , χ ˆ4 |vi χ ˆj , χ ˆ5 |vi χ ˆj , χ ˆ4 |vi2 χ ˆ4 , and vj |vi χ ˆ4 . 9.9 Sound Damping. In (9.4.30 ), (9.4.37 ) and (9.4.42 ), the linearized hydrodynamic equations for an ideal gas were derived. For real gases and liquids with general equations of state P = P (n, T ), analogous equations hold:

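$$\frac{\partial}{\partial t} n(\mathbf{x},t) + n\,\nabla\cdot\mathbf{u}(\mathbf{x},t) = 0$$

$$m n\,\frac{\partial}{\partial t} u_j(\mathbf{x},t) + \partial_i P_{ji}(\mathbf{x},t) = 0$$

$$\frac{\partial}{\partial t} T(\mathbf{x},t) + n\left(\frac{\partial T}{\partial n}\right)_{S} \nabla\cdot\mathbf{u}(\mathbf{x},t) - D\,\nabla^2 T(\mathbf{x},t) = 0 .$$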

The pressure tensor $P_{ij}$, with components

$$P_{ij} = \delta_{ij} P - \eta\,(\nabla_j u_i + \nabla_i u_j) + \left(\frac{2}{3}\eta - \zeta\right)\delta_{ij}\,\nabla\cdot\mathbf{u} ,$$

now however contains an additional term on the diagonal, $-\zeta\,\nabla\cdot\mathbf{u}$. This term results from the fact that real gases have a nonvanishing bulk viscosity (or compressional viscosity) ζ in addition to their shear viscosity η. Determine and discuss the modes.

Hint: Keep in mind that the equations partially decouple if one separates the velocity field into transverse and longitudinal components: $\mathbf{u} = \mathbf{u}_t + \mathbf{u}_l$ with $\nabla\cdot\mathbf{u}_t = 0$ and $\nabla\times\mathbf{u}_l = 0$. (This can be carried out simply in Fourier space without loss of generality by taking the wavevector to lie along the z-direction.) In order to evaluate the dispersion equations (eigenfrequencies ω(k)) for the Fourier transforms of n, $u_l$, and T, one can consider approximate solutions for ω(k) of successively increasing order in the magnitude of the wavevector k. A useful abbreviation is

$$m c_s^2 = \left(\frac{\partial P}{\partial n}\right)_{S} = \left(\frac{\partial P}{\partial n}\right)_{T}\left[1 - \left(\frac{\partial T}{\partial n}\right)_{S}\Big/\left(\frac{\partial T}{\partial n}\right)_{P}\right] = \left(\frac{\partial P}{\partial n}\right)_{T}\frac{c_P}{c_V} .$$

Here, $c_s$ is the adiabatic velocity of sound.

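9.10 Show that

$$\left\langle \frac{v_j}{\sqrt{n/m}} \right| v_i \left| \frac{1}{\sqrt{n/kT}} \right\rangle = \delta_{ji}\sqrt{kT/m} , \qquad \left\langle \frac{1}{\sqrt{n/kT}} \right| v_l \left| \frac{v_r}{\sqrt{n/m}} \right\rangle = \delta_{lr}\sqrt{kT/m} ,$$

$$\left\langle \frac{v_j}{\sqrt{n/m}} \right| v_i \left| \hat\chi^4 \right\rangle = \delta_{ij}\sqrt{\frac{2kT}{3m}} , \qquad \left\langle \hat\chi^4 \right| v_l \left| \frac{v_r}{\sqrt{n/m}} \right\rangle = \delta_{lr}\sqrt{\frac{2kT}{3m}}$$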

and verify (9.4.40′).

9.11 Calculate the density-density correlation function

$$S_{nn}(\mathbf{k},\omega) = \int d^3x \int dt\; e^{-i(\mathbf{k}\mathbf{x}-\omega t)}\,\langle n(\mathbf{x},t)\,n(0,0)\rangle$$

and confirm the result in (9.4.51) by transforming to Fourier space and expressing the fluctuations at a given time in terms of thermodynamic derivatives (see also QM II, Sect. 4.7).

9.12 The Viscosity of a Dilute Gas. In Sect. 9.4, the solution of the linearized Boltzmann equation was treated by using an expansion in terms of the eigenfunctions of the collision operator. Complete the calculation of the dissipative part of the momentum current, Eq. (9.4.40). Show that

$$\sum_{\lambda=1}^{5} \left\langle \frac{v_j}{\sqrt{n/m}} \,\Big|\, v_i\,\hat\chi^\lambda \right\rangle \left\langle \hat\chi^\lambda \,\Big|\, v_l\,\frac{v_r}{\sqrt{n/m}} \right\rangle = \delta_{ij}\,\delta_{lr}\,\frac{5kT}{3m} .$$

9.13 Heat Conductivity Using the Relaxation-Time Approach. A further possibility for the approximate determination of the dissipative contributions to the equations of motion for the conserved quantities particle number, momentum and energy is found in the relaxation-time approach introduced in Sect. 9.5.1:

$$\left(\frac{\partial f}{\partial t}\right)_{\text{collision}} = -\frac{f - f^{\ell}}{\tau} .$$

For $g = f - f^{\ell}$, one obtains in lowest order from the Boltzmann equation (9.5.1)

$$g(\mathbf{x},\mathbf{v},t) = -\tau\left(\partial_t + \mathbf{v}\cdot\nabla + \frac{1}{m}\,\mathbf{F}\cdot\nabla_{\mathbf{v}}\right) f^{\ell}(\mathbf{x},\mathbf{v},t) .$$

Eliminate the time derivative of $f^{\ell}$ by employing the non-dissipative equations of motion obtained from $f^{\ell}$ and determine the heat conductivity by inserting $f = f^{\ell} + g$ into the expression for the heat current q derived in (9.3.29).

9.14 The Relaxation-Time Approach for the Electrical Conductivity. Consider an infinite system of charged particles immersed in a positive background. The collision term describes collisions of the particles among themselves as well as with the (fixed) ions of the background. Therefore, the collision term no longer vanishes for general local Maxwellian distributions $f^{\ell}(\mathbf{x},\mathbf{v},t)$. Before the application of a weak homogeneous electric field E, take $f = f^0$, where $f^0$ is the position- and time-independent Maxwell distribution. Apply the relaxation-time approach $\partial f/\partial t|_{\text{coll}} = -(f - f^0)/\tau$ and determine the new equilibrium distribution f to first order in E after application of the field. What do you find for $\langle\mathbf{v}\rangle$? Generalize to a time-dependent field $\mathbf{E}(t) = \mathbf{E}_0\cos(\omega t)$. Discuss the effects of the relaxation-time approximation on the conservation laws (see e.g. John M. Ziman, Principles of the Theory of Solids, 2nd Ed. (Cambridge University Press, Cambridge 1972)).

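9.15 An example which is theoretically easy to treat but is unrealistic for atoms is the purely repulsive potential¹⁹

$$w(r) = \frac{1}{\nu-1}\,\frac{\kappa}{r^{\nu-1}} , \qquad \nu \ge 2,\ \kappa > 0 . \tag{9.5.28}$$

Show that the corresponding scattering cross-section has the form

$$\sigma(\vartheta, |\mathbf{v}_1-\mathbf{v}_2|) = \left(\frac{2\kappa}{m}\right)^{\frac{2}{\nu-1}} |\mathbf{v}_1-\mathbf{v}_2|^{-\frac{4}{\nu-1}}\, F_\nu(\vartheta) , \tag{9.5.29}$$

with functions $F_\nu(\vartheta)$ which depend on ϑ and the power ν. For the special case of the so-called Maxwell potential (ν = 5), $|\mathbf{v}_1-\mathbf{v}_2|\,\sigma(\vartheta, |\mathbf{v}_1-\mathbf{v}_2|)$ is independent of $|\mathbf{v}_1-\mathbf{v}_2|$.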

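9.16 Find the special local Maxwell distributions

$$f^0(\mathbf{v},\mathbf{x},t) = \exp\left(A + \mathbf{B}\cdot\mathbf{v} + C\,\frac{\mathbf{v}^2}{2m}\right)$$

which are solutions of the Boltzmann equation, by comparing the coefficients of the powers of v. The result is $A = A_1 + \mathbf{A}_2\cdot\mathbf{x} + C_3\,\mathbf{x}^2$, $\mathbf{B} = \mathbf{B}_1 - \mathbf{A}_2 t - (2C_3 t + C_2)\,\mathbf{x} + \boldsymbol{\Omega}\times\mathbf{x}$, $C = C_1 + C_2 t + C_3 t^2$.

9.17 Let an external force $\mathbf{F}(\mathbf{x}) = -\nabla V(\mathbf{x})$ act in the Boltzmann equation. Show that the collision term and the flow term vanish for the case of the Maxwell distribution function

$$f(\mathbf{v},\mathbf{x}) \propto n\left(\frac{m}{2\pi kT}\right)^{3/2} \exp\left[-\frac{1}{kT}\left(\frac{m(\mathbf{v}-\mathbf{u})^2}{2} + V(\mathbf{x})\right)\right] .$$

9.18 Verify Eq. (9.4.33b).

¹⁹ Landau/Lifshitz, Mechanics, p. 51, op. cit. in footnote 18.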

10. Irreversibility and the Approach to Equilibrium

10.1 Preliminary Remarks

In this chapter, we will consider some basic aspects related to irreversible processes and their mathematical description, and to the derivation of macroscopic equations of motion from microscopic dynamics: classically from the Newtonian equations, and quantum-mechanically from the Schrödinger equation. These microscopic equations of motion are time-reversal invariant, and the question arises as to how it is possible that such equations can lead to expressions which do not exhibit time-reversal symmetry, such as the Boltzmann equation or the heat diffusion equation. This apparent incompatibility, which historically was raised in particular by Loschmidt as an objection to the Boltzmann equation, is called the Loschmidt paradox. Since during his lifetime the reality of atoms was not experimentally verifiable, the apparent contradiction between the time-reversal invariant (time-reversal symmetric) mechanics of atoms and the irreversibility of non-equilibrium thermodynamics was used by the opponents of Boltzmann's ideas as an argument against the very existence of atoms¹. A second objection to the Boltzmann equation and to a purely mechanical foundation for thermodynamics came from the fact, proved with mathematical rigor by Poincaré, that every finite system, no matter how large, must regain its initial state periodically after a so-called recurrence time. This objection was named the Zermelo paradox, after its most vehement protagonist. Boltzmann was able to refute both of these objections. In his considerations, which were carried further by his student P. Ehrenfest², probability arguments play an important role, as they do in all areas of statistical mechanics; this way of thinking was however foreign to the mechanistic worldview of physics at that time. We mention at this point that the entropy which is defined in Eq. (2.3.1) in terms of the density matrix does not change within a closed system.
¹ See also the preface by H. Thirring in E. Broda, Ludwig Boltzmann, Deuticke, Wien, 1986.
² See P. Ehrenfest and T. Ehrenfest, Begriffliche Grundlagen der statistischen Auffassung in der Mechanik, Encykl. Math. Wiss. 4 (32) (1911); English translation by M. J. Moravcsik: The Conceptual Foundations of the Statistical Approach in Mechanics, Cornell University Press, Ithaca, NY 1959.

In this chapter, we will denote the entropy defined in this way as the Gibbs' entropy. Boltzmann's

concept of entropy, which dates from an earlier time, associates a particular value of the entropy not only to an ensemble but also to each microstate, as we shall show in more detail in Sect. 10.6.2. In equilibrium, Gibbs’ entropy is equal to Boltzmann’s entropy. To eliminate the recurrence-time objection, we will estimate the recurrence time on the basis of a simple model. Using a second simple model of the Brownian motion, we will investigate how its time behavior depends on the particle number and the diﬀerent time scales of the constituents. This will lead us to a general derivation of macroscopic hydrodynamic equations with dissipation from time-reversal invariant microscopic equations of motion. Finally, we will consider the tendency of a dilute gas to approach equilibrium, and its behavior under time reversal. In this connection, the inﬂuence of external perturbations will also be taken into account. In addition, this chapter contains an estimate of the size of statistical ﬂuctuations and a derivation of Pauli’s master equations. In this chapter, we treat a few signiﬁcant aspects of this extensive area of study. On the one hand, we will examine some simple models, and on the other, we will present qualitative considerations which will shed light on the subject from various sides. In order to illuminate the problem arising from the Loschmidt paradox, we show the time development of a gas in Fig. 10.1. The reader may conjecture that the time sequence is a,b,c, in which the gas expands to ﬁll the total available volume. If on the other hand a motion reversal is carried out at conﬁguration c, then the atoms will move back via stage b into conﬁguration a, which has a lower entropy. Two questions arise from this situation: (i) Why is the latter sequence (c,b,a) in fact never observed? (ii) How are we to understand the derivation of the H theorem, according to which the entropy always increases?

Fig. 10.1. Expansion or contraction of a gas, shown in three panels (a), (b), (c): total volume V, subvolume V1 (cube in lower-left corner)

10.2 Recurrence Time

Zermelo (1896)³ based his criticism of the Boltzmann equation on Poincaré's recurrence-time theorem⁴. It states that a closed, finite, conservative system will return arbitrarily closely to its initial configuration within a finite time, the Poincaré recurrence time $\tau_P$. According to Zermelo's paradox, H(t) could not decrease monotonically, but instead must finally again increase and regain the value H(0). To adjudge this objection, we will estimate the recurrence time with the aid of a model⁵. We consider a system of classical harmonic oscillators (linear chain) with displacements $q_n$, momenta $p_n$ and the Hamiltonian (see QM II, Sect. 12.1):

$$H = \sum_{n=1}^{N}\left[\frac{1}{2m}\,p_n^2 + \frac{m\Omega^2}{2}\,(q_n - q_{n-1})^2\right] . \tag{10.2.1}$$

From this, the equations of motion are obtained:

$$\dot p_n = m\ddot q_n = m\Omega^2\,(q_{n+1} + q_{n-1} - 2q_n) . \tag{10.2.2}$$

Assuming periodic boundary conditions, $q_0 = q_N$, we are dealing with a translationally invariant problem, which is diagonalized by the Fourier transformation

$$q_n = \frac{1}{(mN)^{1/2}}\sum_s e^{isn}\,Q_s , \qquad p_n = \left(\frac{m}{N}\right)^{1/2}\sum_s e^{-isn}\,P_s . \tag{10.2.3}$$

$Q_s$ ($P_s$) are called the normal coordinates (and momenta). The periodic boundary conditions require that $1 = e^{isN}$, i.e. $s = \frac{2\pi l}{N}$ with integral l. The values of s for which l differs by N are equivalent. A possible choice of values of l, e.g. for odd N, would be: l = 0, ±1, ..., ±(N − 1)/2. Since $q_n$ and $p_n$ are real, it follows that $Q_s^* = Q_{-s}$ and $P_s^* = P_{-s}$. The Fourier coefficients obey the orthogonality relations

$$\frac{1}{N}\sum_{n=1}^{N} e^{isn}\,e^{-is'n} = \Delta(s-s') = \begin{cases} 1 & \text{for } s - s' = 2\pi h \text{ with } h \text{ integral} \\ 0 & \text{otherwise} \end{cases} \tag{10.2.4}$$

³ E. Zermelo, Wied. Ann. 57, 485 (1896); ibid. 59, 793 (1896).
⁴ H. Poincaré, Acta Math. 13, 1 (1890).
⁵ P. C. Hemmer, L. C. Maximon, and H. Wergeland, Phys. Rev. 111, 689 (1958).

and the completeness relation

$$\frac{1}{N}\sum_{s} e^{-isn}\,e^{isn'} = \delta_{nn'} . \tag{10.2.5}$$

Insertion of the transformation to normal coordinates yields

$$H = \frac{1}{2}\sum_s \left(P_s P_s^* + \omega_s^2\,Q_s Q_s^*\right) \tag{10.2.6}$$

with the dispersion relation

$$\omega_s = 2\Omega\,\left|\sin\frac{s}{2}\right| . \tag{10.2.7}$$

We thus find N non-coupled oscillators with eigenfrequencies⁶ $\omega_s$. The motion of the normal coordinates can be represented most intuitively by introducing complex vectors

$$Z_s = P_s + i\omega_s Q_s , \tag{10.2.8}$$

which move on a unit circle according to

$$Z_s = a_s\,e^{i\omega_s t} \tag{10.2.9}$$

with a complex amplitude $a_s$ (Fig. 10.2).

Fig. 10.2. The motion of the normal coordinates
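The dispersion relation (10.2.7) can be checked numerically. The following minimal sketch, which assumes an odd N and unit mass (the function name `check_dispersion` is purely illustrative), inserts the plane waves $q_n = e^{isn}$ into the right-hand side of (10.2.2) with periodic boundary conditions and measures how far the result deviates from $-\omega_s^2 q_n$.

```python
import cmath
import math

def check_dispersion(N=7, Omega=1.0):
    """Largest deviation of the plane-wave acceleration from -omega_s^2 * q_n (N odd)."""
    worst = 0.0
    half = (N - 1) // 2
    for l in range(-half, half + 1):          # allowed values of l for odd N
        s = 2.0 * math.pi * l / N
        q = [cmath.exp(1j * s * n) for n in range(N)]   # plane wave q_n = e^{isn}
        omega_s_sq = (2.0 * Omega * abs(math.sin(s / 2.0))) ** 2   # Eq. (10.2.7) squared
        for n in range(N):
            # right-hand side of (10.2.2) divided by m, periodic boundary conditions
            accel = Omega ** 2 * (q[(n + 1) % N] + q[(n - 1) % N] - 2.0 * q[n])
            worst = max(worst, abs(accel + omega_s_sq * q[n]))
    return worst

print(check_dispersion())   # a value at the level of floating-point rounding
```

Each plane wave is thus an eigenmode with frequency $\omega_s$, up to rounding errors.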

We assume that the frequencies $\omega_s$ of N − 1 such normal coordinates are incommensurate, i.e. their ratios are not rational numbers. Then the phase vectors $Z_s$ rotate independently of one another, without coincidences. We now wish to calculate how much time passes until all N vectors again come into their initial positions, or more precisely, until all the vectors lie within an interval ∆ϕ around their initial positions. The probability that the vector $Z_s$ lies within ∆ϕ during one rotation is given by ∆ϕ/2π, and the probability that all the vectors lie within their respective prescribed intervals is $(\Delta\varphi/2\pi)^{N-1}$. The number of rotations required for this recurrence is therefore $(2\pi/\Delta\varphi)^{N-1}$. The recurrence time is found by multiplying by the typical rotational period⁷ $\frac{1}{\omega}$:

$$\tau_P \approx \left(\frac{2\pi}{\Delta\varphi}\right)^{N-1}\cdot\frac{1}{\omega} . \tag{10.2.10}$$

⁶ The normal coordinate with s = 0, $\omega_s = 0$ corresponds to a translation and need not be considered in the following.

Taking $\Delta\varphi = \frac{2\pi}{100}$, N = 10 and ω = 10 Hz, we obtain $\tau_P \approx 10^{12}$ years, i.e. more than the age of the Universe. These times of course become much longer if we consider a macroscopic system with $N \approx 10^{20}$. The recurrence thus exists theoretically, but in practice it plays no role. We have thereby eliminated Zermelo's paradox.
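The estimate (10.2.10) is easy to evaluate numerically. The sketch below is only an order-of-magnitude illustration; the helper `recurrence_time_years` and the conversion constant are assumptions introduced here. It evaluates the formula for the numbers quoted above, once with the exponent N − 1 as written in (10.2.10) and once with the exponent N (all N vectors counted). The first choice gives a few times $10^9$ years, the second a few times $10^{11}$ years, which is closer to the order of magnitude quoted in the text. The decisive feature is the exponential growth with N.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600   # Julian year, used only for unit conversion

def recurrence_time_years(delta_phi, n_vectors, omega):
    """tau_P ~ (2*pi/delta_phi)**n_vectors / omega, converted from seconds to years."""
    rotations = (2.0 * math.pi / delta_phi) ** n_vectors
    return rotations / omega / SECONDS_PER_YEAR

dphi = 2.0 * math.pi / 100.0
t_literal = recurrence_time_years(dphi, 10 - 1, 10.0)   # exponent N - 1, as in (10.2.10)
t_all = recurrence_time_years(dphi, 10, 10.0)           # exponent N, all N vectors counted

print(f"exponent N-1: {t_literal:.2e} years")
print(f"exponent N  : {t_all:.2e} years")
```

Each additional oscillator multiplies the estimate by $2\pi/\Delta\varphi = 100$, so that for macroscopic N the recurrence time exceeds any physically relevant time scale.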

Remark: We consider further the time dependence of the solution for the coupled oscillators. From (10.2.3) and (10.2.9) we obtain

$$q_n(t) = \sum_s \frac{e^{isn}}{\sqrt{Nm}}\left(Q_s(0)\cos\omega_s t + \frac{\dot Q_s(0)}{\omega_s}\sin\omega_s t\right) , \tag{10.2.11}$$

from which the following solution of the general initial-value problem is found:

$$q_n(t) = \frac{1}{N}\sum_{s,n'}\left(q_{n'}(0)\cos\big(s(n-n')-\omega_s t\big) - \frac{\dot q_{n'}(0)}{\omega_s}\sin\big(s(n-n')-\omega_s t\big)\right) . \tag{10.2.12}$$

As an example, we consider the particular initial condition $q_{n'}(0) = \delta_{n',0}$, $\dot q_{n'}(0) = 0$, for which only the oscillator at the site 0 is displaced initially, leading to

$$q_n(t) = \frac{1}{N}\sum_s \cos\left(sn - 2\Omega t\left|\sin\frac{s}{2}\right|\right) . \tag{10.2.13}$$

As long as N is finite, the solution is quasiperiodic. On the other hand, in the limit N → ∞

$$q_n(t) = \frac{1}{2\pi}\int_{-\pi}^{\pi} ds\,\cos\left(sn - 2\Omega t\left|\sin\frac{s}{2}\right|\right) = \frac{1}{\pi}\int_{0}^{\pi} ds\,\cos\big(2ns - 2\Omega t\sin s\big) = J_{2n}(2\Omega t) \sim \sqrt{\frac{1}{\pi\Omega t}}\,\cos\left(2\Omega t - \pi n - \frac{\pi}{4}\right) \quad\text{for long } t . \tag{10.2.14}$$

$J_n$ are Bessel functions⁸. The excitation does not decay exponentially, but instead algebraically as $t^{-1/2}$. We add a few more remarks concerning the properties of the solution (10.2.13) for finite N. If the zeroth atom in the chain is released at the time t = 0, it swings back and its neighbors begin to move upwards. The excitation propagates along the chain at the velocity of sound, aΩ; the n-th atom, at a distance d = na from the origin, reacts after a time of about $t \sim \frac{n}{\Omega}$. Here, a is the lattice constant.

⁷ A more precise formula by P. C. Hemmer, L. C. Maximon, and H. Wergeland, op. cit. 5, yields $\tau_P = \prod_{s=1}^{N-1}\frac{2\pi}{\Delta\varphi_s}\Big/\sum_{s=1}^{N-1}\frac{\omega_s}{\Delta\varphi_s} \propto \Delta\varphi^{2-N}$.
⁸ I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series and Products, Academic Press, New York, 1980, 8.4.11 and 8.4.51.

The displacement amplitude remains largest for the zeroth atom. In a finite chain, there would be echo effects. For periodic boundary conditions, the radiated oscillations come back again to the zeroth atom. The limit N → ∞ prevents Poincaré recurrence. The displacement energy of the zeroth atom initially present is divided up among the infinitely many degrees of freedom. The decrease of the oscillation amplitude of the initially excited atom is due to energy transfer to its neighbors.
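The relation between the finite-chain solution (10.2.13) and the Bessel function in (10.2.14) can be illustrated with a short numerical sketch. It assumes odd N and times short enough that no echo has returned; the function names `q_chain` and `bessel_j` are illustrative, and the Bessel function is evaluated from Bessel's integral with a simple trapezoidal rule rather than a library routine.

```python
import math

def q_chain(n, t, N, Omega=1.0):
    """Finite-N solution (10.2.13): only site 0 displaced initially, N odd."""
    half = (N - 1) // 2
    total = 0.0
    for l in range(-half, half + 1):
        s = 2.0 * math.pi * l / N
        total += math.cos(s * n - 2.0 * Omega * t * abs(math.sin(s / 2.0)))
    return total / N

def bessel_j(order, x, steps=20000):
    """J_order(x) = (1/pi) * int_0^pi cos(order*s - x*sin(s)) ds, trapezoidal rule."""
    h = math.pi / steps
    total = 0.5 * (1.0 + math.cos(order * math.pi))   # endpoint values at s = 0 and s = pi
    for i in range(1, steps):
        s = i * h
        total += math.cos(order * s - x * math.sin(s))
    return total * h / math.pi

n, t = 2, 5.0
print(q_chain(n, t, N=2001), bessel_j(2 * n, 2.0 * t))   # the two values nearly coincide
```

For N much larger than Ωt the two numbers agree closely; for smaller N or longer times the echo effects described above become visible in `q_chain`.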

10.3 The Origin of Irreversible Macroscopic Equations of Motion

In this section, we investigate a microscopic model of Brownian motion. We will find the appearance of irreversibility in the limit of infinitely many degrees of freedom. The derivation of hydrodynamic equations of motion in analogy to the Brownian motion will be sketched at the end of this section and is given in more detail in Appendix H.

10.3.1 A Microscopic Model for Brownian Motion

As a microscopic model for Brownian motion, we consider a harmonic oscillator which is coupled to a harmonic lattice⁹. Since the overall system is harmonic, the Hamiltonian function or the Hamiltonian operator as well as the equations of motion and their solutions have the same form classically and quantum mechanically. We start with the quantum-mechanical formulation. In contrast to the Langevin equation of Sect. 8.1, where a stochastic force was assumed to act on the Brownian particle, we now take explicit account of the many colliding particles of the lattice in the Hamiltonian operator and in the equations of motion. The Hamiltonian of this system is given by

$$H = H_O + H_F + H_I ,$$
$$H_O = \frac{1}{2M}P^2 + \frac{M\Omega^2}{2}Q^2 , \qquad H_F = \frac{1}{2m}\sum_{\mathbf{n}} p_{\mathbf{n}}^2 + \frac{1}{2}\sum_{\mathbf{n},\mathbf{n}'}\Phi_{\mathbf{n}\mathbf{n}'}\,q_{\mathbf{n}}q_{\mathbf{n}'} , \qquad H_I = \sum_{\mathbf{n}} c_{\mathbf{n}}\,q_{\mathbf{n}}\,Q , \tag{10.3.1}$$

where $H_O$ is the Hamiltonian of the oscillator of mass M and frequency Ω. Furthermore, $H_F$ is the Hamiltonian of the lattice¹⁰ with masses m, momenta $p_{\mathbf{n}}$, and displacements $q_{\mathbf{n}}$ from the equilibrium positions, where we take $m \ll M$. The harmonic interaction coefficients of the lattice atoms are $\Phi_{\mathbf{n}\mathbf{n}'}$. The interaction of the oscillator with the lattice atoms is given by $H_I$; the coefficients $c_{\mathbf{n}}$ characterize the strength and the range of the interactions of the oscillator which is located at the origin of the coordinate system. The vector n enumerates the atoms of the lattice. The equations of motion which follow from (10.3.1) are given by

$$M\ddot Q = -M\Omega^2 Q - \sum_{\mathbf{n}} c_{\mathbf{n}}\,q_{\mathbf{n}} \qquad\text{and}\qquad m\ddot q_{\mathbf{n}} = -\sum_{\mathbf{n}'}\Phi_{\mathbf{n}\mathbf{n}'}\,q_{\mathbf{n}'} - c_{\mathbf{n}}\,Q . \tag{10.3.2}$$

⁹ The coupling to a bath of oscillators as a mechanism for damping has been investigated frequently, e.g. by F. Schwabl and W. Thirring, Ergeb. exakt. Naturwiss. 36, 219 (1964); A. Lopez, Z. Phys. 192, 63 (1965); P. Ullersma, Physica 32, 27 (1966).
¹⁰ We use the index F, since in the limit N → ∞ the lattice becomes a field.

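We take periodic boundary conditions, $q_{\mathbf{n}} = q_{\mathbf{n}+\mathbf{N}_i}$, with $\mathbf{N}_1 = (N_1, 0, 0)$, $\mathbf{N}_2 = (0, N_2, 0)$, and $\mathbf{N}_3 = (0, 0, N_3)$, where $N_i$ is the number of atoms in the direction $\hat{\mathbf{e}}_i$. Due to the translational invariance of $H_F$, we introduce the following transformations to normal coordinates and momenta:

$$q_{\mathbf{n}} = \frac{1}{\sqrt{mN}}\sum_{\mathbf{k}} e^{i\mathbf{k}\mathbf{a}_{\mathbf{n}}}\,Q_{\mathbf{k}} , \qquad p_{\mathbf{n}} = \sqrt{\frac{m}{N}}\sum_{\mathbf{k}} e^{-i\mathbf{k}\mathbf{a}_{\mathbf{n}}}\,P_{\mathbf{k}} . \tag{10.3.3}$$

The inverse transformation is given by

$$Q_{\mathbf{k}} = \sqrt{\frac{m}{N}}\sum_{\mathbf{n}} e^{-i\mathbf{k}\mathbf{a}_{\mathbf{n}}}\,q_{\mathbf{n}} , \qquad P_{\mathbf{k}} = \frac{1}{\sqrt{mN}}\sum_{\mathbf{n}} e^{i\mathbf{k}\mathbf{a}_{\mathbf{n}}}\,p_{\mathbf{n}} . \tag{10.3.4}$$

The Fourier coefficients obey orthogonality and completeness relations

$$\frac{1}{N}\sum_{\mathbf{n}} e^{i(\mathbf{k}-\mathbf{k}')\cdot\mathbf{a}_{\mathbf{n}}} = \Delta(\mathbf{k}-\mathbf{k}') , \qquad \frac{1}{N}\sum_{\mathbf{k}} e^{i\mathbf{k}\cdot(\mathbf{a}_{\mathbf{n}}-\mathbf{a}_{\mathbf{n}'})} = \delta_{\mathbf{n},\mathbf{n}'} \tag{10.3.5a,b}$$

with the generalized Kronecker delta $\Delta(\mathbf{k}) = 1$ for $\mathbf{k} = \mathbf{g}$ and $\Delta(\mathbf{k}) = 0$ otherwise. From the periodic boundary conditions we find the following values for the wavevector:

$$\mathbf{k} = \mathbf{g}_1\frac{r_1}{N_1} + \mathbf{g}_2\frac{r_2}{N_2} + \mathbf{g}_3\frac{r_3}{N_3} \qquad\text{with } r_i = 0, \pm 1, \pm 2, \ldots .$$

Here, we have introduced the reciprocal lattice vectors which are familiar from solid-state physics:

$$\mathbf{g}_1 = \left(\frac{2\pi}{a}, 0, 0\right) , \quad \mathbf{g}_2 = \left(0, \frac{2\pi}{a}, 0\right) , \quad \mathbf{g}_3 = \left(0, 0, \frac{2\pi}{a}\right) .$$

The transformation to normal coordinates (10.3.3) converts the Hamiltonian for the lattice into the Hamiltonian for N decoupled oscillators, viz.

$$H_F = \frac{1}{2}\sum_{\mathbf{k}}\left(P_{\mathbf{k}}^\dagger P_{\mathbf{k}} + \omega_{\mathbf{k}}^2\,Q_{\mathbf{k}}^\dagger Q_{\mathbf{k}}\right) , \tag{10.3.6}$$

with the frequencies¹¹ (see Fig. 10.3)

$$\omega_{\mathbf{k}}^2 = \frac{1}{m}\sum_{\mathbf{n}}\Phi(\mathbf{n})\,e^{-i\mathbf{k}\mathbf{a}_{\mathbf{n}}} . \tag{10.3.7}$$

Fig. 10.3. The frequencies $\omega_{\mathbf{k}}$ along one of the coordinate axes, $\omega_{\max} = \omega_{\pi/a}$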

From the invariance of the lattice with respect to infinitesimal translations, we obtain the condition $\sum_{\mathbf{n}'}\Phi(\mathbf{n},\mathbf{n}') = 0$, and from translational invariance with respect to lattice vectors t, it follows that $\Phi(\mathbf{n}+\mathbf{t}, \mathbf{n}'+\mathbf{t}) = \Phi(\mathbf{n},\mathbf{n}') = \Phi(\mathbf{n}-\mathbf{n}')$. The latter relation was already used in (10.3.7). From the first of the two relations, we find $\lim_{\mathbf{k}\to 0}\omega_{\mathbf{k}}^2 = 0$, i.e. the oscillations of the lattice are acoustic phonons. Expressed in terms of the normal coordinates, the equations of motion (10.3.2) are

$$M\ddot Q = -M\Omega^2 Q - \frac{1}{\sqrt{mN}}\sum_{\mathbf{k}} c(\mathbf{k})^*\,Q_{\mathbf{k}} \tag{10.3.8a}$$

$$m\ddot Q_{\mathbf{k}} = -m\,\omega_{\mathbf{k}}^2\,Q_{\mathbf{k}} - \sqrt{\frac{m}{N}}\,c(\mathbf{k})\,Q \tag{10.3.8b}$$

with

$$c(\mathbf{k}) = \sum_{\mathbf{n}} c_{\mathbf{n}}\,e^{-i\mathbf{k}\mathbf{a}_{\mathbf{n}}} . \tag{10.3.9}$$

For the further treatment of the equations of motion (10.3.8a,b) and the solution of the initial-value problem, we introduce the half-range Fourier transform (Laplace transform) of Q(t):

$$\tilde Q(\omega) \equiv \int_0^\infty dt\; e^{i\omega t}\,Q(t) = \int_{-\infty}^{\infty} dt\; e^{i\omega t}\,\Theta(t)\,Q(t) . \tag{10.3.10a}$$

¹¹ We assume that the harmonic potential for the heavy oscillator is based on the same microscopic interaction as that for the lattice atoms, $\Phi(\mathbf{n},\mathbf{n}')$. If we denote its strength by g, then we find $\Omega = \sqrt{g/M}$ and $\omega_{\max} = \sqrt{g/m}$, and therefore $\Omega \ll \omega_{\max}$. The order of magnitude of the velocity of sound is $c = a\,\omega_{\max}$.

The inverse of this equation is given by

$$\Theta(t)\,Q(t) = \int_{-\infty}^{\infty}\frac{d\omega}{2\pi}\; e^{-i\omega t}\,\tilde Q(\omega) . \tag{10.3.10b}$$

For free oscillatory motions, (10.3.10a) contains $\delta_+$ distributions. For their convenient treatment, it is expedient to consider

$$\tilde Q(\omega + i\eta) = \int_0^\infty dt\; e^{i(\omega+i\eta)t}\,Q(t) , \tag{10.3.11a}$$

with η > 0. If (10.3.10a) exists, then with certainty so does (10.3.11a) owing to the factor $e^{-\eta t}$. The inverse of (10.3.11a) is given by

$$e^{-\eta t}\,\Theta(t)\,Q(t) = \int_{-\infty}^{\infty}\frac{d\omega}{2\pi}\; e^{-i\omega t}\,\tilde Q(\omega+i\eta) , \quad\text{i.e.}\quad Q(t)\,\Theta(t) = \int_{-\infty}^{\infty}\frac{d\omega}{2\pi}\; e^{-i(\omega+i\eta)t}\,\tilde Q(\omega+i\eta) . \tag{10.3.11b}$$

For the complex frequency appearing in (10.3.11a,b) we introduce $z \equiv \omega + i\eta$. The integral (10.3.11b) implies an integration path in the complex z-plane which lies iη above the real axis:

$$Q(t)\,\Theta(t) = \int_{-\infty+i\eta}^{\infty+i\eta}\frac{dz}{2\pi}\; e^{-izt}\,\tilde Q(z) . \tag{10.3.11b′}$$

The half-range Fourier transformation of the equation of motion (10.3.8a) yields for the first term

$$\int_0^\infty dt\; e^{izt}\,\frac{d^2}{dt^2}Q(t) = e^{izt}\,\dot Q(t)\Big|_0^\infty - iz\int_0^\infty dt\; e^{izt}\,\dot Q(t) = -\dot Q(0) + izQ(0) - z^2\,\tilde Q(z) .$$

All together, for the half-range Fourier transform of the equations of motion (10.3.8a,b) we obtain

$$M\left(-z^2+\Omega^2\right)\tilde Q(z) = -\frac{1}{\sqrt{mN}}\sum_{\mathbf{k}} c(\mathbf{k})^*\,\tilde Q_{\mathbf{k}}(z) + M\left(\dot Q(0) - izQ(0)\right) \tag{10.3.12}$$

$$m\left(-z^2+\omega_{\mathbf{k}}^2\right)\tilde Q_{\mathbf{k}}(z) = -\sqrt{\frac{m}{N}}\,c(\mathbf{k})\,\tilde Q(z) + m\left(\dot Q_{\mathbf{k}}(0) - izQ_{\mathbf{k}}(0)\right) . \tag{10.3.13}$$

The elimination of $\tilde Q_{\mathbf{k}}(z)$ and replacement of the initial values $Q_{\mathbf{k}}(0)$, $\dot Q_{\mathbf{k}}(0)$ by $q_{\mathbf{n}}(0)$, $\dot q_{\mathbf{n}}(0)$ yields

$$D(z)\,\tilde Q(z) = M\left(\dot Q(0) - izQ(0)\right) - \frac{m}{N}\sum_{\mathbf{k}}\sum_{\mathbf{n}} c(\mathbf{k})^*\,\frac{e^{-i\mathbf{k}\mathbf{a}_{\mathbf{n}}}}{m\left(-z^2+\omega_{\mathbf{k}}^2\right)}\left(\dot q_{\mathbf{n}}(0) - iz\,q_{\mathbf{n}}(0)\right) \tag{10.3.14}$$

with

$$D(z) \equiv M\left(-z^2+\Omega^2\right) + \frac{1}{N}\sum_{\mathbf{k}}\frac{|c(\mathbf{k})|^2}{m\left(z^2-\omega_{\mathbf{k}}^2\right)} . \tag{10.3.15}$$

We now restrict ourselves to the classical case and insert the particular initial values for the lattice atoms $q_{\mathbf{n}}(0) = 0$, $\dot q_{\mathbf{n}}(0) = 0$ for all n¹²; we then find

$$\tilde Q(z) = \frac{M\left(\dot Q(0) - izQ(0)\right)}{-Mz^2 + M\Omega^2 - \sum_{\mathbf{k}}\frac{|c(\mathbf{k})|^2}{mN}\Big/\left(-z^2+\omega_{\mathbf{k}}^2\right)} . \tag{10.3.16}$$

From this, in the time representation, we obtain

$$\Theta(t)\,Q(t) = \int\frac{d\omega}{2\pi}\; e^{-izt}\,\tilde Q(z) = -i\sum_\nu g(\omega_\nu)\,e^{-i\omega_\nu t} , \tag{10.3.17}$$

where $\omega_\nu$ are the poles of $\tilde Q(z)$ and $g(\omega_\nu)$ are the residues¹³. The solution is thus quasiperiodic. One could use this to estimate the Poincaré time in analogy to the previous section. In the limit of a large particle number N, the sums over k can be replaced by integrals and a different analytic behavior may result:¹⁴

$$D(z) = -Mz^2 + M\Omega^2 + \frac{a^3}{m}\int\frac{d^3k}{(2\pi)^3}\,\frac{|c(\mathbf{k})|^2}{z^2-\omega_{\mathbf{k}}^2} . \tag{10.3.18}$$

¹² In the quantum-mechanical treatment, we would have to use the expectation value of (10.3.14) instead and insert $\langle q_{\mathbf{n}}(0)\rangle = \langle\dot q_{\mathbf{n}}(0)\rangle = 0$. In problem 10.6, the force on the oscillator due to the lattice particles is investigated when the latter are in thermal equilibrium.
¹³ The poles of $\tilde Q(z)$, $z \equiv \omega + i\eta$, are real, i.e. they lie in the complex ω-plane below the real axis. (10.3.17) follows with the residue theorem by closure of the integration in the lower half-plane.
¹⁴ In order to determine what the ratio of t and N must be to permit the use of the limit N → ∞ even for finite N, the N-dependence of the poles $\omega_\nu$ must be found from D(z) = 0. The distance between the poles $\omega_\nu$ is $\Delta\omega_\nu \sim \frac{1}{N}$, and the values of the residues are of $O\left(\frac{1}{N}\right)$. The frequencies $\omega_\nu$ obey $\omega_{\nu+1} - \omega_\nu \sim \frac{\omega_{\max}}{N}$. For $t \ll \frac{N}{\omega_{\max}}$, the phase factors $e^{i\omega_\nu t}$ vary only weakly as a function of ν, and the sum over ν in (10.3.17) can be replaced by an integral.

The integral over k spans the first Brillouin zone: $-\frac{\pi}{a} \le k_i \le \frac{\pi}{a}$. For a simple evaluation of the integral over k, we replace the region of integration by a sphere of the same volume having a radius $\Lambda = \left(\frac{3}{4\pi}\right)^{1/3}\frac{2\pi}{a}$ and substitute

the dispersion relation by $\omega_{\mathbf{k}} = c|\mathbf{k}|$ where c is the velocity of sound. It then follows that

$$\frac{a^3}{m}\,\frac{1}{2\pi^2c^3}\int_0^{\Lambda c} d\nu\,\frac{\nu^2\,|c(\nu)|^2}{z^2-\nu^2} = \frac{a^3}{m}\,\frac{1}{2\pi^2c^3}\left[-\int_0^{\Lambda c} d\nu\,|c(\nu)|^2 + z^2\int_0^{\infty} d\nu\,\frac{|c(\nu)|^2}{z^2-\nu^2} - z^2\int_{\Lambda c}^{\infty} d\nu\,\frac{|c(\nu)|^2}{z^2-\nu^2}\right] \tag{10.3.19}$$

with $\nu = c|\mathbf{k}|$. We now discuss the last equation term by term making use of the simplification $|c(\nu)|^2 = g^2$ corresponding to $c_{\mathbf{n}} = g\,\delta_{\mathbf{n},0}$.

1st term of (10.3.19):

$$-\frac{a^3}{m}\,\frac{1}{2\pi^2c^3}\int_0^{\Lambda c} d\nu\,|c(\nu)|^2 = -g^2\,\Lambda c\,\frac{a^3}{m\,2\pi^2c^3} . \tag{10.3.20}$$

This yields a renormalization of the oscillator frequency

$$\bar\omega^2 = \Omega^2 - g^2\,\Lambda c\,\frac{a^3}{m\,2\pi^2c^3}\,\frac{1}{M} . \tag{10.3.21}$$

2nd term of (10.3.19) and evaluation using the theorem of residues:

$$\frac{a^3}{m}\,\frac{1}{2\pi^2c^3}\,g^2z^2\int_0^\infty\frac{d\nu}{z^2-\nu^2} = -M\Gamma\,i\,z \tag{10.3.22}$$

$$\Gamma = \frac{g^2a^3}{4\pi mc^3}\,\frac{1}{M} = c\Lambda\,\frac{m}{M} . \tag{10.3.23}$$

The third term of (10.3.19) is due to the high frequencies and affects the behavior at very short times. This effect is treated in problem 10.5, where a continuous cutoff function is employed. If we neglect it, we obtain from (10.3.16)

$$M\left(-z^2 + \bar\omega^2 - i\Gamma z\right)\tilde Q(z) = M\left(\dot Q(0) - izQ(0)\right) , \tag{10.3.24}$$

and, after transformation into the time domain for t > 0, we have the following equation of motion for Q(t):

$$\left(\frac{d^2}{dt^2} + \bar\omega^2 + \Gamma\frac{d}{dt}\right)Q(t) = 0 . \tag{10.3.25}$$

The coupling to the bath of oscillators leads to a frictional term and to irreversible damped motion. For example, let the initial values be Q(0) = 0, $\dot Q(t=0) = \dot Q(0)$ (for the lattice oscillators, we have already set $q_{\mathbf{n}}(0) = \dot q_{\mathbf{n}}(0) = 0$); then from Eq. (10.3.24) it follows that

$$\Theta(t)\,Q(t) = \int_{-\infty}^{\infty}\frac{d\omega}{2\pi}\;\frac{e^{-izt}\,\dot Q(0)}{-z^2+\bar\omega^2-i\Gamma z} \tag{10.3.26}$$

and, using the theorem of residues,

$$Q(t) = e^{-\Gamma t/2}\,\frac{\sin\omega_0 t}{\omega_0}\,\dot Q(0) , \tag{10.3.27}$$

with $\omega_0 = \sqrt{\bar\omega^2 - \frac{\Gamma^2}{4}}$. The conditions for the derivation of the irreversible equation of motion (10.3.25) were:

a) A limitation to times $t \ll \frac{N}{\omega_{\max}}$¹⁵. This implies practically no limitation for large N, since the exponential decay is much more rapid.

b) The separation into macroscopic variables ≡ massive oscillator (of mass M) and microscopic variables ≡ lattice oscillators (of mass m) leads, owing to $\frac{m}{M} \ll 1$, to a separation of time scales

$$\Omega \ll \omega_{\max} , \qquad \Gamma \ll \omega_{\max} .$$

The time scales of the macroscopic variables are 1/Ω, 1/Γ. The irreversibility (exponential damping) arises in going to the limit N → ∞. In order to obtain irreversibility even at arbitrarily long times, the limit N → ∞ must first be taken.

10.3.2 Microscopic Time-Reversible and Macroscopic Irreversible Equations of Motion, Hydrodynamics

The derivation of hydrodynamic equations of motion (Appendix H.) directly from the microscopic equations is based on the following elements:

(i) The point of departure is represented by the equations of motion for the conserved quantities and the equations of motion for the infinitely many nonconserved quantities.

(ii) An important precondition is the separation of time scales $ck \ll \omega_{\text{n.c.}}$, i.e. the characteristic frequencies of the conserved quantities ck are much slower than the typical frequencies of the nonconserved quantities $\omega_{\text{n.c.}}$, analogous to the $\omega_\lambda$ (λ > 5) in the Boltzmann equation, Sect. 9.4.4. This permits the elimination of the rapid variables.

In the analytic treatment in Appendix H., one starts from the equations of motion for the so-called Kubo relaxation function φ and obtains equations of motion for the relaxation functions of the conserved quantities. From the one-to-one correspondence of equations of motion for φ and the time-dependent expectation values of operators under the influence of a perturbation, the hydrodynamic equations for the conserved quantities are obtained. The remaining variables express themselves in the form of damping terms, which can be expressed by Kubo formulas.

¹⁵ These times, albeit long, are much shorter than the Poincaré recurrence time.
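The damped solution (10.3.27) of Sect. 10.3.1 can be checked against a direct numerical integration of the irreversible equation of motion (10.3.25). The following sketch assumes the underdamped case $\bar\omega > \Gamma/2$ and uses a classical fourth-order Runge-Kutta step; the function names and the parameter values are illustrative choices, not taken from the text.

```python
import math

def q_closed_form(t, qdot0, omega_bar, gamma):
    """Eq. (10.3.27) for Q(0) = 0; assumes the underdamped case omega_bar > gamma/2."""
    omega0 = math.sqrt(omega_bar ** 2 - gamma ** 2 / 4.0)
    return math.exp(-gamma * t / 2.0) * math.sin(omega0 * t) / omega0 * qdot0

def q_numeric(t_end, qdot0, omega_bar, gamma, steps=20000):
    """Integrate Q'' + gamma*Q' + omega_bar^2*Q = 0, Eq. (10.3.25), with classical RK4."""
    def deriv(q, v):
        return v, -gamma * v - omega_bar ** 2 * q
    h = t_end / steps
    q, v = 0.0, qdot0                      # initial values Q(0) = 0, Q'(0) = qdot0
    for _ in range(steps):
        k1q, k1v = deriv(q, v)
        k2q, k2v = deriv(q + 0.5 * h * k1q, v + 0.5 * h * k1v)
        k3q, k3v = deriv(q + 0.5 * h * k2q, v + 0.5 * h * k2v)
        k4q, k4v = deriv(q + h * k3q, v + h * k3v)
        q += h * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return q

# illustrative parameters: omega_bar = 1, Gamma = 0.2, Q'(0) = 1
print(q_numeric(10.0, 1.0, 1.0, 0.2), q_closed_form(10.0, 1.0, 1.0, 0.2))
```

Both values agree to high accuracy, and the envelope $e^{-\Gamma t/2}$ makes the damping on the time scale 1/Γ explicit.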

∗10.4 The Master Equation and Irreversibility in Quantum Mechanics¹⁶

We consider an isolated system and its density matrix at the time t, with probabilities $w_i(t)$:

$$\varrho(t) = \sum_i w_i(t)\,|i\rangle\langle i| . \tag{10.4.1}$$

The states $|i\rangle$ are eigenstates of the Hamiltonian $H_0$. We let the quantum numbers i represent the energy $E_i$ and a series of additional quantum numbers $\nu_i$. A perturbation V also acts on the system or within it and causes transitions between the states; thus the overall Hamiltonian is

$$H = H_0 + V . \tag{10.4.2}$$

For example, in a nearly ideal gas, $H_0$ could be the kinetic energy and V the interaction which results from collisions of the atoms. We next consider the time development of ϱ on the basis of (10.4.1) and denote the time-development operator by U(τ). After the time τ the density matrix has the form

$$\varrho(t+\tau) = \sum_i w_i(t)\,U(\tau)\,|i\rangle\langle i|\,U^\dagger(\tau) = \sum_i\sum_{j,k} w_i(t)\,|j\rangle\langle j|\,U(\tau)\,|i\rangle\langle i|\,U^\dagger(\tau)\,|k\rangle\langle k| = \sum_i\sum_{j,k} w_i(t)\,|j\rangle\langle k|\,U_{ji}(\tau)\,U_{ki}^*(\tau) , \tag{10.4.3}$$

where the matrix elements

$$U_{ji}(\tau) \equiv \langle j|\,U(\tau)\,|i\rangle \tag{10.4.4}$$

have been introduced. We assume that the system, even though it is practically isolated, is in fact subject to a phase averaging at each instant as a result of weak contacts to other macroscopic systems. This corresponds to taking the trace over other, unobserved degrees of freedom which are coupled to the system¹⁷. Then the density matrix (10.4.3) is transformed to

$$\sum_i\sum_j w_i(t)\,|j\rangle\langle j|\,U_{ji}(\tau)\,U_{ji}^*(\tau) . \tag{10.4.5}$$

¹⁶ W. Pauli, Sommerfeld Festschrift, S. Hirzel, Leipzig, 1928, p. 30.
¹⁷ If for example every state $|j\rangle$ of the system is connected with a state $|2,j\rangle$ of these other macroscopic degrees of freedom, so that the contributions to the total density matrix are of the form $|2,j\rangle|j\rangle\langle k|\langle 2,k|$, then taking the trace over 2 leads to the diagonal form $|j\rangle\langle j|$. This stochastic nature, which is introduced through contact to the system's surroundings, is the decisive and subtle step in the derivation of the master equation. Cf. N. G. van Kampen, Physica 20, 603 (1954), and Fortschritte der Physik 4, 405 (1956).

Comparison with (10.4.1) shows that the probability for the state $|j\rangle$ at the time t + τ is thus

$$w_j(t+\tau) = \sum_i w_i(t)\,|U_{ji}(\tau)|^2 ,$$

and the change in the probability is

$$w_j(t+\tau) - w_j(t) = \sum_i\big(w_i(t) - w_j(t)\big)\,|U_{ji}(\tau)|^2 , \tag{10.4.6}$$

where we have used $\sum_i |U_{ji}(\tau)|^2 = 1$. On the right-hand side, the term i = j vanishes. We thus require only the nondiagonal elements of $U_{ij}(\tau)$, for which we can use the Golden Rule¹⁸:

$$|U_{ji}(\tau)|^2 = \frac{1}{\hbar^2}\left(\frac{\sin\omega_{ij}\tau/2}{\omega_{ij}/2}\right)^2 |\langle j|V|i\rangle|^2 = \tau\,\frac{2\pi}{\hbar}\,\delta(E_i - E_j)\,|\langle j|V|i\rangle|^2 \tag{10.4.7}$$

with $\omega_{ij} = (E_i - E_j)/\hbar$. The limit of validity of the Golden Rule is $\Delta E \gg \frac{2\pi\hbar}{\tau} \gg \delta\varepsilon$, where ∆E is the width of the energy distribution of the states and δε is the spacing of the energy levels. From (10.4.6) and (10.4.7), it follows that

$$\frac{dw_j(t)}{dt} = \sum_i\big(w_i(t) - w_j(t)\big)\,\frac{2\pi}{\hbar}\,\delta(E_i - E_j)\,|\langle j|V|i\rangle|^2 .$$

As already mentioned at the beginning of this section, the index $i \equiv (E_i, \nu_i)$ includes the quantum numbers of the energy and the $\nu_i$, the large number of all remaining quantum numbers. The sum over the energy eigenvalues on the right-hand side can be replaced by an integral with the density of states $\varrho(E_i)$ according to

$$\sum_{E_i}\cdots = \int dE_i\,\varrho(E_i)\cdots$$

so that, making use of the δ-function, we obtain:

$$\frac{dw_{E_j\nu_j}(t)}{dt} = \sum_{\nu_i}\big(w_{E_j,\nu_i} - w_{E_j,\nu_j}\big)\,\frac{2\pi}{\hbar}\,\varrho(E_j)\,|\langle E_j,\nu_j|V|E_j,\nu_i\rangle|^2 . \tag{10.4.8}$$

¹⁸ QM I, Eq. (16.36)

10.4 The Master Equation and Irreversibility in Quantum Mechanics

493

With the coeﬃcients λEj ,νj ;νi =

2π "(Ej )| Ej , νj | V |Ej , νi |2 ,

(10.4.9)

Pauli’s master equation follows: dwEj νj (t) = λEj ,νj ;νi wEj ,νi (t) − wEj ,νj (t) . dt ν

(10.4.10)

i

This equation has the general structure (Wn n pn − Wn n pn ) , p˙ n =

(10.4.11)

n

where the transition rates Wn n = Wn n obey the so called detailed balance condition19 eq Wn n peq n = Wn n pn

(10.4.12)

eq for the microcanonical ensemble, peq n = pn for all n and n . One can show in general that Eq. (10.4.11) is irreversible and that the entropy S=− pn log pn (10.4.13) n

increases. With (10.4.11), we have (the prime denotes the derivative with respect to the argument $p_n$)

$$\dot S = -\sum_{n,n'} (p_n \log p_n)'\, W_{nn'}\,(p_{n'} - p_n) = \sum_{n,n'} W_{nn'}\,p_{n'}\,\bigl[(p_{n'} \log p_{n'})' - (p_n \log p_n)'\bigr] .$$

By permutation of the summation indices $n$ and $n'$ and using the symmetry relation $W_{nn'} = W_{n'n}$, we obtain

$$\dot S = \frac{1}{2}\sum_{n,n'} W_{nn'}\,(p_{n'} - p_n)\,\bigl[(p_{n'} \log p_{n'})' - (p_n \log p_n)'\bigr] > 0 , \tag{10.4.14}$$

where the inequality follows from the convexity of $x \log x$ (Fig. 10.4). The entropy continues to increase until $p_n = p_{n'}$ for all $n$ and $n'$. Here we assume that all the $n$ and $n'$ are connected via a chain of matrix elements. The isolated system described by the master equation (10.4.10) approaches the microcanonical equilibrium.
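The approach to the uniform distribution and the growth of the entropy can be illustrated numerically. The following Python sketch is only an illustration under stated assumptions: the six-state system, the randomly chosen symmetric rate matrix `W`, the explicit Euler scheme and the step size `dt` are all hypothetical choices and do not come from the text. It integrates an equation of the form (10.4.11) with symmetric rates and records the entropy (10.4.13) at every step.

```python
import numpy as np

def entropy(p):
    """S = -sum_n p_n log p_n, with the convention 0 log 0 = 0."""
    q = p[p > 0]
    return -np.sum(q * np.log(q))

def evolve(p0, W, dt, steps):
    """Explicit Euler integration of dp_n/dt = sum_n' (W[n, n'] p_n' - W[n', n] p_n)."""
    p = p0.copy()
    history = [entropy(p)]
    for _ in range(steps):
        gain = W @ p                 # sum over n' of W[n, n'] p_n'
        loss = W.sum(axis=0) * p     # sum over n' of W[n', n] p_n
        p = p + dt * (gain - loss)
        history.append(entropy(p))
    return p, np.array(history)

rng = np.random.default_rng(0)
n_states = 6
A = rng.random((n_states, n_states))
W = 0.5 * (A + A.T)                  # symmetric rates, W[n, n'] = W[n', n]
np.fill_diagonal(W, 0.0)             # no self-transitions

p0 = np.zeros(n_states)
p0[0] = 1.0                          # start in a single state
p, S = evolve(p0, W, dt=0.01, steps=2000)

print(p)                             # final probabilities
print(S[0], S[-1], np.log(n_states)) # initial entropy, final entropy, log of number of states
```

Because all off-diagonal rates are positive in this example, every pair of states is connected, and the final entropy should approach $\log 6$, the value for equal probabilities. With the chosen step size, each Euler step is a doubly stochastic map, so the recorded entropy does not decrease from one step to the next.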

19 See QM II, following Eq. (4.2.17).


Fig. 10.4. The function $f(x) = x \log x$ is convex, $(x' - x)\,(f'(x') - f'(x)) > 0$
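The convexity inequality quoted in the caption can be written out explicitly. The short derivation below is an added worked step, assuming only $x, x' > 0$ and non-negative transition rates:

```latex
\begin{align*}
f(x) &= x \log x , \qquad f'(x) = \log x + 1 , \qquad f''(x) = \frac{1}{x} > 0 \quad (x > 0) , \\
(x' - x)\,\bigl(f'(x') - f'(x)\bigr) &= (x' - x)\,\log\frac{x'}{x} \;\ge\; 0 ,
\end{align*}
```

Equality holds only for $x' = x$, since the logarithm increases monotonically. Each term of the double sum in (10.4.14) is therefore non-negative, and $\dot S$ vanishes only when $p_n = p_{n'}$ for every pair of states with $W_{nn'} \neq 0$.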

10.5 Probability and Phase-Space Volume ∗

10.5.1 Probabilities and the Time Interval of Large Fluctuations

In the framework of equilibrium statistical mechanics, one can calculate the probability that the system spontaneously takes on a constraint. In the context of the Gay-Lussac experiment, we found the probability that a system with a fixed particle number $N$ and a total volume $V$ would be found only within the subvolume $V_1$ (Eq. (3.5.5)):

$$W(E, V_1) = e^{-(S(E,V) - S(E,V_1))/k} . \tag{10.5.1}$$
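To get a feeling for the magnitudes involved in (10.5.1), the following Python sketch evaluates the probability for an ideal gas. It assumes the ideal-gas entropy difference $S(E,V) - S(E,V_1) = kN \log(V/V_1)$ at fixed energy and particle number, so that $W = (V_1/V)^N$. The function name `log10_probability` and the chosen particle numbers are illustrative and not taken from the text.

```python
import math

def log10_probability(n_particles, volume_fraction):
    """Return log10 of W = (V1/V)**N for an ideal gas at fixed energy."""
    return n_particles * math.log10(volume_fraction)

# Probability of finding all N particles in one half of the volume.
for n in (10, 100, 1e20):
    print(n, log10_probability(n, 0.5))
```

The logarithm is used because $W$ itself underflows any floating-point representation for macroscopic $N$. For ten particles the probability of finding all of them in one half of the volume is still $2^{-10} = 1/1024$, whereas for $N$ of order $10^{20}$ the exponent is itself of order $10^{19}$.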

For an ideal gas$^{20}$, the entropy is $S(E,V) = kN\left(\log\frac{V}{N\lambda_T^3} + \frac{5}{2}\right)$. Since $E = \frac{3}{2}NkT$, $\lambda_T$ remains unchanged on expansion and it follows that $W(E,V_1) = e^{-N\log\frac{V}{V_1}}$. This gives $\log\frac{V}{V_1} = -\log\frac{V-(V-V_1)}{V} \approx \frac{V-V_1}{V}$ for low compressions. At higher compressions, $V_1$ [...]

[...] 0. The phase space of these states is the same size as the phase space at the time $t = 0$; it is thus considerably smaller than that of all the states which represent the macrostate at times $t > 0$. The state $T_t X$ contains complex correlations. The typical microstates of $M(t)$ lack these correlations. They become apparent upon time reversal. In the forward direction of time, in contrast, the future of such atypical microstates is just the same as that of the typical states.


Fig. 10.6. The entropy as a function of time in the expansion of a computer gas consisting of 864 atoms. In the initial stages, all the curves lie on top of one another. (1) The unperturbed expansion of V1 to V (solid curve). (2) Time reversal at t = 94.4 (dashed curve), the system returns to its initial state and the entropy to its initial value. (3) A perturbation # at t = 18.88 and time reversal at t = 30.68. The system approaches its initial state closely (dotted curve). (4) A perturbation # at t = 59 and time reversal at t = 70.8 (chain curve). Only for a short time after the time reversal does the entropy decrease; it then increases towards its equilibrium value.32

together within the original subvolume$^{30}$. It is apparent that the initial state which we defined at the beginning leads in the course of time to a state which is not typical of a gas with the density shown in Fig. 10.1 c) and a Maxwell distribution. A typical microstate for such a gas would never compress itself into a subvolume after a time reversal. States which develop in such a correlated manner and which are not typical will be termed quasi-equilibrium states$^{31}$, also called local quasi-equilibrium states during the intermediate stages of the time development. Quasi-equilibrium states have the property that their macroscopic appearance is not invariant under time reversal. Although these quasi-equilibrium states of isolated systems doubtless exist and their time-reversed counterparts can be visualized in the computer experiment, the latter would seem to have no significance in reality. Thus, why was Boltzmann nevertheless correct in his statement that the entropy $S_B$ always increases monotonically apart from small fluctuations?

30

31

32

The associated “coarse-grained” Boltzmann entropy (10.7.1) decreases following the time reversal, curve (2) in Fig. 10.6. A time dependence of this type is not described by the Boltzmann equation and is also never observed in Nature. J. M. Blatt, An Alternative Approach to the Ergodic Pr