
Differential Geometry notes

Manifolds, Forms, Lie Derivatives, and Lie Groups


Reading Guide

This note collects the differential-geometric background needed before connections and curvature become natural: manifolds, tangent and cotangent spaces, differential forms, pullbacks, vector fields, flows, Lie derivatives, Lie groups, and Lie algebras.

Read it as the geometric foundation for the fiber-bundle notes rather than as a replacement for a full differential-geometry course.

Shorter entry points now live beside this comprehensive version:

Manifolds and Tangent Spaces
Forms, Flows, and Lie Derivatives

Topological Spaces, Manifolds, and Smooth Structure

Topological spaces and open sets

Before one can define a manifold or a fibre bundle, one needs a notion of open sets.

Definition 1 (Topological space). A topological space is a set \(X\) together with a collection \(\mathcal T\) of subsets of \(X\), called open sets, such that:

\(\varnothing\) and \(X\) are open.
Arbitrary unions of open sets are open.
Finite intersections of open sets are open.

The pair \((X,\mathcal T)\) is called a topological space.

In most geometry examples, the topology is the ordinary topology on \(\mathbb{R}^n\) or something built from it. A map \[f:X\to Y\] between topological spaces is continuous if the inverse image of every open set is open: \[V\subset Y\text{ open} \quad \Longrightarrow \quad f^{-1}(V)\subset X\text{ open}.\]

Definition 2 (Homeomorphism). A map \(f:X\to Y\) is a homeomorphism if it is continuous, bijective, and its inverse \(f^{-1}:Y\to X\) is also continuous. If such an \(f\) exists, \(X\) and \(Y\) are topologically the same space.

Open covers

An open cover of a topological space \(X\) is a collection of open sets \(\{U_i\}_{i\in I}\) such that \[X=\bigcup_{i\in I} U_i.\] The index set \(I\) can be finite or infinite. A cover is simply a way of saying that we will study \(X\) patch by patch.

Covers in geometry. Most definitions in manifolds and bundles are local. This means they are checked on an open cover, then glued together on overlaps \(U_i\cap U_j\).

Topological manifolds

A manifold is a space that looks locally like Euclidean space.

Definition 3 (Topological \(n\)-manifold). An \(n\)-dimensional topological manifold is a topological space \(M\) satisfying the following standard conditions:

\(M\) is Hausdorff: distinct points can be separated by disjoint open sets.
\(M\) is second-countable: its topology has a countable basis.
\(M\) is locally Euclidean of dimension \(n\): for every \(p\in M\), there exists an open neighbourhood \(U\subset M\) of \(p\) and a homeomorphism \[\varphi:U\to \varphi(U)\subset \mathbb{R}^n,\] where \(\varphi(U)\) is open in \(\mathbb{R}^n\).

The pair \((U,\varphi)\) is called a coordinate chart.

The phrase “locally homeomorphic to an open subset of \(\mathbb{R}^n\)” is the core idea. The Hausdorff and second-countable conditions are usually included to rule out pathological spaces and to make analysis on manifolds behave as expected.

Examples

\(\mathbb{R}^n\) is an \(n\)-dimensional topological manifold.
\(S^1\) is a one-dimensional topological manifold.
\(S^2\) is a two-dimensional topological manifold.
\(T^2=S^1\times S^1\) is a two-dimensional topological manifold.

Smooth manifolds

A topological manifold only knows about continuous coordinate changes. Differential geometry needs differentiable coordinate changes.

Let \((U_i,\varphi_i)\) and \((U_j,\varphi_j)\) be two charts. On the overlap \(U_i\cap U_j\), the coordinate transition map is \[\varphi_j\circ \varphi_i^{-1}:\varphi_i(U_i\cap U_j)\to \varphi_j(U_i\cap U_j).\] This is a map between open subsets of \(\mathbb{R}^n\).

Definition 4 (Smooth manifold). A smooth \(n\)-manifold is a topological \(n\)-manifold equipped with an atlas of charts whose transition maps are smooth. Usually one takes a maximal smooth atlas, meaning all charts compatible with the chosen smooth structure are included.
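As a small numerical illustration of a smooth transition map, the sketch below uses the two stereographic charts on \(S^1\). These charts are an assumption chosen for the example and are not introduced in the notes above; the function names are hypothetical. On the overlap, the transition map should be \(u\mapsto 1/u\), which is smooth away from \(u=0\).

```python
def phi_north(x, y):
    """Stereographic chart on S^1 minus the north pole (0, 1)."""
    return x / (1.0 - y)

def phi_south(x, y):
    """Stereographic chart on S^1 minus the south pole (0, -1)."""
    return x / (1.0 + y)

def phi_north_inv(u):
    """Inverse of phi_north: sends a real coordinate back to the circle."""
    d = u * u + 1.0
    return (2.0 * u / d, (u * u - 1.0) / d)

def transition(u):
    """phi_south composed with phi_north^{-1}, defined for u != 0."""
    return phi_south(*phi_north_inv(u))

# Compare the composed charts with the closed form 1/u at a few sample points.
samples = [-3.0, -0.5, 0.25, 2.0]
max_error = max(abs(transition(u) - 1.0 / u) for u in samples)
print(max_error)
```

The point \(u=0\) is excluded because it corresponds to the south pole, which lies outside the overlap of the two chart domains.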

After choosing a smooth structure, one can define smooth functions, smooth maps, tangent vectors, differential forms, and smooth fibre bundles.

A diffeomorphism is a smooth bijection \[\phi:M\to N\] whose inverse \[\phi^{-1}:N\to M\] is also smooth. Diffeomorphic smooth manifolds are considered the same for differential-geometric purposes.

Compactness is not part of the definition. A manifold need not be compact. \(\mathbb{R}^n\) is noncompact; \(S^n\) and \(T^n\) are compact. Compactness is useful for some integration and finite-cover arguments, but the definitions of manifolds, fibre bundles, vector bundles, and connections do not require compactness.

Local coordinates

If \(q\in U\subset M\) and \[\varphi(q)=(x^1(q),\ldots,x^n(q)),\] then \(x^1,\ldots,x^n\) are local coordinate functions on \(U\).

Local coordinates are powerful but never sacred. On \(S^2\), longitude and latitude fail at the poles. This is not merely a technical nuisance. It is the first sign of a general theme:

Local descriptions need transition rules. Whenever one description works only on a patch, one must know how to translate between descriptions on overlaps of patches.

If \(U_i\) and \(U_j\) are two coordinate patches with coordinates \(x^\mu\) and \(y^a\), then on \(U_i\cap U_j\), \[y^a=y^a(x^1,\ldots,x^n).\] These coordinate transition functions are the manifold analogue of transition functions for bundles.

Maps between manifolds

A smooth map \[f:M\to N\] assigns to each point \(p\in M\) a point \(f(p)\in N\) in such a way that its coordinate representations are smooth. In local coordinates, if \(x^a\) are coordinates on \(M\) and \(y^i\) are coordinates on \(N\), then the map is written as functions \[y^i=f^i(x^1,\ldots,x^m),\] where \(m=\dim M\), and smoothness means that each \(f^i\) is a smooth function of the \(x^a\). The special case \(N=\mathbb{R}\) gives \[C^\infty(M),\] the set, in fact the algebra, of smooth real-valued functions on \(M\). The same map gives two important operations:

It pushes tangent vectors forward: \(f_*:T_pM\to T_{f(p)}N\).
It pulls differential forms back: \(f^*:\Omega^k(N)\to\Omega^k(M)\).

We define these after tangent vectors and forms.

Tangent Vectors as Chart-Independent First-Order Data

The notes so far have treated a manifold as a space that can be studied in coordinate patches. We now define tangent vectors carefully enough that later shorthand such as \[\frac{d}{dt}\gamma(t), \qquad \frac{\partial}{\partial x^\mu}, \qquad f_*v\] has an exact meaning. The central point is this: tangent vectors are local first-order data at one point. Tangent spaces at different points are usually different vector spaces, even if the two points lie in the same coordinate chart.

Tangent vectors from curves: the chart definition

Let \(M\) be a smooth \(n\)-manifold and let \(p\in M\). Consider smooth curves \[\gamma:(-\epsilon,\epsilon)\to M, \qquad \gamma(0)=p.\] Intuitively, the tangent vector of \(\gamma\) at \(p\) should remember only the first-order velocity of \(\gamma\) at \(t=0\), not the entire curve.

Choose a chart \((U,\varphi)\) around \(p\), with \[\varphi=(x^1,\ldots,x^n):U\to \varphi(U)\subset \mathbb{R}^n.\] For \(t\) small enough, \(\gamma(t)\in U\), and the coordinate representation of the curve is \[\varphi\circ\gamma:(-\delta,\delta)\to \mathbb{R}^n.\] Its ordinary Euclidean derivative at \(0\) is \[\left.\frac{d}{dt}\right|_{t=0}(\varphi\circ\gamma)(t) = \left( \left.\frac{d}{dt}\right|_{0}x^1(\gamma(t)), \ldots, \left.\frac{d}{dt}\right|_{0}x^n(\gamma(t)) \right).\]

Definition 5 (Equivalence of curves through \(p\)). Two smooth curves \(\gamma_1\) and \(\gamma_2\) through \(p\) are tangent at \(p\) if, for one (and hence every) chart \((U,\varphi)\) around \(p\), \[\left.\frac{d}{dt}\right|_{0}(\varphi\circ\gamma_1)(t) = \left.\frac{d}{dt}\right|_{0}(\varphi\circ\gamma_2)(t).\] The equivalence class of \(\gamma\) is denoted \([\gamma]\).

The phrase “for one (and hence every) chart” is important. If the equality holds in one coordinate system, then it holds in any other coordinate system, because coordinate changes are smooth and the chain rule multiplies both velocities by the same Jacobian matrix.

Definition 6 (Tangent space via curves). The tangent space \(T_pM\) is the set of equivalence classes of smooth curves through \(p\): \[T_pM:=\{[\gamma]:\gamma(0)=p\}.\] Vector addition and scalar multiplication are defined by transporting the velocity vectors to a chart, doing the usual linear algebra in \(\mathbb{R}^n\), and then transporting the result back. The chain rule shows this does not depend on the chart.

If \(M\) is \(n\)-dimensional, then \(T_pM\) is an \(n\)-dimensional real vector space.

Coordinate basis vectors

Let \((U,x^1,\ldots,x^n)\) be a chart around \(p\). The coordinate basis vector \[\left.\frac{\partial}{\partial x^\mu}\right|_p\in T_pM\] is the tangent vector represented by the curve which, in coordinates, moves only in the \(x^\mu\) direction. More explicitly, if \(a=\varphi(p)\in\mathbb{R}^n\), let \[c_\mu(t)=\varphi^{-1}(a+t e_\mu),\] where \(e_\mu\) is the usual \(\mu\)-th basis vector of \(\mathbb{R}^n\). Then \[\left.\frac{\partial}{\partial x^\mu}\right|_p := [c_\mu].\]

Every tangent vector \(v\in T_pM\) has a unique expression \[v=v^\mu\left.\frac{\partial}{\partial x^\mu}\right|_p.\] The numbers \(v^\mu\) depend on the chart, but the vector \(v\) does not.

Same chart, different tangent spaces. If two distinct points \(p,q\in U\) lie in the same coordinate chart, then \(T_pM\) and \(T_qM\) are still different vector spaces. The chart gives bases \[\left.\frac{\partial}{\partial x^\mu}\right|_p\in T_pM, \qquad \left.\frac{\partial}{\partial x^\mu}\right|_q\in T_qM,\] and it is tempting to identify the two tangent spaces by matching coordinate components. This identification is convenient but not intrinsic. It depends on the chosen chart. A nonlinear change of coordinates changes the matching rule differently at different points. A connection is precisely the extra structure that later lets us compare nearby tangent spaces in a geometrically controlled way.

On \(\mathbb{R}^n\) there is a canonical global identification \(T_p\mathbb{R}^n\cong\mathbb{R}^n\) for every \(p\), because \(\mathbb{R}^n\) is itself a vector space. A general manifold has no such preferred identification.

Tangent vectors as derivations

There is an equivalent definition that is often cleaner for calculations. Let \(C^\infty(M)\) be the algebra of smooth real-valued functions on \(M\).

Definition 7 (Derivation at a point). A derivation at \(p\) is a linear map \[D:C^\infty(M)\to\mathbb{R}\] satisfying the Leibniz rule at \(p\): \[D(fh)=f(p)D(h)+h(p)D(f)\] for all \(f,h\in C^\infty(M)\).

A curve \(\gamma\) through \(p\) defines such a derivation by \[D_\gamma(f):=\left.\frac{d}{dt}\right|_{t=0} f(\gamma(t)).\] If two curves are tangent in the coordinate sense, they define the same derivation. Conversely, every derivation at \(p\) equals \(D_\gamma\) for some curve \(\gamma\) through \(p\). Thus one may equivalently define \[T_pM=\{\text{derivations at }p\}.\]

In this language, the coordinate basis vector is the derivation \[\left.\frac{\partial}{\partial x^\mu}\right|_p[f] = \left.\frac{\partial}{\partial u^\mu}\right|_{u=\varphi(p)} \bigl(f\circ\varphi^{-1}\bigr)(u),\] where \(u=(u^1,\ldots,u^n)\) are the standard coordinates on \(\mathbb{R}^n\).

Therefore, if \[v=v^\mu\left.\frac{\partial}{\partial x^\mu}\right|_p,\] then \[v[f]=v^\mu \left.\frac{\partial}{\partial u^\mu}\right|_{u=\varphi(p)} (f\circ\varphi^{-1})(u).\] This is the precise meaning of the common shorthand \[v[f]=v^\mu \partial_\mu f(p).\]
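The statement that the components \(v^\mu\) depend on the chart while \(v[f]\) does not can be checked numerically. The sketch below is only an illustration under assumptions: it works on \(\mathbb{R}^2\) away from the origin, compares the Cartesian chart with the polar chart \((r,\theta)\), and uses a hypothetical test function `f` and central differences in place of exact partial derivatives.

```python
import math

def f(x, y):
    """Hypothetical smooth test function on R^2."""
    return x * x * y + math.exp(y)

def f_polar(r, theta):
    """The same function written in the polar chart, i.e. f o phi^{-1}."""
    return f(r * math.cos(theta), r * math.sin(theta))

def partial(g, args, index, eps=1e-6):
    """Central-difference partial derivative of g in one argument."""
    plus, minus = list(args), list(args)
    plus[index] += eps
    minus[index] -= eps
    return (g(*plus) - g(*minus)) / (2.0 * eps)

x, y = 1.2, 0.5          # base point p in Cartesian coordinates
a, b = 0.3, -2.0         # Cartesian components of v

# Components of the same vector in the polar chart (chain rule).
r, theta = math.hypot(x, y), math.atan2(y, x)
v_r = (x * a + y * b) / r
v_theta = (-y * a + x * b) / (r * r)

# v[f] computed chart by chart; the two numbers should agree.
cartesian_value = a * partial(f, (x, y), 0) + b * partial(f, (x, y), 1)
polar_value = (v_r * partial(f_polar, (r, theta), 0)
               + v_theta * partial(f_polar, (r, theta), 1))
print(cartesian_value, polar_value)
```

The component pairs \((a,b)\) and \((v^r,v^\theta)\) are different numbers, yet both charts return the same value of \(v[f]\) up to finite-difference error.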

When shorthand begins. From now on we will often write \[\partial_\mu:=\frac{\partial}{\partial x^\mu}\] and omit the vertical bar \(|_p\) when the base point is clear. The rigorous meaning is always \[\left.\partial_\mu\right|_p\in T_pM.\] The same symbol \(\partial_\mu\) used at two different points refers to two different tangent vectors living in two different tangent spaces.

Velocities of curves are pushforwards of \(d/dt\)

Let \[\gamma:I\to M\] be a smooth curve defined on an interval \(I\subset\mathbb{R}\). At \(t_0\in I\), the tangent space \(T_{t_0}I\) is canonically spanned by \[\left.\frac{d}{dt}\right|_{t_0}.\] The velocity of \(\gamma\) at \(t_0\) is defined by the pushforward \[\dot\gamma(t_0):=(\gamma_{*})_{t_0}\left(\left.\frac{d}{dt}\right|_{t_0}\right) \in T_{\gamma(t_0)}M.\] As a derivation, this means \[\dot\gamma(t_0)[f] = \left.\frac{d}{dt}\right|_{t=t_0} f(\gamma(t)).\] Thus the shorthand \(\dot\gamma(t)\) is not an informal Euclidean derivative. It is the pushforward of the canonical tangent vector on the parameter interval.

Vector fields

A vector field \(X\) on \(M\) assigns to each point \(p\in M\) a tangent vector \[X_p\in T_pM.\] In a chart, \[X_p=X^\mu(p)\left.\frac{\partial}{\partial x^\mu}\right|_p.\] The vector field is smooth if the coefficient functions \(X^\mu\) are smooth in every chart. Later we will express this by saying that a vector field is a smooth section of the tangent bundle: \[X\in\Gamma(TM).\] Before bundles are formally defined, it is enough to remember that \(X_p\) lives in \(T_pM\), and different points have different tangent spaces.

Pushforward

Let \(f:M\to N\) be a smooth map. It sends points of \(M\) to points of \(N\). It also sends tangent vectors at \(p\) to tangent vectors at \(f(p)\).

Definition 8 (Pushforward). For \(v\in T_pM\), the pushforward of \(v\) by \(f\) is the tangent vector \[f_{*p}v\in T_{f(p)}N\] defined as a derivation by \[(f_{*p}v)[h]=v[h\circ f]\] for every \(h\in C^\infty(N)\).

This definition is forced: to differentiate a function \(h\) on \(N\) in the direction \(f_*v\), first pull it back to the function \(h\circ f\) on \(M\), then differentiate using \(v\).

In coordinates, suppose \(x^a\) are coordinates on \(M\), \(y^i\) are coordinates on \(N\), and \[y^i=f^i(x^1,\ldots,x^m).\] If \[v=v^a\left.\frac{\partial}{\partial x^a}\right|_p,\] then \[f_{*p}v = v^a \frac{\partial f^i}{\partial x^a}(p) \left.\frac{\partial}{\partial y^i}\right|_{f(p)}.\] So the pushforward is the Jacobian acting on tangent vectors, but the intrinsic definition is the derivation formula above.
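The agreement between the Jacobian formula and the derivation definition \((f_{*p}v)[h]=v[h\circ f]\) can be tested on a concrete map. The following is a minimal sketch: the map `f`, the test function `h`, and the sample point are hypothetical choices made for the example, and \(v[h\circ f]\) is approximated by a central difference along the straight line \(p+tv\), which is one curve representing \(v\).

```python
import math

def f(x):
    """Hypothetical smooth map f: R^2 -> R^3 (graph of a paraboloid)."""
    u, w = x
    return (u, w, u * u + w * w)

def jacobian_push(p, v):
    """Components of f_* v from the formula v^a (df^i/dx^a)(p), computed by hand."""
    u, w = p
    a, b = v
    return (a, b, 2.0 * u * a + 2.0 * w * b)

def h(y):
    """Hypothetical test function on the target R^3."""
    return y[0] * y[2] + math.sin(y[1])

def grad_h(y):
    """Partial derivatives of h, computed by hand."""
    return (y[2], math.cos(y[1]), y[0])

def directional(g, p, v, eps=1e-5):
    """Central-difference approximation of d/dt g(p + t v) at t = 0."""
    plus = tuple(pi + eps * vi for pi, vi in zip(p, v))
    minus = tuple(pi - eps * vi for pi, vi in zip(p, v))
    return (g(plus) - g(minus)) / (2.0 * eps)

p, v = (0.7, -0.4), (1.5, 2.0)

# (f_* v)[h]: Jacobian components paired with the partials of h at f(p).
lhs = sum(c * g for c, g in zip(jacobian_push(p, v), grad_h(f(p))))
# v[h o f]: differentiate the pulled-back function on the source.
rhs = directional(lambda x: h(f(x)), p, v)
print(lhs, rhs)
```

The two printed numbers should agree up to finite-difference error, which is the chain rule in the form used by Definition 8.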

Memory rule. Tangent vectors push forward naturally. Differential forms pull back naturally.

Cotangent Vectors and Differential Forms

The cotangent space \(T_p^*M\)

The cotangent space at \(p\) is the dual vector space of \(T_pM\): \[T_p^*M:=\mathrm{Hom}(T_pM,\mathbb{R}).\] So an element of \(T_p^*M\) is a linear map \[\alpha_p:T_pM\to\mathbb{R}.\] Such an object is called a covector or one-form at \(p\).

In local coordinates, the basis of \(T_p^*M\) is \[dx^1_p,\ldots,dx^n_p.\] These are defined to be the dual basis to \[\left.\frac{\partial}{\partial x^1}\right|_p, \ldots, \left.\frac{\partial}{\partial x^n}\right|_p.\] This means \[dx^\mu_p\left(\left.\frac{\partial}{\partial x^\nu}\right|_p\right)=\delta^\mu_{\nu}.\]

The basic pairing. The expression \[dx^\mu\left(\frac{\partial}{\partial x^\nu}\right)=\delta^\mu_{\nu}\] means: the covector \(dx^\mu\) is being evaluated on the vector \(\partial/\partial x^\nu\). It returns \(1\) if the directions match and \(0\) otherwise.

Why \(dx^\mu(\partial/\partial x^\nu)=\delta^\mu_\nu\)

First recall that \(x^\mu\) is a coordinate function: \[x^\mu:M\to\mathbb{R}.\] Its differential is a one-form \[dx^\mu_p:T_pM\to\mathbb{R}.\] For any tangent vector \(v\in T_pM\), the differential is defined by \[dx^\mu_p(v)=v[x^\mu].\] Now take \[v=\left.\frac{\partial}{\partial x^\nu}\right|_p.\] Then \[\begin{align*} dx^\mu_p\left(\left.\frac{\partial}{\partial x^\nu}\right|_p\right) &=\left.\frac{\partial}{\partial x^\nu}\right|_p[x^\mu]\\ &=\frac{\partial x^\mu}{\partial x^\nu}(p)\\ &=\delta^\mu_{\nu}. \end{align*}\]

For example, in \(\mathbb{R}^2\) with coordinates \((x,y)\), \[dx\left(\frac{\partial}{\partial x}\right)=1, \qquad dx\left(\frac{\partial}{\partial y}\right)=0,\] \[dy\left(\frac{\partial}{\partial x}\right)=0, \qquad dy\left(\frac{\partial}{\partial y}\right)=1.\] So \(dx\) measures the \(x\) component of a tangent vector, and \(dy\) measures the \(y\) component.

If \[v=v^x\frac{\partial}{\partial x}+v^y\frac{\partial}{\partial y},\] then \[dx(v)=v^x, \qquad dy(v)=v^y.\]

Differential forms are not defined only for integration

A common first exposure to differential forms is through line integrals such as \[\int_C P\,dx+Q\,dy.\] This can make it look like \(dx\) and \(dy\) are merely integration symbols. That is not the intrinsic definition.

Definition 9 (Differential \(k\)-form). A differential \(k\)-form on \(M\) is a smooth rule that assigns to each point \(p\in M\) an alternating multilinear map \[\omega_p:T_pM\times\cdots\times T_pM\to\mathbb{R}\] with \(k\) tangent-vector inputs.

At a single point \(p\), the vector space of alternating \(k\)-linear maps \[T_pM\times\cdots\times T_pM\to\mathbb{R}\] is denoted \[\Lambda^kT_p^*M.\] As \(p\) varies, these spaces assemble into a bundle \[\Lambda^kT^*M\to M.\] A smooth \(k\)-form is a smooth choice \[p\mapsto \omega_p\in \Lambda^kT_p^*M.\] We denote the vector space of smooth \(k\)-forms by \[\Omega^k(M).\] Later, after vector bundles are defined formally, we will write the same statement compactly as \[\Omega^k(M)=\Gamma(\Lambda^kT^*M),\] where \(\Gamma(E)\) means “smooth sections of the bundle \(E\).” Thus symbols like \(\Gamma(\Lambda^kT^*M)\) are not new mysterious objects; they are just the space of differential \(k\)-forms.

Special cases:

A \(0\)-form is a function \(f:M\to\mathbb{R}\).
A \(1\)-form is a covector field: \[\alpha_p:T_pM\to\mathbb{R}.\]
A \(2\)-form is an antisymmetric bilinear map: \[\omega_p:T_pM\times T_pM\to\mathbb{R}.\]
A \(k\)-form eats \(k\) tangent vectors and returns a number.

Only after we define forms as geometric objects do we integrate them over curves, surfaces, and manifolds.

One-forms

A one-form on \(M\) is written locally as \[\alpha=\alpha_\mu(x)\,dx^\mu.\] At each point \(p\), it eats a tangent vector \[v=v^\mu \frac{\partial}{\partial x^\mu}\] and gives \[\alpha_p(v)=\alpha_\mu(p)v^\mu.\]

Examples in physics:

Electromagnetic potential: \[A=A_\mu dx^\mu.\]
Berry connection: \[\mathcal A=\mathcal A_i dk^i.\]
Gradient of a function: \[df=\frac{\partial f}{\partial x^\mu}dx^\mu.\]

Two-forms and the wedge product

A two-form is locally written \[\omega=\frac12\omega_{\mu\nu}\,dx^\mu\wedge dx^\nu,\] where \[\omega_{\mu\nu}=-\omega_{\nu\mu}.\] The wedge product is antisymmetric: \[dx^\mu\wedge dx^\nu=-dx^\nu\wedge dx^\mu.\] In particular, \[dx^\mu\wedge dx^\mu=0.\] If \(\alpha\) is a \(p\)-form and \(\beta\) is a \(q\)-form, then \[\alpha\wedge\beta=(-1)^{pq}\beta\wedge\alpha.\]

Exterior derivative

The exterior derivative \[d:\Omega^k(M)\to\Omega^{k+1}(M)\] raises the degree of a form by one.

For a function \(f\), \[df=\frac{\partial f}{\partial x^\mu}dx^\mu.\] For a one-form \(\alpha=\alpha_\mu dx^\mu\), \[d\alpha=\frac{\partial \alpha_\nu}{\partial x^\mu}dx^\mu\wedge dx^\nu =\frac12(\partial_\mu\alpha_\nu-\partial_\nu\alpha_\mu)dx^\mu\wedge dx^\nu.\] The most important identity is \[d^2=0.\] In electromagnetism, if \[A=A_\mu dx^\mu,\] then the field strength is \[F=dA.\] In components, \[F_{\mu\nu}=\partial_\mu A_\nu-\partial_\nu A_\mu.\]
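The component formula \(F_{\mu\nu}=\partial_\mu A_\nu-\partial_\nu A_\mu\) and the identity \(d^2=0\) can be spot-checked numerically. The sketch below is an illustration only: the two sample potentials and the helper names are hypothetical, and partial derivatives are approximated by central differences.

```python
import math

def partial(g, point, index, eps=1e-4):
    """Central-difference partial derivative of g at point along one axis."""
    plus, minus = list(point), list(point)
    plus[index] += eps
    minus[index] -= eps
    return (g(plus) - g(minus)) / (2.0 * eps)

def d_one_form(A, point):
    """Matrix of components F[mu][nu] = d_mu A_nu - d_nu A_mu at a point."""
    n = len(point)
    F = [[0.0] * n for _ in range(n)]
    for mu in range(n):
        for nu in range(n):
            F[mu][nu] = (partial(lambda x: A(x)[nu], point, mu)
                         - partial(lambda x: A(x)[mu], point, nu))
    return F

B = 2.0

def A_uniform(x):
    """Sample potential A = (B/2)(-y dx + x dy), expected to give F_xy = B."""
    return (-0.5 * B * x[1], 0.5 * B * x[0])

def A_exact(x):
    """Components of df for f = sin(x) * y**2, so d(df) should vanish."""
    return (math.cos(x[0]) * x[1] ** 2, 2.0 * math.sin(x[0]) * x[1])

point = (0.3, -1.2)
F_uniform = d_one_form(A_uniform, point)
F_exact = d_one_form(A_exact, point)
print(F_uniform[0][1], F_exact[0][1])
```

The first potential produces a constant field strength, while the second is a pure gradient and produces none, which is the statement \(d(df)=0\) in components.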

Integration of forms

Forms are not defined only by integration, but forms are exactly the objects that can be integrated naturally.

Integrating a one-form over a curve

Let \(\gamma:[0,1]\to\mathbb{R}^2\), \(\gamma(t)=(x(t),y(t))\), be a smooth curve, and let \[\alpha=P(x,y)dx+Q(x,y)dy\] be a one-form on \(\mathbb{R}^2\). The integral of \(\alpha\) over \(\gamma\) is \[\int_\gamma \alpha=\int_0^1 \alpha_{\gamma(t)}(\dot\gamma(t))\,dt.\] Since \[\dot\gamma(t)=\dot x(t)\frac{\partial}{\partial x}+\dot y(t)\frac{\partial}{\partial y},\] we get \[\alpha_{\gamma(t)}(\dot\gamma(t))=P(x(t),y(t))\dot x(t)+Q(x(t),y(t))\dot y(t).\] So \[\int_\gamma \alpha =\int_0^1\left[P(x(t),y(t))\dot x(t)+Q(x(t),y(t))\dot y(t)\right]dt.\] This is the familiar line integral.
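The line-integral formula can be evaluated directly by sampling \(\alpha_{\gamma(t)}(\dot\gamma(t))\). The sketch below is a minimal illustration with assumed inputs: the curve is the unit circle traversed once, the two sample one-forms are hypothetical choices, and the integral is approximated with the midpoint rule.

```python
import math

def gamma(t):
    """Unit circle traversed once counterclockwise for t in [0, 1]."""
    return (math.cos(2.0 * math.pi * t), math.sin(2.0 * math.pi * t))

def gamma_dot(t):
    """Velocity components of gamma."""
    return (-2.0 * math.pi * math.sin(2.0 * math.pi * t),
            2.0 * math.pi * math.cos(2.0 * math.pi * t))

def rotation_form(x, y):
    """Components (P, Q) of alpha = -y dx + x dy."""
    return (-y, x)

def exact_form(x, y):
    """Components (P, Q) of d(xy) = y dx + x dy."""
    return (y, x)

def integrate_one_form(alpha, curve, velocity, steps=1000):
    """Midpoint-rule approximation of the integral of alpha along the curve."""
    dt = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        P, Q = alpha(*curve(t))
        dx, dy = velocity(t)
        total += (P * dx + Q * dy) * dt   # alpha evaluated on the velocity
    return total

winding = integrate_one_form(rotation_form, gamma, gamma_dot)
closed = integrate_one_form(exact_form, gamma, gamma_dot)
print(winding, closed)
```

The first integral should be close to \(2\pi\), and the second should be close to \(0\) because the integrand is the differential of a function and the curve is closed.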

The general definition of integrating a form

The sentence “a \(k\)-form can be integrated over a \(k\)-dimensional object” is not meant to be automatic from the fact that a form is a multilinear dual object. One still has to define the operation of integration. The definition uses three ingredients:

an orientation, which tells us what counts as a positive coordinate volume element;
pullback, which moves the form to a parameter domain in Euclidean space;
the ordinary multiple integral of a function against \(du^1\cdots du^k\).

First consider one parametrized \(k\)-dimensional piece. Let \[\sigma:U\subset \mathbb{R}^k\to M\] be a smooth parametrization, and let \(\omega\in\Omega^k(M)\). Pull back \(\omega\) to \(U\): \[\sigma^*\omega\in\Omega^k(U).\] Since \(U\subset\mathbb{R}^k\) has coordinates \((u^1,\ldots,u^k)\), every \(k\)-form on \(U\) has the form \[\sigma^*\omega=f(u^1,\ldots,u^k)\,du^1\wedge\cdots\wedge du^k.\] Then the integral over the parametrized piece is defined by \[\boxed{ \int_\sigma \omega := \int_U f(u^1,\ldots,u^k)\,du^1\cdots du^k. }\] Thus the wedge product is not itself the same thing as the measure \(du^1\cdots du^k\); rather, once a top-degree form is written as a coefficient times the oriented coordinate form, we integrate the coefficient by ordinary calculus.

For an oriented \(k\)-dimensional manifold \(N\) and a smooth map \[F:N\to M,\] one defines \[\int_N F^*\omega\] by choosing oriented charts on \(N\) and using a partition of unity. More explicitly, if \(\{(V_\alpha,\varphi_\alpha)\}\) is an oriented atlas for \(N\) and \(\{\rho_\alpha\}\) is a smooth partition of unity subordinate to the cover \(\{V_\alpha\}\), then \[\boxed{ \int_N F^*\omega = \sum_\alpha \int_{\varphi_\alpha(V_\alpha)} (\varphi_\alpha^{-1})^*(\rho_\alpha F^*\omega). }\] Each summand is an ordinary integral of a function on an open subset of \(\mathbb{R}^k\). This definition is independent of the oriented atlas and partition of unity. The independence is exactly the change-of-variables theorem from multivariable calculus.

If \(N\) is noncompact, the integral is automatically defined for compactly supported forms. For forms without compact support, convergence must be checked. In most physics examples in these notes, the domain is compact, such as \(S^1\), \(S^2\), or the Brillouin torus \(T^2\), or the form has suitable decay.

What integration of forms really means. A differential form is first an alternating multilinear object on tangent vectors. To integrate it, we pull it back to a parameter domain, rewrite it as a coefficient times the standard oriented volume form, and then integrate that coefficient by ordinary calculus.

Integrating a two-form over a surface

Let \(\Sigma\) be a two-dimensional surface with local parameters \((u,v)\). A two-form can be integrated over \(\Sigma\) by pulling it back to the \((u,v)\) parameter domain and integrating the coefficient of \(du\wedge dv\).

This is the conceptual reason pullback matters: it converts forms on the target space into forms on the parameter space where we know how to integrate.
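The pull-back-then-integrate recipe can be carried out numerically for a concrete surface. The sketch below makes several assumptions for illustration: the surface is the unit sphere with the spherical parametrization \(\sigma(\theta,\phi)\), the two-form is \(\omega=x\,dy\wedge dz+y\,dz\wedge dx+z\,dx\wedge dy\), and the parameter-domain integral uses the midpoint rule. The coefficient of \(d\theta\wedge d\phi\) in \(\sigma^*\omega\) is \(\omega(\partial_\theta\sigma,\partial_\phi\sigma)\).

```python
import math

def sigma_with_partials(theta, phi):
    """Point sigma(theta, phi) on the unit sphere and its two partial derivatives."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    point = (st * cp, st * sp, ct)
    d_theta = (ct * cp, ct * sp, -st)
    d_phi = (-st * sp, st * cp, 0.0)
    return point, d_theta, d_phi

def omega(point, a, b):
    """omega = x dy^dz + y dz^dx + z dx^dy evaluated on tangent vectors a, b."""
    x, y, z = point
    return (x * (a[1] * b[2] - a[2] * b[1])
            + y * (a[2] * b[0] - a[0] * b[2])
            + z * (a[0] * b[1] - a[1] * b[0]))

def integrate_two_form(form, parametrization, n_theta=200, n_phi=200):
    """Midpoint-rule integral of the pulled-back coefficient over the parameter domain."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            point, s_theta, s_phi = parametrization(theta, phi)
            total += form(point, s_theta, s_phi) * d_theta * d_phi
    return total

flux = integrate_two_form(omega, sigma_with_partials)
print(flux, 4.0 * math.pi)
```

With the parameter order \((\theta,\phi)\) the pulled-back coefficient is \(\sin\theta\), so the result should be close to \(4\pi\). Swapping the order of the parameters would reverse the orientation and flip the sign.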

Pullback

Let \[f:M\to N\] be a smooth map. Pullback takes differential forms on \(N\) and produces differential forms on \(M\): \[f^*:\Omega^k(N)\to\Omega^k(M).\] For a function \(h:N\to\mathbb{R}\), \[f^*h=h\circ f.\] For a one-form, use the rule \[f^*(dy^i)=d(f^i)=\frac{\partial f^i}{\partial x^a}dx^a.\] Thus, if \[\alpha=\alpha_i(y)dy^i\] on \(N\), then \[f^*\alpha=\alpha_i(f(x))\frac{\partial f^i}{\partial x^a}dx^a.\]

Checkpoint: pullback in one line. If a map is given by \(y^i=f^i(x)\), then replace every \(y^i\) by \(f^i(x)\) and every \(dy^i\) by \(d(f^i(x))\).

Pullback respects the two main operations: \[f^*(\alpha\wedge\beta)=f^*\alpha\wedge f^*\beta,\] \[f^*(d\alpha)=d(f^*\alpha).\]
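The coordinate rule \(f^*\alpha=\alpha_i(f(x))\,\frac{\partial f^i}{\partial x^a}\,dx^a\) is easy to test on the polar-coordinate map. The sketch below is an illustration under assumptions: the map \(f(r,\theta)=(r\cos\theta,r\sin\theta)\) and the one-form \(\alpha=-y\,dx+x\,dy\) are sample choices, the helper names are hypothetical, and the Jacobian is approximated by central differences. By hand one expects \(f^*\alpha=r^2\,d\theta\).

```python
import math

def f(x):
    """Polar-coordinate map (r, theta) -> (r cos theta, r sin theta)."""
    r, theta = x
    return (r * math.cos(theta), r * math.sin(theta))

def alpha(y):
    """Components (alpha_1, alpha_2) of alpha = -y dx + x dy at the point y."""
    return (-y[1], y[0])

def pullback_one_form(form, mapping, x, eps=1e-6):
    """Components (f^* alpha)_a = alpha_i(f(x)) * d f^i / d x^a at x."""
    comps = form(mapping(x))
    result = []
    for a in range(len(x)):
        plus, minus = list(x), list(x)
        plus[a] += eps
        minus[a] -= eps
        fp, fm = mapping(plus), mapping(minus)
        # Column a of the Jacobian, by central differences.
        column = [(fp[i] - fm[i]) / (2.0 * eps) for i in range(len(fp))]
        result.append(sum(c * j for c, j in zip(comps, column)))
    return result

r, theta = 2.0, 0.9
pulled = pullback_one_form(alpha, f, (r, theta))
print(pulled)
```

The \(dr\) component should vanish and the \(d\theta\) component should equal \(r^2\), matching the substitution rule in the checkpoint above.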

Flows, Pullbacks, Pushforwards, and Lie Derivatives

Lie derivatives combine three earlier ideas: vector fields, flows, and pullback/pushforward. We first give the precise definitions and then allow ourselves the standard shorthand.

Flows of vector fields

Let \(X\in\Gamma(TM)\) be a smooth vector field. A flow is a family of maps that moves points according to \(X\).

Definition 10 (Local flow). A local flow of \(X\) is a smooth map \[\Phi:D\subset \mathbb{R}\times M\to M, \qquad (t,p)\mapsto \Phi_t(p),\] where \(D\) is an open set containing \(\{0\}\times M\), such that \[\Phi_0(p)=p\] and, for every fixed \(p\), the curve \[\gamma_p(t):=\Phi_t(p)\] satisfies \[\dot\gamma_p(t)=X_{\gamma_p(t)}.\] Using the rigorous velocity definition from the previous section, this condition means \[(\gamma_p)_{*t}\left(\left.\frac{d}{dt}\right|_t\right) = X_{\gamma_p(t)}.\] Equivalently, for every \(f\in C^\infty(M)\), \[\left.\frac{d}{dt}\right|_{t=t_0} f(\Phi_t(p)) = X_{\Phi_{t_0}(p)}[f].\]

For small enough \(t\), \(\Phi_t\) is a diffeomorphism onto its image. Where both sides are defined, \[\Phi_{t+s}=\Phi_t\circ\Phi_s, \qquad \Phi_0=\mathrm{id}_M.\] If the flow exists for all \(t\in\mathbb{R}\) and all \(p\in M\), the vector field is called complete. On a compact manifold every smooth vector field is complete, but compactness is not part of the definition.

In local coordinates, \[X=X^\mu(x)\partial_\mu,\] the flow equation becomes the ordinary differential equation \[\frac{d x^\mu(t)}{dt}=X^\mu(x(t)), \qquad x^\mu(0)=x^\mu(p).\] This coordinate ODE is a representation of the intrinsic equation \(\dot\gamma_p(t)=X_{\gamma_p(t)}\).
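The coordinate ODE can be integrated numerically to approximate a flow. The sketch below is only an illustration: it assumes the rotation field \(X=-y\,\partial_x+x\,\partial_y\) on \(\mathbb{R}^2\), whose exact flow is rotation by angle \(t\), and it swaps in the classical fourth-order Runge-Kutta scheme as the integrator. The function names are hypothetical.

```python
import math

def X(p):
    """Components of the rotation field X = -y d/dx + x d/dy."""
    x, y = p
    return (-y, x)

def flow(field, p, t, steps=200):
    """Approximate Phi_t(p) by classical fourth-order Runge-Kutta steps."""
    h = t / steps
    x = tuple(p)
    for _ in range(steps):
        k1 = field(x)
        k2 = field(tuple(xi + 0.5 * h * ki for xi, ki in zip(x, k1)))
        k3 = field(tuple(xi + 0.5 * h * ki for xi, ki in zip(x, k2)))
        k4 = field(tuple(xi + h * ki for xi, ki in zip(x, k3)))
        x = tuple(xi + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
                  for xi, a, b, c, d in zip(x, k1, k2, k3, k4))
    return x

p = (1.0, 0.0)
direct = flow(X, p, 0.8)                    # Phi_{0.8}(p)
composed = flow(X, flow(X, p, 0.3), 0.5)    # Phi_{0.5}(Phi_{0.3}(p))
exact = (math.cos(0.8), math.sin(0.8))
print(direct, composed, exact)
```

All three points should agree to high accuracy, which illustrates both the flow equation and the composition law \(\Phi_{t+s}=\Phi_t\circ\Phi_s\) for this complete vector field.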

Conversely, a one-parameter family of local diffeomorphisms \(\Phi_t\) satisfying \(\Phi_0=\mathrm{id}_M\) defines a vector field by \[X_p= \left.\frac{d}{dt}\right|_{t=0}\Phi_t(p) := (c_p)_{*0}\left(\left.\frac{d}{dt}\right|_{0}\right),\] where \(c_p(t)=\Phi_t(p)\). Thus the slogan “differentiate the flow to get the vector field” means “push forward the canonical tangent vector \(d/dt\) along the trajectory curve.”

Pullback and pushforward under diffeomorphisms

A diffeomorphism \[\phi:M\to M\] acts on functions by pullback: \[(\phi^*f)(p)=f(\phi(p)).\] It acts on tangent vectors by pushforward: \[\phi_{*p}:T_pM\to T_{\phi(p)}M.\] For a one-form \(\alpha\), the pullback is defined by \[(\phi^*\alpha)_p(v)=\alpha_{\phi(p)}(\phi_{*p}v), \qquad v\in T_pM.\] For a \(k\)-form \(\omega\), \[(\phi^*\omega)_p(v_1,\ldots,v_k) = \omega_{\phi(p)}(\phi_{*p}v_1,\ldots,\phi_{*p}v_k).\]

A vector field can also be pulled back by a diffeomorphism, but one must use the inverse pushforward: \[(\phi^*Y)_p = (\phi^{-1})_{*\phi(p)}\bigl(Y_{\phi(p)}\bigr) \in T_pM.\] Equivalently, \[(\phi^*Y)(f)=\phi^*\left(Y\bigl((\phi^{-1})^*f\bigr)\right).\] This formula is often the most useful one in proofs.

Why pullback for forms but inverse pushforward for vector fields. A one-form at \(\phi(p)\) can eat the pushed-forward vector \(\phi_*v\). Hence forms pull back directly. A vector field value \(Y_{\phi(p)}\) lives at \(\phi(p)\), but a pulled-back vector field at \(p\) must live in \(T_pM\), so we use \((\phi^{-1})_*\). This is why vector-field pullback requires \(\phi\) to be a diffeomorphism.

Tensor fields and notation

A type \((r,s)\) tensor at \(p\) is an element of \[(T_pM)^{\otimes r}\otimes (T_p^*M)^{\otimes s}.\] A smooth tensor field is a smooth assignment of such a tensor to every point. Examples are: \[\text{functions }(0,0), \quad \text{vector fields }(1,0), \quad \text{one-forms }(0,1), \quad \text{metrics }(0,2).\] A differential \(k\)-form is an antisymmetric type \((0,k)\) tensor field. After vector bundles are introduced, we write \[\Omega^k(M):=\Gamma(\Lambda^kT^*M)\] for the space of smooth \(k\)-forms.

General definition of the Lie derivative

Let \(X\) be a vector field with local flow \(\Phi_t\). If \(T\) is an ordinary tensor field on \(M\), define \[\boxed{ \mathcal{L}_XT= \left.\frac{d}{dt}\right|_{t=0}\Phi_t^*T. }\] For forms this uses the usual pullback. For vector fields and mixed tensors, \(\Phi_t^*\) uses inverse pushforwards in the vector slots, as above.

The Lie derivative is canonical: it uses only the smooth structure and the vector field \(X\). It does not use a metric or a connection.

Functions

For \(f\in C^\infty(M)\) and \(p\in M\), \[(\mathcal{L}_Xf)(p) = \left.\frac{d}{dt}\right|_{0} f(\Phi_t(p)) = X_p[f].\] In coordinates, \[\mathcal{L}_X f=X^\mu\partial_\mu f.\]

Vector fields and the Lie bracket

For a vector field \(Y\), \[(\Phi_t^*Y)_p=(\Phi_{-t})_{*\Phi_t(p)}Y_{\Phi_t(p)}.\] The Lie derivative is \[\mathcal{L}_XY= \left.\frac{d}{dt}\right|_{0}\Phi_t^*Y.\] We now derive the standard formula \[\boxed{\mathcal{L}_XY=[X,Y].}\]

Let \(f\in C^\infty(M)\). Using the function-action formula for pullback of a vector field, \[(\Phi_t^*Y)(f) = \Phi_t^*\left(Y(\Phi_{-t}^*f)\right).\] Differentiate at \(t=0\): \[\begin{align*} (\mathcal{L}_XY)(f) &=\left.\frac{d}{dt}\right|_{0} \Phi_t^*\left(Y(\Phi_{-t}^*f)\right)\\ &=X[Y(f)] + Y\left(\left.\frac{d}{dt}\right|_0 \Phi_{-t}^*f\right)\\ &=X[Y(f)]-Y[X(f)]. \end{align*}\] Therefore \[\mathcal{L}_XY=[X,Y], \qquad [X,Y](f):=X(Y(f))-Y(X(f)).\] In local coordinates, \[[X,Y] = \left(X^\nu\partial_\nu Y^\mu-Y^\nu\partial_\nu X^\mu\right)\partial_\mu.\]

Geometric meaning of \([X,Y]\). The bracket measures the infinitesimal failure of the two flows to commute. In local coordinates, with all coefficients evaluated at the starting point, move a small time \(\epsilon\) first along \(X\) and then along \(Y\): \[x^\mu\mapsto x^\mu+ \epsilon X^\mu+ \epsilon Y^\mu+ \epsilon^2 X^\nu\partial_\nu Y^\mu+\tfrac12\epsilon^2\left(X^\nu\partial_\nu X^\mu+Y^\nu\partial_\nu Y^\mu\right)+O(\epsilon^3).\] Moving first along \(Y\) and then along \(X\) gives \[x^\mu\mapsto x^\mu+ \epsilon Y^\mu+ \epsilon X^\mu+ \epsilon^2 Y^\nu\partial_\nu X^\mu+\tfrac12\epsilon^2\left(X^\nu\partial_\nu X^\mu+Y^\nu\partial_\nu Y^\mu\right)+O(\epsilon^3).\] The symmetric second-order terms cancel in the difference, so the endpoint difference is \[\epsilon^2\left(X^\nu\partial_\nu Y^\mu-Y^\nu\partial_\nu X^\mu\right)+O(\epsilon^3),\] which is \(\epsilon^2[X,Y]^\mu+O(\epsilon^3)\). Thus the bracket is an area-order, not length-order, measure of noncommutativity.
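The \(\epsilon^2\) scaling of the endpoint difference can be seen in a case where both flows are known exactly. The sketch below assumes the sample fields \(X=\partial_x\) and \(Y=x\,\partial_y\) on \(\mathbb{R}^2\), which are hypothetical choices for illustration. Their flows are \((x,y)\mapsto(x+t,y)\) and \((x,y)\mapsto(x,y+tx)\), and the coordinate formula gives \([X,Y]=\partial_y\).

```python
def flow_X(p, t):
    """Exact flow of X = d/dx for time t."""
    return (p[0] + t, p[1])

def flow_Y(p, t):
    """Exact flow of Y = x d/dy for time t."""
    return (p[0], p[1] + t * p[0])

def commutator_gap(p, eps):
    """Endpoint of (X then Y) minus endpoint of (Y then X), each for time eps."""
    x_then_y = flow_Y(flow_X(p, eps), eps)
    y_then_x = flow_X(flow_Y(p, eps), eps)
    return (x_then_y[0] - y_then_x[0], x_then_y[1] - y_then_x[1])

p = (0.5, -0.2)
eps = 1e-3
gap = commutator_gap(p, eps)

# Dividing by eps**2 should recover the components of [X, Y] = d/dy.
ratio = (gap[0] / eps ** 2, gap[1] / eps ** 2)
print(ratio)
```

For these two fields the gap is exactly \((0,\epsilon^2)\), so the ratio should be \((0,1)\) up to rounding, in agreement with \([X,Y]=\partial_y\).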

Differential forms and Cartan’s formula

For a \(k\)-form \(\omega\), \[\mathcal{L}_X\omega= \left.\frac{d}{dt}\right|_{0}\Phi_t^*\omega.\] The interior product \(\iota_X\omega\) is the \((k-1)\)-form obtained by inserting \(X\) into the first slot: \[(\iota_X\omega)(Y_1,\ldots,Y_{k-1}) = \omega(X,Y_1,\ldots,Y_{k-1}).\]

Theorem 1 (Cartan formula). For every differential form \(\omega\), \[\boxed{\mathcal{L}_X\omega=\iota_X d\omega+d(\iota_X\omega).}\]

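Derivation. Both sides are derivations of degree zero on the exterior algebra of forms: each operator \(D\) obeys the product rule \(D(\alpha\wedge\beta)=D\alpha\wedge\beta+\alpha\wedge D\beta\). Locally every form is a sum of wedge products of functions and one-forms, so it is enough to check functions and one-forms.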

If \(f\) is a function, then \(\iota_X f=0\), so \[\iota_Xdf+d(\iota_Xf)=df(X)=X[f]=\mathcal{L}_Xf.\] Now let \(\alpha\) be a one-form and let \(Y\) be a vector field. From the definition of Lie derivative and the fact that vector fields pull back by inverse pushforward, one obtains \[(\mathcal{L}_X\alpha)(Y)=X[\alpha(Y)]-\alpha([X,Y]).\] On the other hand, \[\begin{align*} (\iota_Xd\alpha+d(\iota_X\alpha))(Y) &=d\alpha(X,Y)+Y[\alpha(X)]\\ &=X[\alpha(Y)]-Y[\alpha(X)]-\alpha([X,Y])+Y[\alpha(X)]\\ &=X[\alpha(Y)]-\alpha([X,Y]). \end{align*}\] Thus the two sides agree on one-forms. Since both sides are compatible with wedge products, they agree on all forms. ◻

For a one-form \(\alpha=\alpha_\mu dx^\mu\), \[(\mathcal{L}_X\alpha)_\mu = X^\nu\partial_\nu\alpha_\mu+ \alpha_\nu\partial_\mu X^\nu.\] For a coordinate one-form, \[\mathcal{L}_X(dx^\mu)=d(X^\mu).\]
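The component formula for \(\mathcal{L}_X\alpha\) can be compared with the flow definition \(\left.\frac{d}{dt}\right|_0\Phi_t^*\alpha\) in a case where the flow is explicit. The sketch below is an illustration under assumptions: \(X=-y\,\partial_x+x\,\partial_y\) is the rotation field, whose flow is rotation by angle \(t\), the one-form \(\alpha=xy\,dx+x^2\,dy\) is a hypothetical sample, and the \(t\)-derivative is a central difference.

```python
import math

def alpha(p):
    """Components of the sample one-form alpha = x*y dx + x**2 dy."""
    x, y = p
    return (x * y, x * x)

def pulled_back(p, t):
    """Components of Phi_t^* alpha at p for the rotation flow Phi_t."""
    x, y = p
    c, s = math.cos(t), math.sin(t)
    image = (c * x - s * y, s * x + c * y)     # Phi_t(p)
    a1, a2 = alpha(image)
    # The Jacobian of Phi_t is the rotation matrix [[c, -s], [s, c]].
    return (a1 * c + a2 * s, -a1 * s + a2 * c)

def lie_derivative_by_flow(p, eps=1e-5):
    """Central difference in t of Phi_t^* alpha at t = 0."""
    plus, minus = pulled_back(p, eps), pulled_back(p, -eps)
    return tuple((u - v) / (2.0 * eps) for u, v in zip(plus, minus))

def lie_derivative_by_formula(p):
    """(L_X alpha)_mu = X^nu d_nu alpha_mu + alpha_nu d_mu X^nu, partials by hand."""
    x, y = p
    X = (-y, x)
    dX = ((0.0, 1.0), (-1.0, 0.0))        # dX[mu][nu] = d_mu X^nu
    dalpha = ((y, 2.0 * x), (x, 0.0))     # dalpha[nu][mu] = d_nu alpha_mu
    a = alpha(p)
    return tuple(sum(X[nu] * dalpha[nu][mu] + a[nu] * dX[mu][nu]
                     for nu in range(2))
                 for mu in range(2))

p = (0.8, -0.3)
by_flow = lie_derivative_by_flow(p)
by_formula = lie_derivative_by_formula(p)
print(by_flow, by_formula)
```

For this sample the formula gives \(\mathcal{L}_X\alpha=(2x^2-y^2)\,dx-3xy\,dy\), and the flow-based difference quotient should reproduce the same two components up to finite-difference error.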

What the Lie derivative is not

The Lie derivative is a canonical operation on ordinary tensor fields on \(M\). It is not automatically defined for a section of an arbitrary vector bundle \(E\to M\). To Lie-differentiate such a section, the flow of \(X\) on \(M\) must be lifted to a flow on \(E\), which is extra structure. A connection, introduced later, is the standard tool for differentiating sections of arbitrary vector bundles.

Canonical Vector Bundles on a Manifold

The tangent bundle

For each point \(p\in M\), there is a tangent space \(T_pM\). Collect all tangent spaces: \[TM:=\bigsqcup_{p\in M}T_pM.\] The symbol \(\bigsqcup\) means disjoint union: even if two tangent vectors have the same coordinate components, they are regarded as different if they live at different base points.

The projection is \[\pi:TM\to M, \qquad \pi(v_p)=p.\] This is the tangent bundle.

Important sentence. The tangent bundle \(TM\to M\) is a rank-\(n\) vector bundle over an \(n\)-dimensional manifold \(M\).

A vector field is a smooth choice of one tangent vector at each point: \[X(p)\in T_pM.\] Therefore a vector field is a map \[X:M\to TM\] satisfying \[\pi\circ X=\mathrm{id}_M.\] Such a map is called a section.

Vector field = section. A vector field is not the tangent bundle itself. It is a section of the tangent bundle. \[\text{vector field }X \in \Gamma(TM).\]

The cotangent bundle

Similarly, collect the cotangent spaces: \[T^*M:=\bigsqcup_{p\in M}T_p^*M.\] This is the cotangent bundle.

A one-form is a smooth choice of one covector at each point: \[\alpha(p)\in T_p^*M.\] So a one-form is a section of the cotangent bundle: \[\alpha\in\Gamma(T^*M).\]

More generally, \(k\)-forms are sections of the bundle \[\Lambda^kT^*M\to M.\]

Precise language. \(TM\), \(T^*M\), and \(\Lambda^kT^*M\) are vector bundles over \(M\). Vector fields and differential forms are sections of these bundles, not the bundles themselves.

Lie Groups from the Differential-Geometric Point of View

Before discussing principal bundles and gauge fields, we need Lie groups. In physics, Lie groups appear as symmetry groups and gauge groups. In bundle theory, they also appear as structure groups: the groups that act on fibres and describe how local trivializations are glued together.

The goal of this section is to connect the following objects without treating any of them as unexplained notation: \[G, \qquad T_eG, \qquad \mathfrak g, \qquad [\cdot,\cdot], \qquad \exp, \qquad \mathop{\mathrm{Ad}}_g, \qquad \mathop{\mathrm{ad}}_X.\]

Topological groups and Lie groups

Definition 14 (Topological group). A topological group is a group \(G\) equipped with a topology such that multiplication and inversion are continuous maps: \[G\times G\to G, \qquad (g,h)\mapsto gh,\] \[G\to G, \qquad g\mapsto g^{-1}.\]

Definition 15 (Lie group). A Lie group is a group \(G\) which is also a smooth manifold, such that multiplication and inversion are smooth maps: \[G\times G\to G, \qquad (g,h)\mapsto gh,\] \[G\to G, \qquad g\mapsto g^{-1}.\]

Thus a Lie group is simultaneously an algebraic object and a smooth manifold. Its points are group elements, and the group operations can be differentiated.

Basic Lie groups

\((\mathbb{R},+)\) is a one-dimensional Lie group.
\(U(1)=\{e^{i\theta}:\theta\in\mathbb{R}\}\) is a one-dimensional Lie group. As a manifold it is \(S^1\).
\(\mathop{\mathrm{GL}}(n,\mathbb{R})\) is the group of invertible real \(n\times n\) matrices. It is an open subset of \(M_n(\mathbb{R})\cong\mathbb{R}^{n^2}\).
\(SO(n)=\{A\in \mathop{\mathrm{GL}}(n,\mathbb{R}):A^TA=I,\det A=1\}\) is the rotation group.
\(U(n)=\{U\in\mathop{\mathrm{GL}}(n,\mathbb{C}):U^\dagger U=I\}\) is the unitary group.
\(SU(n)=\{U\in U(n):\det U=1\}\) is the special unitary group.

Finite groups can be regarded as zero-dimensional Lie groups with the discrete topology. The differential-geometric content is richest for continuous groups such as \(U(1)\), \(SU(2)\), and \(SO(3)\).

The Lie algebra as a tangent space

Let \(G\) be a Lie group and let \(e\in G\) be the identity element.

Definition 16 (Lie algebra as a vector space). The Lie algebra of \(G\) is the tangent space at the identity: \[\mathfrak g:=T_eG.\]

At this stage, \(\mathfrak g\) is only a vector space. Its elements are infinitesimal group elements: velocities of smooth curves in \(G\) passing through the identity. If \[\gamma:(-\epsilon,\epsilon)\to G, \qquad \gamma(0)=e,\] then \[X=\dot\gamma(0)\in T_eG=\mathfrak g.\]

For matrix Lie groups, this becomes concrete. If \(G\subset \mathop{\mathrm{GL}}(n,\mathbb{C})\) is a matrix Lie group, a tangent vector \(X\in T_IG\) is represented by an ordinary matrix derivative \[X=\left.\frac{d}{dt}\right|_{0}\gamma(t), \qquad \gamma(t)\in G, \quad \gamma(0)=I.\] For \(U(n)\), differentiating \(\gamma(t)^\dagger\gamma(t)=I\) at \(t=0\) gives \[X^\dagger+X=0,\] so \[\mathfrak u(n)=\{X\in M_n(\mathbb{C}):X^\dagger=-X\}.\] For \(SO(n)\), differentiating \(\gamma(t)^T\gamma(t)=I\) in the same way gives \(X^T+X=0\), so \[\mathfrak{so}(n)=\{X\in M_n(\mathbb{R}):X^T=-X\}.\] For \(SU(n)\), the additional constraint \(\det\gamma(t)=1\) differentiates to \(\mathrm{Tr}X=0\), so \[\mathfrak{su}(n)=\{X\in M_n(\mathbb{C}):X^\dagger=-X,\ \mathrm{Tr}X=0\}.\]
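The statement that velocities of curves through the identity of \(SO(n)\) are antisymmetric can be checked numerically. The following is a minimal sketch in plain Python, assuming one particular curve (rotation about the \(z\)-axis in \(SO(3)\)) and a central-difference derivative; the function names are illustrative.

```python
import math

def rotation_z(t):
    """A smooth curve in SO(3) through the identity: rotation by angle t about the z-axis."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def velocity_at_identity(curve, h=1e-6):
    """Central-difference estimate of the matrix derivative of curve(t) at t = 0."""
    plus, minus = curve(h), curve(-h)
    n = len(plus)
    return [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(n)] for i in range(n)]

# Tangent vector X = d/dt gamma(t) at t = 0, an element of so(3).
X = velocity_at_identity(rotation_z)

# Largest entry of X^T + X; it should vanish up to rounding.
antisymmetry_defect = max(abs(X[i][j] + X[j][i]) for i in range(3) for j in range(3))
```

Any other smooth curve in \(SO(3)\) with \(\gamma(0)=I\) could be passed to `velocity_at_identity` in the same way.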

The bracket on vector fields

Before defining the bracket on a Lie algebra, recall the bracket of vector fields on an arbitrary smooth manifold \(M\).

Definition 17 (Lie bracket of vector fields). For vector fields \(X,Y\in\Gamma(TM)\), the Lie bracket \([X,Y]\) is the vector field defined by \[[X,Y](f)=X(Y(f))-Y(X(f))\] for every \(f\in C^\infty(M)\).

This is exactly the same operation as the Lie derivative of \(Y\) along \(X\): \[\boxed{[X,Y]=\mathcal{L}_XY.}\] Therefore the vector-field bracket is not an additional arbitrary operation. It is the infinitesimal change of \(Y\) under the flow of \(X\), written in derivation form.

In local coordinates, \[X=X^\mu\partial_\mu, \qquad Y=Y^\mu\partial_\mu,\] one obtains \[[X,Y] = \left(X^\nu\partial_\nu Y^\mu-Y^\nu\partial_\nu X^\mu\right)\partial_\mu.\]
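The coordinate formula can be turned directly into a small numerical routine. This is a sketch under stated assumptions: vector fields are represented as Python functions from a point (a list of coordinates) to a list of components, partial derivatives are approximated by central differences, and the example fields on \(\mathbb{R}^2\) are illustrative.

```python
def lie_bracket(X, Y, p, h=1e-5):
    """Components of [X, Y] at p via [X,Y]^mu = X^nu d_nu Y^mu - Y^nu d_nu X^mu."""
    n = len(p)

    def partial(F, nu):
        # Central-difference approximation of d_nu F at p, one entry per component of F.
        forward, backward = list(p), list(p)
        forward[nu] += h
        backward[nu] -= h
        return [(a - b) / (2 * h) for a, b in zip(F(forward), F(backward))]

    Xp, Yp = X(p), Y(p)
    dX = [partial(X, nu) for nu in range(n)]  # dX[nu][mu] = d_nu X^mu
    dY = [partial(Y, nu) for nu in range(n)]  # dY[nu][mu] = d_nu Y^mu
    return [
        sum(Xp[nu] * dY[nu][mu] - Yp[nu] * dX[nu][mu] for nu in range(n))
        for mu in range(n)
    ]

def rotation(p):       # -y d_x + x d_y
    return [-p[1], p[0]]

def dilation(p):       # x d_x + y d_y
    return [p[0], p[1]]

def translation_x(p):  # d_x
    return [1.0, 0.0]

def shear(p):          # x d_y
    return [0.0, p[0]]

point = [0.7, -1.3]
commuting = lie_bracket(rotation, dilation, point)        # rotations commute with dilations
non_commuting = lie_bracket(translation_x, shear, point)  # [d_x, x d_y] = d_y
```

Because the example fields are linear or constant, the central differences are exact up to rounding; for general fields the result is only an approximation controlled by `h`.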

From vector-field bracket to Lie-algebra bracket

Now let \(G\) be a Lie group. For every \(g\in G\), left translation is the diffeomorphism \[L_g:G\to G, \qquad L_g(h)=gh.\] Given \(X\in\mathfrak g=T_eG\), define a vector field \(X^L\) on \(G\) by \[(X^L)_g=(dL_g)_eX.\] This is called the left-invariant vector field generated by \(X\).

Definition 18 (Left-invariant vector field). A vector field \(V\in\Gamma(TG)\) is left-invariant if \[(dL_g)_h(V_h)=V_{gh}\] for all \(g,h\in G\).

Every \(X\in\mathfrak g\) determines exactly one left-invariant vector field \(X^L\), and every left-invariant vector field is obtained this way by evaluating at \(e\).

The bracket of two left-invariant vector fields is again left-invariant. Indeed, diffeomorphisms preserve Lie brackets: \[(L_g)_*[V,W]=[(L_g)_*V,(L_g)_*W].\] If \(V\) and \(W\) are left-invariant, the right-hand side is \([V,W]\), so \([V,W]\) is left-invariant.

Definition 19 (Lie algebra bracket). For \(X,Y\in\mathfrak g\), define \([X,Y]\in\mathfrak g\) by \[[X^L,Y^L]=[X,Y]^L.\] Equivalently, \[[X,Y]=[X^L,Y^L]_e.\]

This answers an important conceptual question: there is a big bracket operation on all vector fields on \(G\), and the Lie-algebra bracket is its restriction to left-invariant vector fields, followed by evaluation at the identity. They are not two unrelated definitions.

What the bracket remembers. The tangent space \(T_eG\) records infinitesimal directions away from the identity. The bracket records how these infinitesimal motions fail to commute. Thus the Lie algebra is \[(\mathfrak g,[\cdot,\cdot]),\] not merely the vector space \(T_eG\).

Matrix Lie groups: proof of the commutator formula

For a matrix Lie group, the abstract bracket becomes \[\boxed{[X,Y]=XY-YX.}\] Here is a direct proof.

Let \(G\subset \mathop{\mathrm{GL}}(n,\mathbb{R})\) or \(\mathop{\mathrm{GL}}(n,\mathbb{C})\) be a matrix Lie group. Left translation by \(g\) is matrix multiplication \(h\mapsto gh\), so \[(X^L)_g=gX.\] In the ambient vector space of matrices, the vector field \(X^L\) is the map \[g\mapsto gX.\] Similarly, \[Y^L_g=gY.\] The derivative of the matrix-valued function \(g\mapsto gY\) in the direction \(gX\) is \[(gX)Y=gXY.\] The derivative of \(g\mapsto gX\) in the direction \(gY\) is \[(gY)X=gYX.\] Therefore \[[X^L,Y^L]_g=gXY-gYX=g(XY-YX).\] This is exactly the left-invariant vector field generated by \(XY-YX\). Hence \[[X,Y]=XY-YX.\]

What the group commutator loop measures

The product \[K(t,s)=\exp(tX)\exp(sY)\exp(-tX)\exp(-sY)\] is called a group commutator. If the group were Abelian, then \(K(t,s)=e\) exactly. For a non-Abelian Lie group, \(K(t,s)\) measures the failure of the small motions \(\exp(tX)\) and \(\exp(sY)\) to commute.

For a matrix group, Taylor expansion gives \[\exp(tX)=I+tX+O(t^2), \qquad \exp(sY)=I+sY+O(s^2).\] Keeping only the terms proportional to \(ts\), \[K(t,s)=I+ts(XY-YX)+O(t^2s,ts^2).\] No pure powers of \(t\) or of \(s\) survive, because \(K(t,0)=K(0,s)=I\). Thus the Lie bracket is the coefficient of the leading term in the commutator loop. This computation is not a separate definition of the bracket; it is a concrete way to see that the bracket is infinitesimal noncommutativity.
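The leading-order behaviour of the commutator loop can be observed numerically. The sketch below assumes two standard generators of \(\mathfrak{so}(3)\) as \(X\) and \(Y\), and uses a truncated power series for the matrix exponential, which is adequate only for small matrices of modest norm; all helper names are illustrative.

```python
def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def combine(A, B, c):
    """Entrywise A + c * B."""
    return [[a + c * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scaled(c, A):
    return [[c * x for x in row] for row in A]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(A, terms=25):
    """Matrix exponential by a truncated power series."""
    total = term = identity(len(A))
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]  # A^k / k!
        total = combine(total, term, 1.0)
    return total

def group_commutator(X, Y, t, s):
    """K(t, s) = exp(tX) exp(sY) exp(-tX) exp(-sY)."""
    K = identity(len(X))
    for factor in (scaled(t, X), scaled(s, Y), scaled(-t, X), scaled(-s, Y)):
        K = matmul(K, expm(factor))
    return K

# Generators of rotations about the x- and y-axes (assumed example).
LX = [[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]
LY = [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]
bracket = combine(matmul(LX, LY), matmul(LY, LX), -1.0)  # XY - YX

t = s = 1e-3
K = group_commutator(LX, LY, t, s)
estimate = scaled(1.0 / (t * s), combine(K, identity(3), -1.0))  # (K - I) / (ts)
error = max(abs(estimate[i][j] - bracket[i][j]) for i in range(3) for j in range(3))
```

The quantity `error` is expected to shrink roughly linearly as `t` and `s` are reduced, in line with the \(O(t^2s,ts^2)\) remainder.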

Structure constants

If \(\{e_a\}\) is a basis of \(\mathfrak g\), the bracket is determined by numbers \(f_{ab}{}^c\) defined by \[[e_a,e_b]=f_{ab}{}^c e_c.\] These are the structure constants in the chosen basis. They change under a change of basis, but the abstract bracket does not.
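To make the basis dependence concrete, the following sketch computes structure constants for \(\mathfrak{so}(3)\) in two bases. The basis \((e_a)_{bc}=-\epsilon_{abc}\) and the rescaled basis \(e'_a=2e_a\) are illustrative assumptions, as are the helper names.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(AB, BA)]

# Basis of so(3) with (e_a)_{bc} = -epsilon_{abc}.
E = [
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
]

def components(M):
    """Coefficients of an antisymmetric 3x3 matrix M in the basis E."""
    return [M[2][1], M[0][2], M[1][0]]

def structure_constants(basis, to_components):
    """f[a][b][c] = f_{ab}^c, read off from [e_a, e_b] = f_{ab}^c e_c."""
    n = len(basis)
    return [[to_components(commutator(basis[a], basis[b])) for b in range(n)] for a in range(n)]

f = structure_constants(E, components)

# Rescaled basis e'_a = 2 e_a: coefficients with respect to E' are half those with respect to E.
E_scaled = [[[2 * x for x in row] for row in e] for e in E]

def components_scaled(M):
    return [c / 2 for c in components(M)]

f_scaled = structure_constants(E_scaled, components_scaled)
```

In this example the constants double under the rescaling, while the bracket itself, as a bilinear map on \(\mathfrak{so}(3)\), is unchanged.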

Mathematical and physical generator conventions

Mathematicians usually define \[\mathfrak u(n)=\{X:X^\dagger=-X\},\] so elements of \(\mathfrak u(n)\) are anti-Hermitian. This is natural because exponentials of anti-Hermitian matrices are unitary.

Physicists often write unitary transformations as \[U(\theta)=\exp(-i\theta^a Q_a),\] where the \(Q_a\) are Hermitian quantum operators. In that convention, the anti-Hermitian Lie algebra element is \[X=-i\theta^a Q_a.\] Thus Hermitian operators enter because of unitary representations on Hilbert space, not because the abstract Lie algebra of \(U(n)\) is made of Hermitian matrices.
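The relation between the two conventions can be checked on a small example. This is a sketch assuming a single arbitrary Hermitian \(2\times 2\) matrix \(Q\) and a single parameter \(\theta\); the matrix exponential is a truncated power series and the names are illustrative.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dagger(A):
    """Conjugate transpose."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(A, terms=30):
    """Matrix exponential by a truncated power series (small matrices of modest norm only)."""
    total = term = identity(len(A))
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(total, term)]
    return total

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# Hermitian generator Q (physics convention) and parameter theta: assumed example values.
Q = [[1.0 + 0j, 2.0 - 1.0j], [2.0 + 1.0j, -0.5 + 0j]]
theta = 0.3

# Anti-Hermitian Lie algebra element X = -i * theta * Q (mathematics convention).
X = [[-1j * theta * q for q in row] for row in Q]
U = expm(X)

anti_hermitian_defect = max_abs_diff(dagger(X), [[-x for x in row] for row in X])
unitarity_defect = max_abs_diff(matmul(dagger(U), U), identity(2))
```

Both defects should be at rounding level, reflecting that \(X\) lies in \(\mathfrak u(2)\) and \(U=\exp(X)\) lies in \(U(2)\).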

If \[\rho:G\to U(\mathcal H)\] is a unitary representation, then its differential \[d\rho_e:\mathfrak g\to \mathfrak u(\mathcal H)\] sends Lie algebra elements to anti-Hermitian operators. Physicists then write \[d\rho_e(X)=-iQ_X\] with \(Q_X\) Hermitian.

The exponential map and one-parameter subgroups

A one-parameter subgroup of \(G\) is a smooth homomorphism \[\gamma:\mathbb{R}\to G, \qquad \gamma(t+s)=\gamma(t)\gamma(s).\] Every \(X\in\mathfrak g\) determines a unique one-parameter subgroup \(\gamma_X\) satisfying \[\gamma_X(0)=e, \qquad \dot\gamma_X(0)=X.\] The exponential map is \[\boxed{\exp(X)=\gamma_X(1),} \qquad \gamma_X(t)=\exp(tX).\] For matrix Lie groups this is the usual matrix exponential.
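For matrix groups the defining properties of \(\gamma_X(t)=\exp(tX)\) are easy to test numerically. The sketch below assumes an arbitrary traceless real \(2\times 2\) matrix \(X\) and a truncated power series for the exponential; helper names are illustrative.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(A, terms=30):
    """Matrix exponential by a truncated power series (small matrices of modest norm only)."""
    total = term = identity(len(A))
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(total, term)]
    return total

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

X = [[0.2, 1.0], [-0.5, -0.2]]  # assumed example element of the Lie algebra

def gamma(t):
    """One-parameter subgroup t -> exp(tX)."""
    return expm([[t * x for x in row] for row in X])

t, s = 0.7, -1.1

# Homomorphism property gamma(t + s) = gamma(t) gamma(s).
homomorphism_defect = max_abs_diff(gamma(t + s), matmul(gamma(t), gamma(s)))

# Initial conditions gamma(0) = I and gamma'(0) = X (central difference).
h = 1e-5
plus, minus = gamma(h), gamma(-h)
velocity = [[(a - b) / (2 * h) for a, b in zip(rp, rm)] for rp, rm in zip(plus, minus)]
initial_point_defect = max_abs_diff(gamma(0.0), identity(2))
velocity_defect = max_abs_diff(velocity, X)
```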

Is \(\exp(tX)\) a flow?

The curve \[t\mapsto \exp(tX)\] is a path in the Lie group \(G\). It is not, by itself, a flow on an arbitrary manifold. A flow is a family of maps from a manifold to itself.

However, \(\exp(tX)\) becomes a flow once \(G\) acts on something.

First, \(G\) acts on itself by right multiplication. The left-invariant vector field \(X^L\) has flow \[\Phi_t^{X^L}(h)=h\exp(tX).\] Indeed, the group law \(\exp((t+s)X)=\exp(tX)\exp(sX)\) gives \(\Phi_{t+s}^{X^L}=\Phi_t^{X^L}\circ\Phi_s^{X^L}\), and the velocity at \(t=0\) is \[\left.\frac{d}{dt}\right|_{0}h\exp(tX)=(dL_h)_eX=(X^L)_h.\] So \(t\mapsto \exp(tX)\) is the integral curve of \(X^L\) starting at \(e\), and \(h\mapsto h\exp(tX)\) is the full flow of \(X^L\).

Second, if \(G\) acts smoothly on a manifold \(M\) by a left action \[G\times M\to M, \qquad (g,p)\mapsto g\cdot p,\] then \(X\in\mathfrak g\) induces the fundamental vector field \[X_M(p)=\left.\frac{d}{dt}\right|_{0}\exp(tX)\cdot p.\] Its flow is \[\Phi_t^{X_M}(p)=\exp(tX)\cdot p.\]
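A simple instance of this construction is \(SO(3)\) acting on \(\mathbb{R}^3\) by matrix multiplication. The sketch below assumes \(X\) is the generator of rotations about the \(z\)-axis, so that \(X_M(p)=Xp\); it checks the flow property and the velocity at \(t=0\). Helper names and the sample point are illustrative.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def apply(A, v):
    """Matrix acting on a column vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(A, terms=30):
    """Matrix exponential by a truncated power series (small matrices of modest norm only)."""
    total = term = identity(len(A))
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(total, term)]
    return total

LZ = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # generator of rotations about z

def flow(t, p):
    """Phi_t(p) = exp(tX) . p for the linear action of SO(3) on R^3."""
    return apply(expm([[t * x for x in row] for row in LZ]), p)

def fundamental_field(p):
    """X_M(p) = d/dt exp(tX) . p at t = 0, which is X p for a linear action."""
    return apply(LZ, p)

p = [1.0, 2.0, 3.0]

# Flow property Phi_{t+s} = Phi_t o Phi_s.
flow_defect = max(abs(a - b) for a, b in zip(flow(0.8, p), flow(0.3, flow(0.5, p))))

# Velocity of the flow line through p at t = 0 versus X_M(p).
h = 1e-5
velocity = [(a - b) / (2 * h) for a, b in zip(flow(h, p), flow(-h, p))]
velocity_defect = max(abs(a - b) for a, b in zip(velocity, fundamental_field(p)))
```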

Exponential versus flow. \(\exp(tX)\) is a one-parameter subgroup in \(G\). It becomes a flow after \(G\) acts on a manifold. On \(G\) itself, multiplication turns it into a flow. On a representation space or physical configuration space, the group action turns it into the corresponding symmetry flow.

Right-invariant vector fields and the minus sign

Right translation is \[R_g:G\to G, \qquad R_g(h)=hg.\] For \(X\in\mathfrak g\), define the right-invariant vector field \[(X^R)_g=(dR_g)_eX.\] For a matrix group, \[(X^R)_g=Xg.\] Its flow is \[\Psi_t^{X^R}(h)=\exp(tX)h.\]

The bracket has the opposite sign: \[\boxed{[X^R,Y^R]=-[X,Y]^R.}\] For matrix groups this is immediate. The derivative of \(g\mapsto Yg\) in the direction \(Xg\) is \(YXg\), while the derivative of \(g\mapsto Xg\) in the direction \(Yg\) is \(XYg\). Therefore \[[X^R,Y^R]_g=YXg-XYg=-(XY-YX)g=-[X,Y]^R_g.\] The sign is not a mistake; it comes from using right-invariant rather than left-invariant vector fields. Principal bundles usually use right actions, so this sign convention is one reason inverse adjoint actions such as \(\mathop{\mathrm{Ad}}_{g^{-1}}\) appear naturally.
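The opposite signs for left- and right-invariant fields can be confirmed by computing both brackets as brackets of vector fields on the ambient space of matrices. This is a sketch assuming \(G=\mathop{\mathrm{GL}}(2,\mathbb{R})\), two sample matrices \(X,Y\), and central-difference directional derivatives; all names are illustrative.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def combine(A, B, c):
    """Entrywise A + c * B."""
    return [[a + c * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def directional_derivative(F, g, V, h=1e-4):
    """Central-difference derivative of the matrix-valued map F at g in the direction V."""
    plus, minus = F(combine(g, V, h)), F(combine(g, V, -h))
    return [[(a - b) / (2 * h) for a, b in zip(rp, rm)] for rp, rm in zip(plus, minus)]

def field_bracket(V, W, g):
    """[V, W]_g = DW_g(V_g) - DV_g(W_g) for vector fields on the space of matrices."""
    return combine(directional_derivative(W, g, V(g)), directional_derivative(V, g, W(g)), -1.0)

X = [[0.0, 1.0], [0.0, 0.0]]
Y = [[0.0, 0.0], [1.0, 0.0]]
C = combine(matmul(X, Y), matmul(Y, X), -1.0)  # XY - YX
g = [[2.0, 1.0], [1.0, 1.0]]                   # a sample invertible matrix

# Left-invariant fields g -> gX, g -> gY: bracket should be g (XY - YX).
left = field_bracket(lambda m: matmul(m, X), lambda m: matmul(m, Y), g)
left_defect = max_abs_diff(left, matmul(g, C))

# Right-invariant fields g -> Xg, g -> Yg: bracket should be -(XY - YX) g.
right = field_bracket(lambda m: matmul(X, m), lambda m: matmul(Y, m), g)
right_defect = max_abs_diff(right, [[-x for x in row] for row in matmul(C, g)])
```

Since all four fields are linear in the matrix entries, the finite differences here are exact up to rounding.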

The adjoint action \(\mathop{\mathrm{Ad}}_g\)

For \(g\in G\), conjugation by \(g\) is the diffeomorphism \[C_g:G\to G, \qquad C_g(h)=ghg^{-1}.\] It fixes the identity: \(C_g(e)=e\). Therefore its differential at the identity is a linear map \[(dC_g)_e:T_eG\to T_eG.\]

Definition 20 (Adjoint action). The adjoint action of \(G\) on \(\mathfrak g\) is \[\mathop{\mathrm{Ad}}_g:=(dC_g)_e:\mathfrak g\to\mathfrak g.\]

This is a pushforward: it is the tangent map of the conjugation diffeomorphism at the identity.

For matrix groups, \[\boxed{\mathop{\mathrm{Ad}}_gX=gXg^{-1}.}\] Proof: take a curve \(\gamma(t)\) in \(G\) with \(\gamma(0)=I\) and \(\dot\gamma(0)=X\). Then \[C_g(\gamma(t))=g\gamma(t)g^{-1}.\] Differentiating at \(t=0\) gives \[\left.\frac{d}{dt}\right|_0 g\gamma(t)g^{-1}=gXg^{-1}.\] In particular, replacing \(g\) by \(g^{-1}\) gives \(\mathop{\mathrm{Ad}}_{g^{-1}}X=g^{-1}Xg\).
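The proof uses only the velocity of \(\gamma\) at \(t=0\), not the curve itself. The sketch below illustrates this in \(\mathop{\mathrm{GL}}(2,\mathbb{R})\) with two different curves sharing the same velocity \(X\); the matrices `g`, `X`, `Z` and the curve names are assumptions made for the example.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

g = [[2.0, 1.0], [1.0, 1.0]]
g_inv = [[1.0, -1.0], [-1.0, 2.0]]   # inverse of g (det g = 1)
X = [[0.0, 1.0], [-1.0, 0.0]]
Z = [[0.3, -0.7], [0.5, 0.1]]        # arbitrary second-order term
I2 = [[1.0, 0.0], [0.0, 1.0]]

def curve_straight(t):
    """gamma(t) = I + tX, invertible for small t."""
    return [[I2[i][j] + t * X[i][j] for j in range(2)] for i in range(2)]

def curve_bent(t):
    """gamma(t) = I + tX + t^2 Z: same velocity at t = 0, different curve."""
    return [[I2[i][j] + t * X[i][j] + t * t * Z[i][j] for j in range(2)] for i in range(2)]

def conjugated_velocity(curve, h=1e-4):
    """Central-difference derivative of g gamma(t) g^{-1} at t = 0."""
    plus = matmul(matmul(g, curve(h)), g_inv)
    minus = matmul(matmul(g, curve(-h)), g_inv)
    return [[(a - b) / (2 * h) for a, b in zip(rp, rm)] for rp, rm in zip(plus, minus)]

Ad_g_X = matmul(matmul(g, X), g_inv)             # g X g^{-1}
round_trip = matmul(matmul(g_inv, Ad_g_X), g)    # Ad_{g^{-1}} applied to Ad_g X

straight_defect = max_abs_diff(conjugated_velocity(curve_straight), Ad_g_X)
bent_defect = max_abs_diff(conjugated_velocity(curve_bent), Ad_g_X)
```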

The infinitesimal adjoint action \(\mathop{\mathrm{ad}}_X\)

The adjoint action itself is a smooth map \[\mathop{\mathrm{Ad}}:G\to \mathop{\mathrm{GL}}(\mathfrak g), \qquad g\mapsto \mathop{\mathrm{Ad}}_g.\] Differentiating this map at the identity gives \[(d\mathop{\mathrm{Ad}})_e:T_eG\to T_I\mathop{\mathrm{GL}}(\mathfrak g).\] Since \(T_eG=\mathfrak g\) and \(T_I\mathop{\mathrm{GL}}(\mathfrak g)\cong\mathrm{End}(\mathfrak g)\), each \(X\in\mathfrak g\) gives a linear map \[\mathop{\mathrm{ad}}_X:\mathfrak g\to\mathfrak g.\]

Definition 21 (Infinitesimal adjoint action). For \(X,Y\in\mathfrak g\), \[\mathop{\mathrm{ad}}_X(Y)=\left.\frac{d}{dt}\right|_{0}\mathop{\mathrm{Ad}}_{\exp(tX)}Y.\]

The derivative here is an ordinary derivative of a curve in the vector space \(\mathfrak g\): for fixed \(Y\), the map \[t\mapsto \mathop{\mathrm{Ad}}_{\exp(tX)}Y\] is a curve in \(\mathfrak g\).

With the left-invariant convention, \[\boxed{\mathop{\mathrm{ad}}_X(Y)=[X,Y].}\] For matrix groups, \[\mathop{\mathrm{Ad}}_{\exp(tX)}Y=e^{tX}Ye^{-tX}.\] Using \[e^{tX}=I+tX+O(t^2), \qquad e^{-tX}=I-tX+O(t^2),\] we get \[e^{tX}Ye^{-tX}=Y+t(XY-YX)+O(t^2).\] Therefore \[\mathop{\mathrm{ad}}_X(Y)=XY-YX=[X,Y].\]
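The identity \(\mathop{\mathrm{ad}}_X(Y)=XY-YX\) can also be checked by differentiating the curve \(t\mapsto e^{tX}Ye^{-tX}\) numerically. This sketch assumes two standard generators of \(\mathfrak{so}(3)\), a truncated power series for the exponential, and a central difference at \(t=0\); names are illustrative.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(A, terms=30):
    """Matrix exponential by a truncated power series (small matrices of modest norm only)."""
    total = term = identity(len(A))
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        total = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(total, term)]
    return total

LX = [[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]
LY = [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]

def Ad_exp(t, X, Y):
    """Ad_{exp(tX)} Y = e^{tX} Y e^{-tX}."""
    forward = expm([[t * x for x in row] for row in X])
    backward = expm([[-t * x for x in row] for row in X])
    return matmul(matmul(forward, Y), backward)

def ad(X, Y, h=1e-5):
    """Central-difference derivative of t -> Ad_{exp(tX)} Y at t = 0."""
    plus, minus = Ad_exp(h, X, Y), Ad_exp(-h, X, Y)
    return [[(a - b) / (2 * h) for a, b in zip(rp, rm)] for rp, rm in zip(plus, minus)]

XY, YX = matmul(LX, LY), matmul(LY, LX)
commutator = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(XY, YX)]
numerical_ad = ad(LX, LY)
ad_defect = max(abs(numerical_ad[i][j] - commutator[i][j]) for i in range(3) for j in range(3))
```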

Examples: \(U(1)\), \(SU(2)\), and \(SO(3)\)

\(U(1)\)

The group \(U(1)\) is \[U(1)=\{e^{i\theta}:\theta\in\mathbb{R}\}.\] Its Lie algebra is \[\mathfrak u(1)=i\mathbb{R}.\] Because \(U(1)\) is Abelian, \[[X,Y]=0\] for all \(X,Y\in\mathfrak u(1)\). The exponential map \[\exp:i\mathbb{R}\to U(1)\] has kernel \(2\pi i\mathbb{Z}\), showing that the Lie algebra sees the local line while the Lie group is globally a circle.
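The kernel statement is simple enough to check with the standard library. In this short sketch the function name `exp_u1` and the sample angles are illustrative.

```python
import cmath
import math

def exp_u1(theta):
    """Exponential map from the Lie algebra iR to U(1): i*theta -> e^{i theta}."""
    return cmath.exp(1j * theta)

# Elements 2*pi*i*k of the Lie algebra all map to the identity of U(1).
kernel_defect = max(abs(exp_u1(2 * math.pi * k) - 1) for k in range(-3, 4))

# Consequently theta and theta + 2*pi*k give the same group element.
theta = 0.75
winding_defect = max(abs(exp_u1(theta + 2 * math.pi * k) - exp_u1(theta)) for k in range(-3, 4))

# exp is a homomorphism from (iR, +) to U(1), as expected for an Abelian group.
homomorphism_defect = abs(exp_u1(0.4 + 1.7) - exp_u1(0.4) * exp_u1(1.7))
```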

\(SU(2)\)

The group \(SU(2)\) consists of unitary \(2\times2\) matrices with determinant one. A mathematical basis of \(\mathfrak{su}(2)\) is \[e_a=-\frac{i}{2}\sigma_a,\] where \(\sigma_a\) are the Pauli matrices. Then \[[e_a,e_b]=\epsilon_{ab}{}^c e_c.\] A common physics basis is \[T_a=\frac12\sigma_a,\] with \[[T_a,T_b]=i\epsilon_{ab}{}^cT_c.\] The two conventions differ by the factor \(-i\).
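Both sets of commutation relations can be verified directly from the Pauli matrices. The sketch below uses indices \(0,1,2\) for \(a=1,2,3\); the helper names, including the closed-form `epsilon`, are illustrative.

```python
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(AB, BA)]

def scaled(c, A):
    return [[c * x for x in row] for row in A]

def epsilon(a, b, c):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

def bracket_defect(basis, factor):
    """Largest entry of [b_a, b_b] - factor * epsilon_{abc} b_c over all a, b."""
    worst = 0.0
    for a in range(3):
        for b in range(3):
            lhs = commutator(basis[a], basis[b])
            for i in range(2):
                for j in range(2):
                    rhs = sum(factor * epsilon(a, b, c) * basis[c][i][j] for c in range(3))
                    worst = max(worst, abs(lhs[i][j] - rhs))
    return worst

e = [scaled(-0.5j, s) for s in SIGMA]   # mathematical basis e_a = -(i/2) sigma_a
T = [scaled(0.5, s) for s in SIGMA]     # physics basis T_a = (1/2) sigma_a

math_defect = bracket_defect(e, 1)      # [e_a, e_b] = epsilon_{abc} e_c
physics_defect = bracket_defect(T, 1j)  # [T_a, T_b] = i epsilon_{abc} T_c
```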

\(SO(3)\)

The group \(SO(3)\) consists of rotations of \(\mathbb{R}^3\). Its Lie algebra is \[\mathfrak{so}(3)=\{X\in M_3(\mathbb{R}):X^T=-X\}.\] As Lie algebras, \[\mathfrak{su}(2)\cong\mathfrak{so}(3).\] But \(SU(2)\) and \(SO(3)\) are globally different Lie groups: \(SU(2)\) is the double cover of \(SO(3)\). Thus they have the same infinitesimal algebra but different global topology. This distinction is what allows spin-\(1/2\) representations to be linear for \(SU(2)\) but projective for \(SO(3)\).
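The two-to-one relation can be made explicit. The sketch below assumes the standard formula \(R_{ab}=\tfrac12\mathrm{Tr}(\sigma_a U\sigma_b U^\dagger)\) for the rotation induced by \(U\in SU(2)\), which is not stated in the text above, and the one-parameter subgroup \(\exp(\theta e_3)\) with \(e_3=-\tfrac{i}{2}\sigma_3\) written in closed form.

```python
import cmath
import math

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dagger(A):
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

def su2_about_z(theta):
    """exp(theta * e_3) with e_3 = -(i/2) sigma_3, in closed form."""
    return [[cmath.exp(-0.5j * theta), 0], [0, cmath.exp(0.5j * theta)]]

def rotation_from_su2(U):
    """R_ab = (1/2) Tr(sigma_a U sigma_b U^dagger), a real 3x3 matrix."""
    Ud = dagger(U)
    R = []
    for a in range(3):
        row = []
        for b in range(3):
            M = matmul(matmul(SIGMA[a], U), matmul(SIGMA[b], Ud))
            row.append(0.5 * complex(M[0][0] + M[1][1]).real)
        R.append(row)
    return R

theta = 1.2
U = su2_about_z(theta)
minus_U = [[-x for x in row] for row in U]

R_plus = rotation_from_su2(U)         # rotation by theta about the z-axis
R_minus = rotation_from_su2(minus_U)  # the same rotation: U and -U have the same image

U_full_turn = su2_about_z(2 * math.pi)        # equals -I in SU(2)
R_full_turn = rotation_from_su2(U_full_turn)  # equals the identity in SO(3)
```

A path in \(SU(2)\) from \(I\) to \(-I\) therefore projects to a closed loop in \(SO(3)\), which is the numerical shadow of the double cover.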

Lie groups as structure groups

In a fibre bundle, a structure group tells us how local product descriptions are glued together. For a rank-\(r\) real vector bundle the natural structure group is \(\mathop{\mathrm{GL}}(r,\mathbb{R})\). For a complex Hermitian vector bundle it is often \(U(r)\). For gauge theory it is usually a Lie group such as \[U(1),\quad SU(2),\quad SU(N),\quad SO(N).\] The Lie group supplies the finite gauge transformations. The Lie algebra supplies the infinitesimal gauge transformations, and it is where gauge fields and field strengths take their values.

Structure group, frame bundle, and principal bundle

There are two equivalent ways to organize the same data.

The bottom-up viewpoint starts with a vector bundle \[\pi:E\to M\] whose fibre is a model vector space \(V\). On overlaps, local trivializations are glued by transition functions \[g_{ij}:U_i\cap U_j\to \mathop{\mathrm{GL}}(V).\] If the transition functions can be chosen to land in a subgroup \[G\subseteq \mathop{\mathrm{GL}}(V),\] then one says that \(E\) has structure group \(G\), or that the structure group has been reduced to \(G\). In this language, the structure group is not an additional fibre sitting over \(M\). It is the allowed class of gluing maps for the vector fibres.

For example, a complex rank-\(N\) vector bundle has a priori structure group \(\mathop{\mathrm{GL}}(N,\mathbb{C})\). If it is equipped with a Hermitian inner product and the local frames are required to be orthonormal, then transition functions preserve the Hermitian inner product, so they lie in \[U(N)\subset \mathop{\mathrm{GL}}(N,\mathbb{C}).\] Thus a Hermitian vector bundle naturally has structure group \(U(N)\).

The top-down viewpoint starts instead with a principal \(G\)-bundle \[P\to M.\] This principal bundle is the bundle of frames, gauges, or local reference systems. Once \(G\) acts on a vector space \(V\) through a representation \[\rho:G\to \mathop{\mathrm{GL}}(V),\] one obtains the associated vector bundle \[P\times_G V.\]

The bridge between the two viewpoints is the frame bundle. Suppose \(E\to M\) is a rank-\(r\) real vector bundle. Its full frame bundle is \[\mathop{\mathrm{Fr}}(E)=\bigsqcup_{x\in M}\mathop{\mathrm{Fr}}(E_x),\] where \[\mathop{\mathrm{Fr}}(E_x)=\{u:\mathbb{R}^r\to E_x\; |\; u\text{ is a linear isomorphism}\}.\] A frame \(u\) is an ordered basis of \(E_x\), encoded as a linear isomorphism from the model fibre to the actual fibre. The group \(\mathop{\mathrm{GL}}(r,\mathbb{R})\) acts on the right by \[(u\cdot g)(v):=u(gv), \qquad v\in\mathbb{R}^r, \quad g\in\mathop{\mathrm{GL}}(r,\mathbb{R}).\] This action changes the frame but not the base point. With this right action, \[\mathop{\mathrm{Fr}}(E)\to M\] is a principal \(\mathop{\mathrm{GL}}(r,\mathbb{R})\)-bundle.
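At a single base point the frame-bundle formulas reduce to matrix algebra. In the sketch below a frame of a two-dimensional fibre is assumed to be stored as an invertible \(2\times 2\) matrix whose columns are the basis vectors, so the right action \(u\cdot g\) is the matrix product; the specific matrices and names are illustrative.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def apply(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def inverse2(A):
    """Inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def act(u, g):
    """Right action on frames: (u.g)(v) = u(g v), i.e. the matrix product u g."""
    return matmul(u, g)

u = [[1.0, 1.0], [0.0, 2.0]]    # a frame of the fibre E_x
g = [[2.0, 1.0], [1.0, 1.0]]    # elements of GL(2, R)
h = [[0.0, -1.0], [1.0, 0.0]]

# Right-action law: (u.g).h = u.(gh).
lhs, rhs = act(act(u, g), h), act(u, matmul(g, h))
action_defect = max(abs(a - b) for ra, rb in zip(lhs, rhs) for a, b in zip(ra, rb))

# A fixed vector w in E_x has components v = u^{-1} w in the frame u,
# and components g^{-1} v in the frame u.g: the frame changes, the vector does not.
w = [3.0, -4.0]
v_old = apply(inverse2(u), w)
v_new = apply(inverse2(act(u, g)), w)
component_defect = max(abs(a - b) for a, b in zip(v_new, apply(inverse2(g), v_old)))

# The pairs (u, v_old) and (u.g, v_new) reconstruct the same vector w.
reconstruction_defect = max(abs(a - b) for a, b in zip(apply(act(u, g), v_new), apply(u, v_old)))
```

The last check is the pointwise content of identifying \((u\cdot g, v)\) with \((u, gv)\) in the associated vector bundle.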

If \(E\) is a Hermitian complex vector bundle of rank \(N\), the unitary frame bundle is \[\mathop{\mathrm{Fr}}_U(E)=\bigsqcup_{x\in M}\mathop{\mathrm{Fr}}_U(E_x),\] where \(\mathop{\mathrm{Fr}}_U(E_x)\) is the set of unitary isomorphisms \[u:\mathbb{C}^N\to E_x.\] It is a principal \(U(N)\)-bundle. Conversely, \[E\cong \mathop{\mathrm{Fr}}_U(E)\times_{U(N)}\mathbb{C}^N.\]

Structure group versus principal bundle. A vector bundle with structure group \(G\) and a principal \(G\)-bundle are not competing ideas. The vector-bundle description says: “the fibres are vector spaces and are glued by \(G\)-valued transition functions.” The principal-bundle description says: “collect all allowed local frames into a bundle whose fibre is acted on freely and transitively by \(G\).” Passing to the frame bundle and passing to an associated vector bundle are inverse constructions up to natural isomorphism.

This is why physics often moves between the two languages without warning. When calculating wavefunctions, expectation values, and Hamiltonians, the vector bundle is natural because matter fields are sections of vector bundles. When studying gauge fields, curvature, Chern classes, and monopoles, the principal bundle is often cleaner because the connection is fundamentally a rule for moving frames.

What to remember from Lie groups

\(G\) is the group of finite (non-infinitesimal) transformations.
\(\mathfrak g=T_eG\) is the vector space of infinitesimal transformations.
\([X,Y]\) is induced from the bracket \([V,W]=\mathcal{L}_VW\) of left-invariant vector fields.
For matrix groups, \([X,Y]=XY-YX\).
\(\mathop{\mathrm{Ad}}_g=(dC_g)_e\) is finite conjugation on infinitesimal generators.
\(\mathop{\mathrm{ad}}_X=(d\mathop{\mathrm{Ad}})_e(X)\) is infinitesimal conjugation, and \(\mathop{\mathrm{ad}}_XY=[X,Y]\).