Category Archives: General Mathematics

First Order Differential Operators

I thought I would share some interesting things about first order differential operators, acting on functions on a supermanifold. One can reduce the theory to operators on manifolds by simply dropping the sign factors and ignoring the parity.

First order differential operators naturally include vector fields as their homogeneous “top component”.  The lowest order component is left multiplication by a smooth function.   I will attempt to demonstrate that from an algebraic point of view first order differential operators are quite natural and in some sense more fundamental than just the vector fields.

Geometrically, vector fields are key as they represent infinitesimal diffeomorphisms and are used to construct Lie derivatives as “geometric variations”.  This is probably why in introductory geometry textbooks first order differential operators are not described.

I do not think anything I am about to say is in fact new.  I assume the reader has some idea what a differential operator is and that they form a Lie algebra under the commutator bracket.  Everything here will be done on supermanifolds.

I won’t present full proofs; hopefully anyone interested can fill in any gaps.  If there are any serious mistakes, let me know.

Let \(M\) be a supermanifold and let \(C^{\infty}(M)\) denote its algebra of functions.

Definition A differential operator \(D\) is said to be a first order differential operator if and only if

\(\left[  \left[ D,f \right],g \right]1=0\),

for all \(f,g \in C^{\infty}(M)\).

We remark that we have a filtration here rather than a grading (nothing to do with the supermanifold grading), as first order operators include the zero order operators (left multiplication by a function).

Let us denote the vector  space of  first order differential operators as \(\mathcal{D}^{1}(M)\).

Theorem The first order differential operator  \(D \in\mathcal{D}^{1}(M) \) is a vector field if and only if \(D(1)=0\).

Proof Writing out the definition of a first order differential operator gives

\(D(fg) = D(f)g + (-1)^{\widetilde{D}\widetilde{f}}f D(g)- D(1)fg\),

which reduces to the strict Leibniz rule when \(D(1)=0\).  QED.

Lemma First order differential operators always decompose as

\(D = (D-D(1)) + D(1)\).

The above lemma says that we can write any first order differential operator as the sum of a vector field and a function.
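
To make the decomposition and the modified Leibniz rule a little more concrete, here is a minimal sketch in Python using sympy. It works in the purely even (ordinary manifold) case, where all sign factors drop out, and the particular coefficient functions are chosen only for illustration.

```python
# A minimal sketch (even case, not from the post): a first order operator
# D = X + h on the plane, with X = a d/dx + b d/dy a vector field and
# h = D(1) acting by multiplication.  We check the modified Leibniz rule
# D(fg) = D(f)g + f D(g) - D(1)fg.
import sympy as sp

x, y = sp.symbols('x y')
a, b, h = x*y, sp.sin(x), x + y**2        # coefficient functions, chosen arbitrarily

def D(f):
    """First order differential operator: vector field part plus multiplication by h."""
    return a*sp.diff(f, x) + b*sp.diff(f, y) + h*f

f = x**2 * y
g = sp.exp(y)

lhs = D(f*g)
rhs = D(f)*g + f*D(g) - D(sp.Integer(1))*f*g
print(sp.simplify(lhs - rhs))             # 0, the modified Leibniz rule holds
print(D(sp.Integer(1)))                   # x + y**2, so D is not a vector field
```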

Theorem A first order differential operator \(D\) is a zero order operator if and only if \(D(1) \neq 0\) and

\(\left[  D,f\right]1 = 0\),

for all \(f \in C^{\infty}(M)\).

Proof Writing out the definition of a first order differential operator and using the above Lemma we get

\(\left[  D,f\right]1 =  D(f) - (-1)^{\widetilde{D}\widetilde{f}}f D(1) = (D - D(1))(f) =0\).

Here we have used that \(D(1)\) carries the same parity as \(D\) and that functions supercommute, so \((-1)^{\widetilde{D}\widetilde{f}}f D(1) = D(1)f\). Thus the vector field \(D - D(1)\) annihilates every function and so must be the zero vector field. Then \(D = D(1)\) and we have “just” a non-zero function.  QED

We assume that the function \(D(1)\) is not zero; otherwise we can simply consider \(D\) to be the zero vector field.  This avoids the obvious “degeneracy”.

Theorem The space of first order differential operators \(D \in\mathcal{D}^{1}(M) \) is a bimodule over \(C^{\infty}(M)\).

Proof Let \(D\) be a first order differential operator and let \(k,l \in C^{\infty}(M)\)  be functions. Then using all the definitions one arrives at

\(kDl = k \left(  (-1)^{\widetilde{l} \widetilde{D}}\, l\, (D- D(1))   + D(l) \right)\),

which clearly shows that we have a first order differential operator. QED

Please note that this is different from the case of vector fields, which only form a left module. That is, \(f \circ X\) is a vector field but \(X \circ  f\) is not.
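
A quick sympy check of this asymmetry, again in the even case and with a vector field and function chosen purely for illustration:

```python
# A quick check (even case, illustrative) that f∘X is again a vector field
# while X∘f is a first order operator that is not a vector field.
import sympy as sp

x = sp.symbols('x')
X = lambda g: x**2 * sp.diff(g, x)     # a vector field on the line
f = sp.sin(x)

X_after_f = lambda g: X(f*g)           # the operator X∘f
f_after_X = lambda g: f*X(g)           # the operator f∘X

print(X_after_f(sp.Integer(1)))        # x**2*cos(x) != 0, so X∘f does not kill constants
print(f_after_X(sp.Integer(1)))        # 0, so f∘X is again a vector field
```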

Theorem The space of first order differential operators is a Lie algebra with respect to the commutator bracket.

Proof Let us assume the basic results for the commutator; that is, we take for granted that it forms a Lie algebra. The non-trivial thing is that the space of first order differential operators is closed with respect to the commutator. By the definitions we get

\(\left[ D_{1}, D_{2}  \right] = \left[(D_{1}-D_{1}(1)) , (D_{2} - D_{2}(1))  \right] + (D_{1}-D_{1}(1))(D_{2}(1)) - (-1)^{\widetilde{D}_{1} \widetilde{D}_{2}} (D_{2}- D_{2}(1)) (D_{1}(1))\),

which remains a first order differential operator. QED

Note that the above commutator contains the standard Lie bracket between vector fields.  So as one expects vector fields are closed with respect to the commutator.
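
Again in the purely even case the displayed formula reads \([D_{1}, D_{2}] = [X_{1}, X_{2}] + X_{1}(D_{2}(1)) - X_{2}(D_{1}(1))\), with \(X_{i} = D_{i} - D_{i}(1)\). Here is a small sympy check of this, with all coefficient functions invented purely for illustration.

```python
# A small check (even case, illustrative) of
# [D1, D2] = [X1, X2] + X1(h2) - X2(h1),  where Di = Xi + hi and hi = Di(1).
import sympy as sp

x, y = sp.symbols('x y')

def make_op(ax, ay, h):
    """First order operator ax d/dx + ay d/dy + h (h acting by multiplication)."""
    return lambda f: ax*sp.diff(f, x) + ay*sp.diff(f, y) + h*f

h1, h2 = sp.sin(x), x + y                 # arbitrary choices
D1, X1 = make_op(x, y**2, h1), make_op(x, y**2, 0)
D2, X2 = make_op(y, x*y, h2), make_op(y, x*y, 0)

g = sp.exp(x)*y                                       # a test function
lhs = D1(D2(g)) - D2(D1(g))                           # [D1, D2] applied to g
rhs = X1(X2(g)) - X2(X1(g)) + X1(h2)*g - X2(h1)*g     # right-hand side applied to g
print(sp.simplify(lhs - rhs))                         # 0
```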

The commutator bracket between first order differential operators is often known as THE Jacobi bracket.

So in conclusion we see that the first order differential operators have a privileged place in geometry. They form a bimodule over the smooth functions and are closed with respect to the commutator.  No other order differential operators have these properties.

They are also very important from other angles including Jacobi algebroids and related structures like Courant algebroids and generalised geometry. But these remain topics for discussion another day.

The fundamental misunderstanding of calculus

We all know the fundamental theorems of calculus; if not, check Wikipedia.  I now want to demonstrate what has been called the fundamental misunderstanding of calculus.

Let us consider the two dimensional plane and equip it with coordinates \((x,y)\).  Associated with this choice of coordinates are  the partial derivatives

\(\left( \frac{\partial}{\partial x} , \frac{\partial}{\partial y} \right)\).

You can think about these in terms of the tangent sheaf etc. if so desired, but we will keep things quite simple.

Now let us consider a change of coordinates. We will be quite specific here for illustration purposes

\(x \rightarrow \bar{x} = x +y\),

\(y \rightarrow \bar{y} = y\).

Now think about how these affect the partial derivatives. This is really just a simple change of variables.  Let me now state the fundamental misunderstanding of calculus in a way suited to our example:

Misunderstanding: Despite the coordinate x changing, the partial derivative with respect to x remains unchanged. Despite the coordinate y remaining unchanged, the partial derivative with respect to y changes.

This may at first seem counterintuitive, but it is correct. Let us prove it.

Note that we can invert the change of coordinates for x very simply

\(x = \bar{x} - \bar{y} \),

using the fact that y does not change. Then one needs to use the chain rule,

\(\frac{\partial}{\partial \bar{x}}  = \frac{\partial x}{\partial \bar{x}}\frac{\partial}{\partial x}+ \frac{\partial y}{\partial \bar{x}}\frac{\partial}{\partial y}   =    \frac{\partial}{\partial x}\),

\(\frac{\partial}{\partial \bar{y}}  = \frac{\partial x}{\partial \bar{y}}\frac{\partial}{\partial x}+ \frac{\partial y}{\partial \bar{y}}\frac{\partial}{\partial y}   =    \frac{\partial}{\partial y} - \frac{\partial}{\partial x} \).

There we are. Despite our initial gut feeling that the partial derivative with respect to y should remain unchanged, we see that it is in fact the partial derivative with respect to x that is unchanged.  This can cause some confusion the first time you see it, hence the nomenclature the fundamental misunderstanding of calculus.
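
If you want to see this without doing the chain rule by hand, here is a quick sympy check of the same change of coordinates; the symbol names are of course just my own choices.

```python
# A quick check (illustrative) of the coordinate change xbar = x + y, ybar = y.
import sympy as sp

x, y, xbar, ybar = sp.symbols('x y xbar ybar')
f = sp.Function('f')(x, y)

# The same function written in the new coordinates, using x = xbar - ybar, y = ybar.
f_new = f.subs({x: xbar - ybar, y: ybar})

# d/dxbar reproduces the derivative in the first (x) slot only, while d/dybar
# produces the y-slot derivative minus the x-slot derivative, as derived above.
print(sp.diff(f_new, xbar))
print(sp.diff(f_new, ybar))
```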

I apologise for forgetting who first named the misunderstanding.


What is mathematical physics?

This is a question that naturally arises as I consider myself to be a mathematical physicist, so I do mathematical physics. But what is mathematical physics?

I don’t think there is any fully agreed definition of mathematical physics, and like any branch of mathematics and physics it evolves and grows. That said, there are roughly two common themes:

  • Doing physics as if it were mathematics. That is, trying to apply mathematical rigour to the constructions and calculations of physics. This is often very hard, as physics often requires lots of simplifications and approximations. A lot of physical interpretation and intuition can enter into the work. Physics for the most part is not mathematics, and lots of results in theoretical physics lack the rigour required by mathematicians.

  • Studying the mathematical structures required in physics and their generalisations. Mathematics is the framework in which one constructs physical theories of nature. As such, mathematics is fundamental in developing our understanding of the world around us. This part of mathematical physics is about studying the basic structures behind physics, often with little or no direct reference to specific physical systems. This can lead to natural generalisations of the mathematical structures encountered and give a wider framework in which to understand physics.

    We see that mathematical physics is often closer to mathematics than physics. I see it as physically motivated mathematics, though this motivation is often very technical.

    Of course this overlaps to some extent with theoretical physics. However, the motivation for theoretical physics is to create and explore physical models, hopefully linking them with reality. Mathematical physics is more concerned with the mathematical structures. Both I think are important and feed off each other a lot. Without developments in mathematical physics, theoretical physics would have less mathematical structure, and without theoretical physics, mathematical physics would lack inspiration.

    What is geometry?

    This is a question I am not really sure how to answer. So I put it to Sir Michael Atiyah after his Frontiers talk in Cardiff. In essence he told me that geometry is any mathematics that you can imagine as pictures in your head.

    To me this is in fact a very satisfactory answer. Geometry, a word that literally means “Earth measurement”, has developed far beyond its roots of measuring distances, examining solid shapes and the axioms of Euclid.

    Another definition of geometry would be the study of spaces. Then we are left with the question of what is a space?

    Classically, one thinks of spaces, say topological or vector spaces, as sets of points with some other properties put upon them. The notion of a point seems deeply tied into the definition of a space.

    This is actually not the case. For example, all the information of a topological space is contained in the continuous functions on that space. Similar statements hold for differentiable manifolds, for example: everything here is encoded in the smooth functions on a manifold.

    This all started with the Gelfand representation theorem of C*-algebras, which states that “commutative C*-algebras are dual to locally compact Hausdorff spaces”. I won’t say anything about C*-algebras right now.

    In short, instead of studying the space itself one can study the functions on that space. More than this, one can take the attitude that the functions define the space. In this way you can think of the points as being a derived notion and not a fundamental one.

    This then opens up the possibility of non-commutative geometries by thinking of non-commutative algebras as “if they were” the algebra of functions on some non-commutative space.

    Also, there are other constructions found in algebraic geometry that are not set-theoretical. Ringed spaces and schemes for example.

    So, back to the opening question. Geometry seems more like a way of thinking about problems and constructions in mathematics rather than a “stand-alone” topic. Though the way I would rather put it is that all mathematics is really geometry!

    Should you believe everything on the arXiv?

    For those of you who do not know, the arXiv is an online repository of preprints in physics, mathematics, nonlinear science, computer science, quantitative biology, quantitative finance and statistics. In essence it is a place where scientists can share their work and work in progress, but note that it is not peer reviewed. The arXiv is owned and operated by Cornell University and all submissions should be in line with their academic standards.

    So, can you believe everything on the arXiv?

    In my opinion, overall the arXiv contains good material and is a vital resource for scientists to call upon. Many new works can be made public this way before being published in a scientific journal. Indeed, most of the published papers I have had call to use have versions on the arXiv. Moreover, the service is free and requires no subscription.

    However, there can be errors and mistakes in the preprints, both “editorial” and, more importantly, scientific. Interestingly, overall the arXiv is not full of crackpot ideas despite it being quite open. There is a system of endorsement in place, meaning that an established scientist should say that the first preprint you place on the arXiv is of general interest to the community. This stops the very eccentric quacks in their tracks.

    There have been some widely publicised examples of preprints on the arXiv that have caused a stir within the scientific community. Two well-known examples include

    A. Garrett Lisi, An Exceptionally Simple Theory of Everything arXiv:0711.0770v1 [hep-th],

    and more recently

    V.G.Gurzadyan and R.Penrose, Concentric circles in WMAP data may provide evidence of violent pre-Big-Bang activity arXiv:1011.3706v1 [astro-ph.CO],

    both of which have received a lot of negative criticism. Neither has to date been published in a scientific journal.

    Minor errors and editing artefacts can be corrected in updated versions of the preprints. Should preprints on the arXiv be found to be in grave error, the author can withdraw the preprint.

    With that in mind, the arXiv can be a great place to generate feedback on your work. I have done this quite successfully in the past. This allowed me to get some useful comments and suggestions on my work, errors and all.

    My advice is to view all papers and preprints with some scepticism; even full peer review cannot rule out errors. Though, always be more confident with published papers and arXiv preprints that have undergone some revision. Note that generally people who place preprints on the arXiv are not trying to con or trick anyone; any errors will be genuine mistakes.

    Integration of odd variables III

    Abstract
    We will proceed to describe how changes of variables affect the integration measure for odd variables. We will do this via a simple example rather than in full generality.

    Integration measure with two odd variables
    Let us consider the integration with respect to two odd variables, \(\{ \theta, \overline{\theta} \}\). Let us consider a change in variables of the form

    \(\theta^{\prime} = a \theta + b \overline{\theta}\),
    \( \overline{\theta}^{\prime} = c \theta + d \overline{\theta}\),

    where a,b,c,d are real numbers (or complex if you wish).

    Now, one of the basic properties of integration is that it should not depend on how you parametrise things. In other words, we get the same result whatever variables we choose to employ. For the example at hand we have

    \( \int D(\overline{\theta}^{\prime}, \theta^{\prime}) \theta^{\prime} \overline{\theta}^{\prime} = \int D(\overline{\theta}, \theta) \theta \overline{\theta}\).

    Thus, we have

    \(\int D(\overline{\theta}^{\prime}, \theta^{\prime}) (ad-bc)\theta \overline{\theta} = \int D(\overline{\theta}, \theta) \theta \overline{\theta}\).

    In order to be invariant we must have

    \(D(\overline{\theta}^{\prime}, \theta^{\prime})= \frac{1}{(ad-bc)}D(\overline{\theta}, \theta) \).

    Note that the factor (ad-bc) is the determinant of a 2×2 matrix. However, note that we divide by this factor and not multiply in the above law. This is a general feature of integration with respect to odd variables, one divides by the determinant of the transformation matrix rather than multiply. This generalises to non-linear transformations that mix even and odd coordinates on a supermanifold. This is the famous Berezinian. A detailed discussion is outside the remit of this introduction.
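
Here is a minimal sketch of this computation in Python. The representation of a function of two odd variables as a four-component tuple is my own bookkeeping device, not part of the original discussion.

```python
# A minimal sketch (illustrative): elements of the Grassmann algebra on theta,
# thetabar stored as (c0, ct, ctb, cttb), meaning
# c0 + ct*theta + ctb*thetabar + cttb*theta*thetabar.
import sympy as sp

def gmul(u, v):
    """Grassmann multiplication, using theta^2 = thetabar^2 = 0 and
    thetabar*theta = -theta*thetabar."""
    u0, ut, utb, uttb = u
    v0, vt, vtb, vttb = v
    return (u0*v0,
            u0*vt + ut*v0,
            u0*vtb + utb*v0,
            u0*vttb + uttb*v0 + ut*vtb - utb*vt)

a, b, c, d = sp.symbols('a b c d')
theta_p    = (0, a, b, 0)      # theta'    = a*theta + b*thetabar
thetabar_p = (0, c, d, 0)      # thetabar' = c*theta + d*thetabar

top = gmul(theta_p, thetabar_p)
print(top[3])                  # a*d - b*c, so the measure must carry the factor 1/(ad - bc)
```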

    Furthermore, note that the transformation law for the measure is really the same as the transformation law for derivatives. Thus, the Berezin measure is really a mixture of algebraic and differential ideas.

    What next?
    I think this should end our discussion of the elementary properties of analysis with odd variables. I hope it has been useful to someone!

    Integration of odd variables II

    Abstract
    We now proceed to define integration with respect to odd variables.

    The fundamental theorem of calculus for odd variables
    Let us consider just one odd variable. This will be sufficient for our purposes for now. Following the direct analogy with integration of functions over a circle the fundamental theorem of calculus states

    \(\int D\theta \frac{\partial f(\theta)}{\partial \theta} =0\).

    We use the notation \(D\theta \) for the measure rather than \(d\theta \) as the measure cannot be associated with a one-form. We will discuss this in more detail another time.

    Definition of integration
    Recall that the general form of a function in one odd variable is

    \(f(\theta) = a + \theta b\),

    with a and b being real numbers. Thus from the fundamental theorem we have

    \(\int D\theta b =0\).

    In particular this implies

    \(\int D\theta =0\).

    Then we have

    \(\int D\theta f(\theta) = a \int D\theta + b \int D\theta \:\: \theta = b \int D\theta\:\: \theta \).

    Thus to define integration all we have to do is define the normalisation

    \(\int D\theta\:\: \theta\).

    The choice made by Berezin was to set this to unity. Other choices are also just as valid. Thus,

    \(\int D\theta f(\theta) = b\).

    Integration for several odd variables
    For the case of more than one odd variable one simply uses

    \(\int D(\theta_{1}, \theta_{2} , \cdots \theta_{n})f(\theta) = \int D\theta_{1} \int D \theta_{2} \cdots \int D\theta_{n} f(\theta)\).

    Example Consider two odd variables.

    \(\int D(\overline{\theta}, \theta) \left( f_{0} + \theta \:f + \overline{\theta}\: \overline{f} + \theta \overline{\theta}F \right) = F \).

    The general rule is that (taking care with signs) the integration with respect to the measure \(D(\theta_{1}, \theta_{2} , \cdots \theta_{n})\) of a function picks out the coefficient of the \(\theta_{1}, \theta_{2} , \cdots \theta_{n}\) term.

    Integration and differentiation are the same!
    From the above we see that differentiation with respect to an odd variable is the same as integration with respect to the odd variable. This explains why we cannot associate a “top-form” with the measure. This will become more apparent when we discuss changes of variables.
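
A tiny sketch of the one-variable case in Python, with the pair representation of \(f(\theta) = a + \theta b\) being my own bookkeeping choice:

```python
# A tiny sketch (illustrative): one odd variable, f(theta) = a + theta*b stored as (a, b).
def d_dtheta(f):
    """Odd derivative: d/dtheta (a + theta*b) = b."""
    a, b = f
    return (b, 0)

def berezin(f):
    """Berezin integral with the normalisation  int Dtheta theta = 1."""
    a, b = f
    return b

f = (3.0, 5.0)                       # f(theta) = 3 + 5*theta
print(berezin(f))                    # 5.0 : integration picks out the theta coefficient
print(berezin(d_dtheta(f)))          # 0.0 : the fundamental theorem, int Dtheta df/dtheta = 0
print(d_dtheta(f)[0] == berezin(f))  # True : differentiation and integration agree
```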

    What next?
    Next we will examine how changing variables in the integration affects the measure. We will see that things look “upside down” as compared with the integration of real variables. This is anticipated by the equivalence of integration and differentiation.

    Integration of odd variables I

    Abstract
    Before we consider odd variables, let us describe how to algebraically define integration of functions over the circle.

    Functions on the circle
    Recall the Fourier expansion. It is well known that any continuous function on the circle is of the form

    \(f(x) = \frac{a_{0}}{2} + \sum_{n=1}^{\infty}\left( a_{n} \cos(nx) + b_{n}\sin(nx) \right) \),

    with the a’s and b’s being constants, i.e. independent of the variable x.

    The fundamental theorem of calculus
    The fundamental theorem of calculus states that

    \(\int_{S^{1}} dx \: \frac{\partial f(x)}{\partial x } = 0 \),

    as functions on the circle are periodic.

    Integration of functions
    It turns out that integration of functions over the circle can be defined algebraically up to a choice in measure. To see this observe

    \(\int_{S^{1}} dx f(x) = \int_{S^{1}} dx \frac{a_{0}}{2} + \int_{S^{1}} dx \sum_{n=1}^{\infty}\left( a_{n} \cos(nx) + b_{n}\sin(nx) \right)\)

    Then we can write

    \(\int_{S^{1}} dx f(x) = \frac{a_{0}}{2} \int_{S^{1}} dx + \int_{S^{1}} dx \frac{\partial }{\partial x} \sum_{n=1}^{\infty} \left ( \frac{a_{n}}{n}\sin(nx) + \frac{- b_{n}}{n} \cos(nx) \right)\)

    to get via the fundamental theorem of calculus

    \(\int_{S^{1}} dx f(x) = \frac{a_{0}}{2} \int_{S^{1}} dx\).

    So we have just about defined integration completely algebraically from the fundamental theorem of calculus. All we have to do is specify the normalisation

    \(\int_{S^{1}} dx \).

    The standard choice would be

    \(\int_{S^{1}} dx = 2 \pi\),

    to get back to our usual notion of integration of periodic functions. Though it would be quite consistent to consider some other normalisation, say to unity.

    Anyway, up to a normalisation the integration of functions over the circle selects the “constant term” of the corresponding Fourier expansion.
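
As a sanity check, here is a quick sympy computation with a truncated Fourier series; the particular coefficients are picked at random for illustration.

```python
# A quick check (illustrative): the integral over the circle of a truncated
# Fourier series only sees the constant term, times the normalisation int dx = 2*pi.
import sympy as sp

x = sp.symbols('x')
a0, a1, b1, a2 = 3, 2, -1, 5                     # arbitrary Fourier coefficients
f = sp.Rational(a0, 2) + a1*sp.cos(x) + b1*sp.sin(x) + a2*sp.cos(2*x)

total = sp.integrate(f, (x, 0, 2*sp.pi))
print(total)                                     # 3*pi
print(sp.Rational(a0, 2) * 2*sp.pi)              # 3*pi, i.e. (a0/2) times int dx = 2*pi
```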

    What next?
    So, the above construction demonstrates that integration of functions over a domain without boundaries can be defined algebraically, up to a normalisation. This served as the basis for Berezin who defined the notion of integration of odd variables.

    Recall that odd variables have no topology and no boundaries. The integration with respect to such variables cannot be in the sense of Riemann. However, thinking of functions of odd variables in analogy with periodic functions, integration can be defined algebraically. We will describe this next time.

    Differential calculus of odd variables.

    Abstract
    Here we will define the notion of differentiation with respect to an odd variable and examine some basic properties.

    Definition
    Differentiation with respect to an odd variable is completely and uniquely defined via the following rules:

    1. \(\frac{ \partial \theta^{\beta} }{\partial \theta^{\alpha}} = \delta_{\alpha}^{\beta} \).
    2. Linearity:
      \(\frac{\partial}{\partial \theta }(a f(\theta)) = a \frac{\partial}{\partial \theta } f(\theta)\).
      \(\frac{\partial}{\partial \theta }( f(\theta) + g(\theta)) = \frac{\partial}{\partial \theta }f(\theta) + \frac{\partial}{\partial \theta } g(\theta)\).
    3. Leibniz rule:
      \(\frac{\partial}{\partial \theta }( f(\theta)g(\theta)) = \frac{\partial f(\theta)}{\partial \theta }g(\theta) + (-1)^{\widetilde{f}} f(\theta) \frac{\partial g(\theta)}{\partial \theta } \).

    The operator \(\frac{\partial }{\partial \theta }\) is odd, that is it changes the parity of the function it acts on. This must be taken care of when applying Leibniz’s rule.

    Elementary properties
    It is easy to see that

    \(\frac{\partial}{\partial \theta^{\alpha}}\frac{\partial}{\partial \theta^{\beta}}+ \frac{\partial}{\partial \theta^{\beta}}\frac{\partial}{\partial \theta^{\alpha}}=0\),

    in particular

    \(\left( \frac{\partial}{\partial \theta} \right)^{2}=0\).

    Example
    \(\frac{\partial}{\partial \theta} (a + \theta b+ \overline{\theta}c + \theta \overline{\theta} d ) = b + \overline{\theta}d\).

    Example
    \(\frac{\partial}{\partial \overline{\theta}} (a + \theta b+ \overline{\theta}c + \theta \overline{\theta} d ) = c- \theta d\).
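
For concreteness, here is a minimal sketch of these rules in Python, storing a function of two odd variables as its four coefficients; this representation is just my own bookkeeping, not part of the original post.

```python
# A minimal sketch (illustrative): f = (a, b, c, d) means
# a + theta*b + thetabar*c + theta*thetabar*d.
def d_dtheta(f):
    a, b, c, d = f
    return (b, 0, d, 0)          # b + thetabar*d

def d_dthetabar(f):
    a, b, c, d = f
    return (c, -d, 0, 0)         # c - theta*d

f = (1, 2, 3, 4)
print(d_dtheta(f))               # (2, 0, 4, 0) :  b + thetabar*d, as in the first example
print(d_dthetabar(f))            # (3, -4, 0, 0):  c - theta*d, as in the second example

# Anticommutation: d/dtheta d/dthetabar + d/dthetabar d/dtheta = 0
print(tuple(u + v for u, v in zip(d_dtheta(d_dthetabar(f)),
                                  d_dthetabar(d_dtheta(f)))))   # (0, 0, 0, 0)
```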

    Changes of variables
    Under changes of variable of the form \(\theta \rightarrow \theta^{\prime}\) the derivative transforms as standard

    \(\frac{\partial}{\partial \theta^{\prime}} = \frac{\partial\theta}{\partial \theta^{\prime}} \frac{\partial}{ \partial \theta}\).

    We will have a lot more to say about changes of variables (coordinates) another time.

    What next?
    We now know how to define and use the derivative with respect to an odd variable. Note that this was done algebraically with no mention of limits. As the functions in odd variables are polynomial the derivative was simple to define.

    Next we will take a look at integration with respect to an odd variable. We cannot think in terms of boundaries, limits or anything resembling the Riemann or Lebesgue notions of integration. Everything will need to be done algebraically.

    This will lead us to the Berezin integral which has the strange property that integration and differentiation with respect to an odd variable are the same.

    Elementary algebraic properties of superalgebras

    Abstract
    Here we will present the very basic ideas of Grassmann variables and polynomials over them.

    Grassmann algebra
    Consider a set of n odd variables \(\{ \theta^{1}, \theta^{2}, \cdots \theta^{n} \}\). By odd we will mean that they satisfy

    \( \theta^{a}\theta^{b} + \theta^{b} \theta^{a}=0\).

    Note that in particular this means \((\theta^{a})^{2}=0\). That is, the generators are nilpotent.

    The Grassmann algebra is then defined as the polynomial algebra in these variables. Thus a general function in odd variables is

    \(f(\theta) = f_{0} + \theta^{a}f_{a} + \frac{1}{2!} \theta^{a} \theta^{b}f_{ba} + \cdots + \frac{1}{n!} \theta^{a_{1}} \cdots \theta^{a_{n}}f_{a_{n}\cdots a_{1}}\).

    The coefficients we take to be real and antisymmetric. Note that the nilpotency of the odd variables means that these polynomial expansions terminate at order n, so the Grassmann algebra is complete as a polynomial algebra.

    Example If we have the algebra generated by a single odd variable \(\theta \) then polynomials are of the form

    \(a + \theta b\).

    Example If we have two odd variables \(\theta\) and \(\overline{\theta}\) then polynomials are of the form

    \(a + \theta b + \overline{\theta} c + \theta \overline{\theta} d\).

    It is quite clear that the polynomials in odd variables form a vector space. You can add such functions and multiply by a real number and the result remains a polynomial. It is also straightforward to see that we have an algebra: one can multiply two such functions together and get another.

    The space of all such functions has a natural \(\mathbb{Z}_{2}\)-grading, which we will call parity, given by the number of odd generators in each function mod 2. If the function has an even/odd number of odd variables then the function is even/odd. We will denote the parity of a function by \(\widetilde{f}= 0/1\), if it is even/odd.

    Example \(a +\theta \overline{\theta} d \) is an even function and \(\theta b + \overline{\theta} c \) is an odd function.

    Let us define the (super)commutator of such functions as

    \([f,g] = fg -(-1)^{\widetilde{f} \widetilde{g}} gf\).

    If the functions are not homogeneous, that is neither even nor odd, the commutator is extended via linearity. We see that the commutator of any two functions in odd variables vanishes. Thus we say that the algebra of functions in odd variables forms a supercommutative algebra.

    Specifically note that this means the ordering of odd functions is important.
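
A minimal sketch of this supercommutativity in Python, using a four-coefficient bookkeeping for functions of two odd variables (again my own illustrative device):

```python
# A minimal sketch (illustrative): elements of the Grassmann algebra on theta,
# thetabar stored as (c0, ct, ctb, cttb), i.e. c0 + ct*theta + ctb*thetabar
# + cttb*theta*thetabar.  We check fg = (-1)^{~f ~g} gf.
def gmul(u, v):
    u0, ut, utb, uttb = u
    v0, vt, vtb, vttb = v
    return (u0*v0,
            u0*vt + ut*v0,
            u0*vtb + utb*v0,
            u0*vttb + uttb*v0 + ut*vtb - utb*vt)   # thetabar*theta = -theta*thetabar

f    = (0, 2, 3, 0)    # an odd function:  2*theta + 3*thetabar
g    = (0, 5, -1, 0)   # another odd function
even = (1, 0, 0, 4)    # an even function: 1 + 4*theta*thetabar

print(gmul(f, g), gmul(g, f))        # fg = -gf when both factors are odd
print(gmul(f, even), gmul(even, f))  # fg = gf when one factor is even
```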

    Superspaces
    The modern approach to geometry is to define and deal with “spaces” in terms of the functions upon them. Geometrically we can think of the algebra generated by n odd variables as defining the space \(\mathbb{R}^{0|n}\). Note that no such “space” in the classical sense exists. In fact such spaces consist of only one point!

    If we promote the coefficients in the polynomials to be functions of m real variables then we have the space \(\mathbb{R}^{m|n}\). We are now most of the way to defining supermanifolds, but this would be a digression from the current issues.

    Noncommutative superalgebras
    Of course superalgebras for which the commutator generally is non-vanishing can be defined and are naturally occurring. We will encounter such things when dealing with first order differential operators acting on functions in odd variables. Geometrically these are the vector fields. Recall that the Lie bracket between vector fields over a manifold is in general non-vanishing.

    What next?
    Given the basic algebraic properties of functions in odd variables we will proceed to algebraically define how to differentiate with respect to odd variables.