A Visual Introduction to the Zernike Polynomials


Published On Dec 7, 2020

The Zernike polynomials are very useful functions with broad applications. These polynomials have the special property that they form an orthonormal basis in the space of square-integrable functions on a circular domain. That sounds esoteric, but the concept is simple and beautiful, once unpacked into English:

First, only use Zernike polynomials if you are working with functions defined on a circular domain. Conversely, whenever you are working with functions defined on a circular domain, think about how Zernike polynomials could provide insight and make your life easier. Functions on circles come up all the time in academia and industry, e.g. in testing optical devices or analyzing metrology data for wafers in semiconductor manufacturing.

Second, you have to be familiar with the concept of taking a dot product of two functions. Whether or not you ever deal with circles, you should know this concept. To take the dot product of two functions, just multiply the two functions together, then integrate over the entire domain that the two functions are defined on. This is not only analogous to the vector dot product, but actually is the vector dot product (in which the elements of two vectors are pairwise multiplied and summed) taken to the continuous limit. The resulting scalar tells you the degree to which the two functions "point in the same direction" in function space. A dot product of zero means the two functions are totally perpendicular to each other, or "orthogonal" in math language. A dot product equal to the product of the lengths of the two functions, which is the maximal value, means that the two functions are exactly parallel (and the negative of that product means that they are antiparallel). Intermediate values imply that the functions are at an angle to one another: the dot product equals the product of the two lengths times the cosine of the angle between them. One more thing: the "length" of a function is the square root of its dot product with itself, so if the dot product of a function and itself is one, that function is said to be "normalized", meaning it has a length of 1 in function space. This concept of a function space is very real, and if it seems abstract, it is only because we organic beings are dimensionally impoverished.
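
To make the "continuous limit" idea concrete, here is a minimal Python sketch. The helper names `function_dot` and `cosine_between` are made up for this illustration, and the interval [0, 2*pi] with sine and cosine is just a convenient example: sample both functions on a fine grid, take the ordinary vector dot product of the samples, and scale by the grid spacing.

```python
import numpy as np

def function_dot(f, g, a, b, n=100_000):
    """Approximate the dot product of f and g on [a, b].

    Sample both functions at the midpoints of n equal cells, take the
    ordinary vector dot product of the samples, and scale by the cell
    width. As n grows this converges to the integral of f(x) * g(x).
    """
    dx = (b - a) / n
    x = a + (np.arange(n) + 0.5) * dx
    return float(np.dot(f(x), g(x)) * dx)

def cosine_between(f, g, a, b):
    """Cosine of the 'angle' between f and g in function space."""
    lengths = np.sqrt(function_dot(f, f, a, b) * function_dot(g, g, a, b))
    return function_dot(f, g, a, b) / lengths

a, b = 0.0, 2.0 * np.pi

# sin and cos over a full period: dot product close to 0, so orthogonal.
print(function_dot(np.sin, np.cos, a, b))

# sin with itself: close to pi, so sin is not normalized on this interval.
print(function_dot(np.sin, np.sin, a, b))

# Dividing by the length sqrt(pi) gives a normalized function.
unit_sin = lambda x: np.sin(x) / np.sqrt(np.pi)
print(function_dot(unit_sin, unit_sin, a, b))

# sin and -sin are antiparallel: the cosine of the angle is close to -1.
print(cosine_between(np.sin, lambda x: -np.sin(x), a, b))
```

The only difference from a plain vector dot product is the factor `dx`, which keeps the sum from growing as the grid gets finer.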

Now to the orthonormality of the Zernike polynomials: any two different Zernike polynomials have a dot product of zero, meaning that these functions are all orthogonal (perpendicular) to each other in function space. Moreover, with the right normalization constant in front of each one, the dot product of any Zernike polynomial with itself equals 1, meaning that these functions all have unit length in function space. Therefore, they form a nice clean basis that spans the space of all square-integrable functions (that is, functions that are not totally insane) on a circle. These functions are like the x, y, and z unit vectors that can be used to span a 3D space.
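
You can check the orthonormality numerically. The sketch below is one possible way to do it, not a reference implementation: `zernike` and `disk_quadrature` are names invented here, the radial part uses the standard factorial formula, and the normalization constant is chosen (as an assumption of this example) so that integrating the square of each polynomial over the unit disk gives 1. Other conventions scale the polynomials differently.

```python
from math import factorial
import numpy as np

def zernike(n, m, rho, theta):
    """Z_n^m on the unit disk, scaled so its square integrates to 1 over the disk."""
    ma = abs(m)
    if ma > n or (n - ma) % 2:
        raise ValueError("need |m| <= n and n - |m| even")
    radial = np.zeros_like(rho, dtype=float)
    for k in range((n - ma) // 2 + 1):
        coeff = ((-1) ** k * factorial(n - k)
                 / (factorial(k)
                    * factorial((n + ma) // 2 - k)
                    * factorial((n - ma) // 2 - k)))
        radial = radial + coeff * rho ** (n - 2 * k)
    # Cosine terms for m >= 0, sine terms for m < 0.
    angular = np.cos(ma * theta) if m >= 0 else np.sin(ma * theta)
    norm = np.sqrt(2 * (n + 1) / (np.pi * (2 if m == 0 else 1)))
    return norm * radial * angular

def disk_quadrature(n_rho=32, n_theta=64):
    """Nodes and weights for integrating over the unit disk in polar coordinates."""
    x, w = np.polynomial.legendre.leggauss(n_rho)   # Gauss-Legendre on [-1, 1]
    rho = 0.5 * (x + 1.0)                            # map the nodes to [0, 1]
    theta = (np.arange(n_theta) + 0.5) * 2.0 * np.pi / n_theta
    # The area element in polar coordinates is rho * d_rho * d_theta.
    weights = (0.5 * w * rho)[:, None] * np.full((1, n_theta), 2.0 * np.pi / n_theta)
    return rho[:, None], theta[None, :], weights

rho, theta, weights = disk_quadrature()

# All modes up to radial order 4 (15 functions).
modes = [(n, m) for n in range(5) for m in range(-n, n + 1, 2)]
values = [zernike(n, m, rho, theta) for n, m in modes]

# Matrix of all pairwise dot products; orthonormality means it is the identity.
gram = np.array([[np.sum(u * v * weights) for v in values] for u in values])
print(np.max(np.abs(gram - np.eye(len(modes)))))  # tiny, at rounding-error level
```

The matrix of pairwise dot products coming out as the identity is exactly the statement that the basis is orthonormal: zeros off the diagonal (orthogonal) and ones on it (unit length).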

The main point is that any function on a circle, at least any normal function that you will ever meet in practice, can be represented as a list of coefficients: how much of each Zernike function is in this function? Each coefficient is tied to one basis function, so the list gives direct insight into the average value of the function, its tilt along two axes, the amount of focal variation (defocus), and so on. Plus, most functions that aren't super high frequency can be accurately represented by a small handful of numbers, allowing for tremendous compression of the structure of the function.

To calculate the Zernike coefficients for a given function, just take the dot product between that function and each of the Zernike basis functions. This is, in essence, no different than representing a vector in 3 spatial dimensions as a weighted sum of x, y, and z vectors.
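
Here is a rough sketch of that recipe in Python. Everything named here (`zernike`, `disk_quadrature`, `project`, `reconstruct`) is a helper written for this example, the test function is an arbitrary off-center bump picked for illustration, and the normalization is an assumed convention in which each basis function has unit length under plain integration over the unit disk.

```python
from math import factorial
import numpy as np

def zernike(n, m, rho, theta):
    """Z_n^m on the unit disk, scaled so its square integrates to 1 over the disk."""
    ma = abs(m)
    radial = np.zeros_like(rho, dtype=float)
    for k in range((n - ma) // 2 + 1):
        coeff = ((-1) ** k * factorial(n - k)
                 / (factorial(k)
                    * factorial((n + ma) // 2 - k)
                    * factorial((n - ma) // 2 - k)))
        radial = radial + coeff * rho ** (n - 2 * k)
    angular = np.cos(ma * theta) if m >= 0 else np.sin(ma * theta)
    return np.sqrt(2 * (n + 1) / (np.pi * (2 if m == 0 else 1))) * radial * angular

def disk_quadrature(n_rho=32, n_theta=64):
    """Nodes and weights for integrating over the unit disk in polar coordinates."""
    x, w = np.polynomial.legendre.leggauss(n_rho)
    rho = 0.5 * (x + 1.0)
    theta = (np.arange(n_theta) + 0.5) * 2.0 * np.pi / n_theta
    weights = (0.5 * w * rho)[:, None] * np.full((1, n_theta), 2.0 * np.pi / n_theta)
    return rho[:, None], theta[None, :], weights

def project(f_vals, max_n, rho, theta, weights):
    """Coefficient for each mode = dot product of f with that basis function."""
    return {(n, m): float(np.sum(f_vals * zernike(n, m, rho, theta) * weights))
            for n in range(max_n + 1) for m in range(-n, n + 1, 2)}

def reconstruct(coeffs, rho, theta):
    """Weighted sum of basis functions, like rebuilding a vector from x, y, z parts."""
    return sum(c * zernike(n, m, rho, theta) for (n, m), c in coeffs.items())

rho, theta, weights = disk_quadrature()
x, y = rho * np.cos(theta), rho * np.sin(theta)
f_vals = np.exp(-2.0 * ((x - 0.2) ** 2 + y ** 2))   # arbitrary smooth test function

for max_n in (2, 4, 8):
    coeffs = project(f_vals, max_n, rho, theta, weights)
    residual = f_vals - reconstruct(coeffs, rho, theta)
    rel_err = np.sqrt(np.sum(residual ** 2 * weights) / np.sum(f_vals ** 2 * weights))
    print(max_n, len(coeffs), rel_err)   # order, number of coefficients, relative error
```

For a smooth function like this one, the relative error drops quickly as more orders are included, which is the compression effect described above: a few dozen numbers stand in for the whole function.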

Examples of how Zernike functions can be used:

Shine a beam of collimated light through a lens, or system of lenses, onto a photodetector. You'll get an intensity profile, where the intensity or brightness of the light varies over a circular region. Calculate the Zernike coefficients of that profile to see how well your lenses are aligned, what the focus is like, whether there's any spherical aberration, whether the lenses are contaminated (which might cause high-frequency noise), etc.

Or, say you're measuring some characteristic of MEMS devices on a wafer. It's often insightful to cast that metric as a function on the wafer, since tiny variations in wafer thickness or planarity often affect device characteristics. Find the Zernike coefficients of all these functions. Now you have very small sets of numbers, from which you can reconstruct the general form of the measurement variation across the wafer. You can also use these numbers for SPC charts and such. And at the end of the day a memory cost of a kB per wafer or less can give you a ton of insight.
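
Wafer data usually comes as values at discrete measurement sites rather than as a continuous function, so in practice the integral is often swapped for a least-squares fit. The sketch below takes that route, which is a substitution on my part and not the dot-product recipe itself. The site layout (a 10 mm grid on a 300 mm wafer), the "true" coefficients, and the noise level are all synthetic, made up for illustration, and `zernike` is a helper written for this example.

```python
from math import factorial
import numpy as np

def zernike(n, m, rho, theta):
    """Z_n^m on the unit disk, scaled so its square integrates to 1 over the disk."""
    ma = abs(m)
    radial = np.zeros_like(rho, dtype=float)
    for k in range((n - ma) // 2 + 1):
        coeff = ((-1) ** k * factorial(n - k)
                 / (factorial(k)
                    * factorial((n + ma) // 2 - k)
                    * factorial((n - ma) // 2 - k)))
        radial = radial + coeff * rho ** (n - 2 * k)
    angular = np.cos(ma * theta) if m >= 0 else np.sin(ma * theta)
    return np.sqrt(2 * (n + 1) / (np.pi * (2 if m == 0 else 1))) * radial * angular

# Hypothetical measurement sites: a 10 mm grid clipped to a 300 mm wafer.
wafer_radius_mm = 150.0
grid = np.arange(-145.0, 146.0, 10.0)
xs, ys = np.meshgrid(grid, grid)
inside = np.hypot(xs, ys) <= wafer_radius_mm
x_mm, y_mm = xs[inside], ys[inside]

# Scale wafer coordinates onto the unit disk.
rho = np.hypot(x_mm, y_mm) / wafer_radius_mm
theta = np.arctan2(y_mm, x_mm)

# One column per Zernike mode up to radial order 3 (10 modes).
modes = [(n, m) for n in range(4) for m in range(-n, n + 1, 2)]
design = np.column_stack([zernike(n, m, rho, theta) for n, m in modes])

# Synthetic "measurement": offset + tilt along x + bowl shape + noise.
rng = np.random.default_rng(0)
true_coeffs = np.zeros(len(modes))
true_coeffs[modes.index((0, 0))] = 5.0
true_coeffs[modes.index((1, 1))] = 0.8
true_coeffs[modes.index((2, 0))] = -0.5
measured = design @ true_coeffs + rng.normal(0.0, 0.05, size=rho.shape)

# Least-squares fit: the coefficients that best reproduce the site values.
fit, *_ = np.linalg.lstsq(design, measured, rcond=None)
for mode, c_true, c_fit in zip(modes, true_coeffs, fit):
    print(mode, round(c_true, 3), round(float(c_fit), 3))
print(fit.nbytes, "bytes per wafer for", len(modes), "coefficients")
```

Ten double-precision coefficients take 80 bytes, comfortably inside the kB-per-wafer budget, and each one can go straight onto its own SPC chart.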

Questions? Send me a message or contact me in person if you know me. I'm always happy to talk about the beautiful Zernike polynomials! Look them up on Wikipedia to see how to make them. Looks complicated but it's not.
