Cats2D Multiphysics > Overview > Developments

## Developments

These posts are mostly related to new features, and old features made better, but occasionally I will share my thoughts on how best to use features, not just as they pertain to Cats2D but to computational fluid dynamics in general. And once in awhile we will check in on Goodwin to see what he has been up to lately.

*All unpublished results shown here are *Copyright © 2016–2019 Andrew Yeckel, all rights reserved

### Kobayashi Maru

Please read this important notice about the ownership of Cats2D, and about plagiarism committed against me. I deserve credit for my academic scholarship.

### Mondrian with a twist

A physics-based calculation was used to make this picture.

### Visit my research topics page

The "Developments" page you are now reading, which is essentially my blog, has grown enormously long. It chronicles my work of the past four years without any attempt to organize it. To make it easier to use my research, I have collated related material from this page and elsewhere on the web site into articles that are found at my research topics page. Some articles emphasize crystal growth. Others emphasize visualization or numerics. I've grouped them together in a way that makes sense to me. Here are some of the articles.:

Submerged insulator/Rotating baffle vertical Bridgman method (SIVB/RBVB)

Unstable thermal convection in vertical Bridgman growth

Computing mass transport in liquids

I will add new articles to the research topics page over time. I will also continue to blog here on the developments page, in its usual stream of consciousness style.

### The three bears

I have put together a short discussion of a situation I have encountered several times when time integrating a system under batch operation. Most recently I encountered it when studying unstable thermal convection in vertical Bridgman growth. Previously I saw similar behavior studying detached vertical Bridgman growth (DVB) and temperature gradient zone melting (TGZM). In DVB the capillary meniscus bridging the crystal to the ampoule wall becomes unstable. In TGZM the shape of the solid-liquid interface becomes unstable. These systems have in common a slow time scale for batch operation, and a fast time scale for the local instability.

The wide separation of these time scales is problematic. In each of these cases it is possible to outrun the instability using a time step small enough to capture the batch dynamics accurately, but too large to resolve the faster time scale of the instability. It is quite easy to miss the instability altogether, even when using tight error control during integration. Nonlinear analysis is endlessly surprising, which is both rewarding and frustrating. For these kinds of problems it can be a dangerous walk in the woods.

### Gilding the lily

Flow visualization is often facilitated by adding light reflecting particles to a liquid, such as aluminum dust or tiny glass beads, or by injecting dye into a liquid. Many beautiful photographs obtained this way have been published in the science literature. In 1982 Milton Van Dyke published An Album of Fluid Motion, a uniquely interesting collection of several hundred photographs contributed by experimentalists from around the world.

If someone were to follow in Van Dyke's footsteps and collect images for a new album of fluid motion, this one based on computer-generated images of theoretical flow calculations, I doubt that nearly so many worthy images could be assembled. It's a shame given the enormous aesthetic potential of the subject matter.

### Computation XI with Red, Blue and Yellow

See some of my other works based on the paintings of Piet Mondrian.

### Vetrate di Chiesa

Metis partitioning of a mesh for nested dissection of the Jacobian matrix in Cats2D:

Preaching from the pulpit again, here are my thoughts on how to rank the following activities in terms of their importance to CFD:

Numerical methods > Algorithms > Serial optimization > Parallelization

Consider Gaussian elimination. Factorizing a dense matrix requires O(N^{3}) operations. Tricks can be applied here and there to reduce the multiplier, but there are no obvious ways to reduce the exponent (there are some unobvious ways, but these are complicated and not necessarily useful). This scaling is brutal at large problem sizes. But we need Gaussian elimination, which affects everything else downstream.

To deal with this we build bigger, faster computers. When I started it was vector supercomputers. Later came parallel supercomputers. Now we have machines built from GPUs. Using any or all of these approaches to mitigate the pain of the N^{3} scaling of Gaussian elimination is futile. Increasing N by a factor of ten means increasing the operations by 1000. So parallelizing our problem across 1000 processors gains us a factor of ten in problem size if the parallelization is ideal, and a factor of five at best under actual conditions. It isn't much return for the effort.

Of course no one working in CFD factorizes a dense Jacobian. Sparse matrix methods are used instead. A simple band solver reduces the operation scaling to O(N), for example. A multifrontal solver does not scale so simply, but scales much better than O(N^{2}) on the problems solved by Cats2D. Popular Jacobian-free alternatives also generally scale better than O(N^{2}). By choosing an algorithm that takes advantage of special characteristics of the problem it is possible to reduce operations by a factor of N or more, where N is typically greater than 10^{5} and often much larger than that. Reducing the exponent on N is where the big gains are to be found and this can only be done at the algorithm level.

This is a simple example, but I have encountered this theme many times. The first working version of an algorithm can be sped up by a factor of ten, and sometimes much, much more. I once rewrote someone else's "parallel" code as a serial code. When I was done my serial version was 10,000 times faster than the parallel version. There is no way hardware improvements can compensate for a poorly chosen or badly implemented algorithm. Far too often I've heard parallel computing offered as the solution to what ails CFD. It's totally misguided.

Now let's look more closely at the progressive improvements Goodwin made to his solver these past few years. His starting point, what he calls "Basic Frontal Solver" in the graph, was written in 1992. Before discussing the performance gains in this graph, I want to emphasize that this "basic" solver is not slow. It has many useful optimizations. In this early form it was already faster than some widely used frontal solvers of its day, including the one we were using in Scriven's group.

Still back in 1992 Goodwin added two major improvements, static condensation ("SC") and simultaneous elimination of equations ("Four Pivots") by multirank outer products using BLAS functions. These changes tripled the speed of the solver, making it much faster than its competitors of the day. Note that these improvements are entirely algorithmic and serial in nature, without multithreading or other parallelism. For the next 25 years we sailed along with this powerful tool, which, remarkably, remains competitive with modern versions of UMFPACK and SuperLU.

At the end of 2014 Goodwin climbed back aboard and increased solver speed by a factor of six using nested dissection to carry out the assembly and elimination of the equations. Again this was done by improving the algorithm, rather than by parallelizing it. Using multithreading with BLAS functions to eliminate many equations simultaneously contributed another factor of three, a worthy gain, but small in comparison to the cumulative speedup of the solver by serial algorithmic improvements, which is a factor of twenty compared to the basic solver.

I've simplified things a bit because these speedups are specific to the problem used to benchmark the code. In particular they fail to consider the scaling of factorization time with problem size. This next plot shows that Goodwin's improvements have changed the scaling approximately from O(N^{1.5}) to O(N^{1.2}), primarily attributable to nested dissection. The payoff grows with problem size, the benefit of which can hardly be overstated. Not to be overlooked, he also reduced the memory requirement by a factor of ten, allowing much larger problems to fit into memory.

One million unknowns in less than ten seconds on a rather pathetic 2013 MacBook Air (two cores, 1.3 GHz, 4 Gb memory). This is just good programming. Big shots at supercomputer centers think they've hit the ball out of the park if they can get a factor of sixty speedup by parallelizing their code and running it on 256 processors. It's the Dunning-Kruger effect on steroids.

Parallelizing a poorly optimized algorithm is a waste of time. Optimizing a badly conceived algorithm is a waste of time. Applying a well conceived algorithm to an inappropriate numerical method is a waste of time. Failure to understand these things has led to a colossal waste of computing and human resources.

### Computation with Red, Blue and Yellow

Untitled. First in a series based on the paintings of Piet Mondrian.

### Sweet dreams and flying machines

An excerpt from my first publication ever, written with Stan Middleman. It was mostly his paper, but I ran some simulations using an early version of Fidap on a Cray X-MP at the brand new San Diego Supercomputer Center back in 1985. Look at those ugly streamline contours.

This was the last time I ever used a commercial CFD code. I don't miss it. Scroll down to my post "Eating the seed corn" to learn more about what has happened to CFD since then.

### Nobody does it better

Goodwin makes art, too. Out of respect for Mondrian, he restricts himself to the secondary colors.

These images are taken from his slide presentation on the frontal solver. I cried when I skimmed it, forever grateful that I didn't have to sit through the talk itself. I'm starting to worry about my dependence on his solver, however, and I'm wondering if it would be possible to have him cryogenically preserved in case I outlive him.

### Time for a fika break

Way down this page is a post titled "Lost in the hours" that compares a standard convective-diffusion model to a particle convection model computed by pathline integration. By massively seeding the flow with non-interacting particles we are able to represent mass transport in the singular limit of zero diffusion. I explore this approach here on the problem I studied in my recent post "Drifting petals" on the mixing of two compositional layers of a liquid by motion of a plunger in a cylinder approximately 15 ml in volume.

Let's start by looking at some still frames from first downstroke of the plunger. On the left is a frame from the convective-diffusion model. In the middle is a frame from the particle convection model. Red and blue particles turn into an swarming mess of purple in these visualizations, so on the right I have tried the Swedish flag colors for better contrast. I think I prefer it for the particle visualizations.

The agreement is remarkable. The particle convection model reproduces many fine details of the convective-diffusion model. This lends faith in both models, which are drastically different from one another, both in their mathematical formulations and in their algorithmic implementations. There are some issues with this comparison that are not obvious, but I will defer their discussion to the end of this post.

Mouse over the images below to see more broadly how the two models compare. After two strokes of the plunger the unmixed length scale has been reduced to the point that diffusion becomes important and the models substantially deviate from one another. In the early stages, however, when most of the mixing occurs, the particle convection model gives a very informative view of mass transport.

See both animations run simultaneously in a separate window (this might take a few moments to load).

Diffusion has largely homogenized the two layers after five piston strokes (below left). Without diffusion the particles are fairly well mixed, but not perfectly so (below right).

It should be noted that the diffusion coefficient used here, D = 10^{-8} m^{2}/s, is three to ten times larger than typical of small molecules. A smaller more realistic value is very expensive and difficult to simulate because of the sharp concentration gradients found throughout the domain. These gradients place tight restrictions on time step and element size. In Cats2D all equations are solved simultaneously on the same mesh (no sub-gridding), so these restrictions carry over to the flow equations.

If conditions are such that a particle convection model can be used to study convective mixing, it is necessary to solve only the flow equations. A factor of three reduction in CPU time is obtained by eliminating the species transport equation and its restrictions. There are fewer unknowns, a larger time step can be used, and the Jacobian is better conditioned which translates to a faster elimination by the frontal solver. The added effort needed to integrate the particle pathlines is modest. I was able to reduce the total number of elements by a factor of four without compromising the solution, which gained an additional factor of five speedup. Overall the CPU time needed to simulate one period of plunger motion was reduced from 3 hours to 12 minutes with these changes.

Now about those issues I alluded to earlier. The pathline integrations shown here have been seeded with approximately 20,000 fictitious particles uniformly spaced in the liquid. The particles represent mathematical points, but we must draw dots else we cannot see them. These dots will overlap in the drawing where two particles have approached close to one another. Consider what happens when uniformly spaced particles are subjected to an extensional flow. Spacing between the particles stretches in one direction and compresses in the other direction.

Empty space is created in the drawing when the dots overlap. The total areas of dots and empty space are not conserved, something I brought up earlier in the "Paint the sky with stars" entry half way down this page. This explains in part the appearance of white space in the particle visualizations when none was apparent in the initial condition. This is particularly obvious underneath the plunger at the start of the simulation where the flow is strongly extensional along the center axis.

However, there is more going on here than meets the eye. The equations represent conservation in a cylindrical geometry, but are visualized in a planar cross-section on the page. Areas near the center axis represent comparatively little volume. Quite a bit of plunger has been removed at the top of the stroke in terms of its planar cross-section, but most of it comes from the narrow shaft, so the volume removed is not so great as it appears. The liquid level falls only slightly to compensate for this small volume, far less than the area created underneath the plunger in the drawing.

As a consequence the total area of the liquid displayed in the drawing is larger at the top of the stroke than at the bottom, even though its volume remains constant in the problem formulation. For this reason alone empty space will increase when the plunger is withdrawn. But there is another important implication regarding the particles themselves. If we regard them as objects in the cylindrical geometry, they must represent toroids rather than spheres. To maintain a fixed volume, the cross-sectional area in the plane must increase when a particle moves towards the center axis. The concept is illustrated below using bagels.

Obviously we aren't doing this with our dots of constant size. When particles rush under the plunger as it is withdrawn they fail to cover all the new area created there because they don't grow in size to satisfy conservation. This explains the growth and decline of empty space along the center axis as the plunger is raised up and down in the visualizations.

This isn't a problem per se, we just need to be aware of it and avoid drawing unwarranted conclusions about conservation of mass from these artifacts of the visualization. The only thing conserved is the number of particles, which have no volume. Their purpose is to provide a pointwise representation of the concentration field and they do so accurately at modest computational effort.

### Working on a mystery

These short movies are from my research in the late 1990s. They show ACRT (accelerated crucible rotation technique) applied to two different vertical Bridgman systems.

Why are they so different? The system on the right experiences hardly any mixing at all, even though its rotation rate is three times larger than the system on the left (30 rpm versus 10 rpm). But it is also much smaller (1.5 cm versus 10 cm inner diameter) and its flow is much less inertial. These results are reminiscent of the animations shown in the "Drifting petals" entry below where inertia, or lack of it, has a big impact on the nature and extent of mixing.

I will be revisiting these problems soon for an in-depth analysis that only Cats2D can deliver.

### Drifting petals

I have been working on a series of segregation studies in vertical Bridgman growth and some of its variants. One area of interest is the zone leveling effect of a submerged baffle. Zone leveling is a term sometimes applied to processes that tend to redistribute chemical composition uniformly in the growth direction of the crystal. Diffusion-limited growth has a zone leveling effect if convective mixing can be avoided. A submerged baffle isolates a small confined area near the growth interface from convection in the bulk, so we expect some zone leveling to occur compared to a conventional VB system. This is indeed the case. More on this later.

Modeling segregation in melt crystal growth is difficult. Almost no one can do it well. Mass transfer is inherently time-dependent. Diffusion in liquids is always slow, so the equations are stiff in time and space, placing strict demands on element size and time step size. Failure to observe these demands often ends in numerical instability.

There are other challenges. In a moving boundary problem the domain itself is in motion, which raises important issues in how the differential equations and their discretization are cast. Cats2D uses an arbitrary Lagrangian-Eulerian (ALE) form of the equations that conserves mass and momentum accurately. Conservation boundary conditions also must be formulated carefully. Gross failure to conserve mass is likely to occur if these things are not done precisely.

In the course of this work I have run some tests on mass transport in an isothermal system in which flow is generated by a pumping motion of the baffle. The surface of the liquid rises and falls as the baffle is raised and lowered. The mesh elements expand and contract as well, so the computational domain is in motion relative to the laboratory reference frame. Many things can go wrong in a simulation like this, making it an excellent test of the code. Primarily I ran these tests because I am obsessive about validating the code, but the results are sufficiently interesting to show in their own right.

Below I show mixing of two layers of liquid initially having different concentrations of a fast diffusing molecule. The inner diameter is 2 cm and the total amount of liquid is 13.2 milliliters. The viscosity is 0.1 Pa-s, approximately that of a vegetable oil at room temperature. The diffusion coefficient is 10^{-8} m^{2}/s, two to three times faster than typical for small molecules. I used a high value to soften the numerics a bit. Even then the Peclet number equals 10000, indicating that mass transport is strongly convected under these conditions. In one case I've made the layers equal in volume, in the other case I've restricted the lower layer to the region under the baffle. The rate of motion in the animations is depicted in real time.

Mixing seems unusually slow. The Reynolds number is unity, near the Stokes limit at which flow is reversible, so fluid tends to return to a place near where it started when the baffle returns to its initial position. The interface between areas of high and low concentration is stretched by the flow, which enhances mass transfer, but diffusion across this boundary is slow and there are no large scale motions to carry mass across it. The fluid under the baffle has hardly been mixed after five up and down strokes of the baffle.

In the next results I have used the viscosity of water, 0.001 Pa-s, which increases the Reynolds number to 100. The added inertia makes the flow highly irreversible. Wakes churn around the baffle with a salutary effect on mixing. The improvement is dramatic. I've used the same diffusion coefficient as before, so the improvement is entirely attributable to convective mixing. The boundary between areas of high and low concentration stretches rapidly and spirals around itself, quickly reducing the mixing length to the diffusion length. Two strokes of the baffle are enough to mix the layers.

Conservation of mass is very accurate with total mass loss or gain less than 0.1% over the entire time integration. To achieve this I used a fine discretization with 332,391 unknowns. Variable time stepping with error control required approximately 1000 time steps per period at this Reynolds number. The wall clock time per time step on my modest laptop computer was about 10 seconds, so these animations took about 15 hours each to compute and draw the frames.

These computations allow very little compromise in numerics. A fine grid is required throughout the entire domain to capture the many sharp internal layers created by the swirling flow. Very small time steps and a high quality second-order time integrator are needed to conserve mass accurately (BDF2 was used in these computations). Numerical instability is difficult to avoid unless active time step error control is used. A robust linear equation solver is needed, and you'll be waiting around for a long time if it isn't very fast.

In a coming post I will discuss an alternative strategy for computing mixing at very high Peclet numbers that is more stable and less expensive than the convective-diffusion approach used here. Aesthetically pleasing images can be expected, as usual.

### Things that really bug me

Recently I came across the following sentence in an article at a major news site: "Powerful legs help the cassowary run up to 31 miles per hour through the dense forest underbrush."

I think we can safely assume that no one has accumulated data of sufficient quantity and accuracy to confidently assert that cassowaries reach a top speed of precisely 31, rather than say 30 or 32, miles per hour. The phrasing "up to" suggests that 31 is a ballpark figure. Then why the extra digit of precision? Is it significant?

Undoubtedly this number appeared in the original source as 50 kilometers per hour, which is a round number. I take this number to mean more than 40 and less than 60. This converts to a range of plus or minus six miles per hour. But American news sources generally convert metric quantities to our customary units, and present the result to a greater precision than implied by the original source.

I sense that this practice creates a cognitive bias in how we assess the reliability of a quantitative result. A rough estimate, perhaps of dubious reliability, is converted to a more precise value, suggesting that it was obtained by comparably precise measurement. In the case of the cassowary, the intent of the original source would have been more closely observed by rounding 31 to 30 miles per hour after the conversion.

Another such number is 3.9 inches, which comes up more often than you might think. What does that equal in metric? A very round 10 centimeters. I once watched a television documentary which claimed that satellites are capable of detecting a moving bulge on the surface of the ocean "as small as 3.9 inches," which might be used to detect enemy submarines, for example. Expressing it this way conjures a hand held ruler where we can easily distinguish 3.7 inches (it's just noise) from 4.2 inches (it's a submarine). In this perspective, a signal of 6 inches sounds strong.

Suppose we say "as small as 0.01 meters" instead of 3.9 inches, and for good measure add that a satellite flies at an orbit of 10^{5} meters or higher. A good engineer will note that the claimed precision of the measurement is 10^{-7}, or one ten-millionth, of the distance from the satellite to the measurement, and will understand that this is a difficult measurement against a noisy background. From this perspective a signal of 0.02 meters with a precision of 0.01 meters does not sound nearly as convincing as a signal of 7.8 inches with a precision of 3.9 inches.

### Stir it up

Mixing of a stratified liquid by a periodic pair of Lamb-Oseen vortexes. Mouse over image to make it go (it might take a few moments to load).

Periodic boundary conditions on the temperature field don't make much physical sense, but I was having a lazy day.

### Jumping the shark

I just came across this line in an abstract from a recent meeting on crystal growth modeling:

*Finally, we discuss plans to deploy of our model as a user-friendly software package for experimental practitioners to aid in growth of novel materials. *

It is a vaporware announcement, and I am its target. How awesome is that?

In my experience—and I daresay it is considerable—developing a "user-friendly software package" in physics-based computing is difficult, particularly for nonlinear multiphysics problems with free boundaries. I'm not even sure it is possible, the wonderful Cats2D notwithstanding.

Why would anyone who lacks a strong background in software development believe they could accomplish this using inexperienced, itinerant labor? How about a person who has never even *used,* much less written, such software?

### Cats2D art machine

Instructions: Choose one flow type, one scalar field type, and one particle path type. Go from there.

To make these I chose Hill's spherical vortex, principal angle of extension, and streaklines.

Tee shirts. Mouse pads. Maybe even a small arts grant. The possibilities boggle the imagination.

### Cats2D onsite training

### Rollin' and Tumblin'

A small but non-infinitesimal volume of fluid can experience shear and extensional deformation in addition to linear translation and rigid rotation. A useful discussion is found in Sec. 1.3.7 of these notes on kinematic decomposition of motion, and here too, particularly Sec. 3.6 on principal strain axes. Viewed this way, ordinary Couette flow can be characterized as a superposition of linear translation, rigid rotation, and pure extension.

Below I show streaklines of particles in Couette flow. Vorticity is constant and we can see that particles released at the same time have all rotated the same amount regardless of velocity or distance traveled. The rotation rate is one half the vorticity in radians per time unit. After Pi time units the particles furthest to the right have rotated a quarter turn (Pi/2).

Shown next is Poiseuille flow in planar coordinates, which has a linear distribution of vorticity. Particles rotate clockwise in the lower half of the channel where vorticity is negative, and counter-clockwise in the upper half where vorticity is positive. Rotation is fastest near the walls and drops to zero at the center line, plainly evident from the absence of particle rotation there.

These plots do a good job illustrating the translating, rolling motion of particles in simple shear flow. But a non-infinitesimal volume of fluid also experiences deformation in shear flow because one side of the particle is moving faster than the other. Particles are stretched at a 45 degree angle to the streamlines, illustrated in the sketch below. This special direction is a principal axis of stress, an axis along which flow is purely extensional.

In mathematical terms, the principal directions of a flow are given by the eigenvectors of the deviatoric stress tensor. For Newtonian and other isotropic fluids these axes are orthogonal to one another. The eigenvalues give the rate of expansion or compression along the principal axes. The trace of the tensor is zero for incompressible flow in Cartesian coordinates; hence, in planar two-dimensional problems, compression in one principal direction is always equal in magnitude to extension in the other direction. The trace is not zero in cylindrical coordinates because flow in the radial direction expands at a rate -2V/R because the coordinate system expands in that direction.

I have added a new vector plot type to Cats2D that shows the principal axis of extension in a flow scaled by the extension rate. Here is what it looks like for Poiseuille flow, where we see the same 45 degree angle everywhere, largest near the wall and zero at the center line.

Next I show the principal axis of extension in Hill's spherical vortex plotted on top of stream function contours. Along the vertical center line we can see that the principal axis forms a 45 degree angle with the streamlines, indicating shear flow. Near the horizontal center line, the angle is either close to zero or close to 90 degrees, indicating extension or compression. The largest rate of extension happens at the rear of the vortex (the left side), and the largest rate of compression occurs at the front of the vortex (the right side).

Elsewhere in the vortex the angle formed between the direction of flow and the principal axis of extension is intermediate between extensional and shear flow. I have added a new contour plot type to Cats2D that shows this quantity, the principal angle. In pure shear flow this angle equals 45 degrees, in pure extension it equals zero degrees, and in pure compression it equals 90 degrees. Shown below is the principal angle in Hill's spherical vortex, which has large areas of extension (blue) and compression (red) in the direction of flow. The angle is discontinuous at the center of the sphere, and it is singular at the vortex center, features which are awkward to depict using contours, so I've omitted contours in the upper half of the sphere. I think the continuous color image is more attractive without them.

In this animation we can see the line of particles extending in the blue regions and compressing in the red regions as it circles the sphere.

I am having a lot of fun playing around with these new types of plots, and I am rediscovering basic concepts in fluid dynamics at the same time. Cats2D would make a great teaching tool.

### Whine and cheese

Let's talk about things that annoy programmers. Three in particular stand out: programming is way harder than it looks, the boss doesn't know how to code, and unclear project specifications are a nightmare. Obviously these are closely related phenomena.

Programming isn't a binary skill, but I sense that many people perceive it that way. A programmer seems like a plumber or a nurse or maybe even a public notary. Either you are one, or you aren't one. At some point knowledge of plumbing saturates and there isn't much in the world of plumbing to learn anymore. Programming ability doesn't saturate this way. It takes years to grasp the basics, and for those who aspire to mastery, the learning never stops. The range of ability, experience, and talent among self-identified programmers is enormous.

Programming is slow, laborious work that demands constant deliberation and care. Bad things happen when it is rushed. Not everything needs to be perfect, but cutting corners in the wrong places will cripple a code. Acquiring the skills to do this well in a large complicated code takes much longer than many people imagine. It is often difficult to successfully extend the capabilities of a large code, and it is always easy to break it trying.

I once knew a professor who boasted his student was going to write a code to solve the Navier-Stokes equations using the finite element method, and do this in two weeks according to a time line they had written down together. An existing linear equation solver would be used, but otherwise the code would be written from scratch in Fortran 77 (this was in the early 1990s).

The absurdity of this cannot be overstated. Even Goodwin could not do this in fewer than three months, and he's a freak. Six months would be fast for a smart, hard working graduate student to do it. Being asked to accomplish something in two weeks when it really takes six months is confusing and demoralizing.

Unfortunately the situation hasn't gotten any better in the intervening quarter century.

### March of the Penguins

Lisa is probably the only person ever to walk out during a showing of this movie. She didn't like the way the penguins were treated. If you've watched this movie you might remember how the penguins gang together in a large circular mass as protection against the wind, and how they cooperatively circulate to keep moving cold penguins off the exposed perimeter.

In Hill's spherical vortex, vorticity varies linearly from zero at the axial centerline to a maximum of 5 at the poles of the sphere, causing the ~~penguins~~ particles to spin as they circulate. Mouse over the image below to see the effect of vorticity as they march around my imaginary kingdom. The particles spin through 3/4 of a turn every time they orbit the vortex center, regardless of their initial position.

These visualizations show linear translation and rigid rotation of small particles. A small but non-infinitesimal volume of fluid can also experience shear and extensional deformation as well. These characteristics of a flow can have a strong impact on polymer solutions and other liquids with strain-dependent properties. My next entry on principal axes of stress in a flow will delve into this issue more deeply.

### Hill's spherical vortex

There is a wonderfully simple solution representing Stokes flow recirculating inside a sphere reported in 1884 by Micaiah John Muller Hill. In its stationary form, axial velocity (left to right in the picture below) is given by U = 1 - Z^{2} - 2 R^{2}, and radial velocity (upward in the picture) is given by V = ZR. Streamlines of the exact solution are shown in the top half of the sphere, and streamlines computed by the Cats2D are shown in the bottom half. They are indistinguishable.

The velocity components are shown next, U in the top half and V in the bottom half of the sphere. These are computed by the code and they look identical to the exact solution given by the formulas above. The error between exact and computed solutions is 0.02% on this discretization of 71,364 equations.

This solution is particularly useful for testing derived quantities such as vorticity, pressure, and shear stress. The computation of these quantities is a complicated task in the code, and errors might be introduced in the post-processor even though the code computes the solution correctly. The analytical form of the exact solution is so simple it is possible to derive these quantities by inspection. Pressure equals -10Z, for example, and vorticity equals 5R. Computed values of these quantities are shown below and they match the exact solutions perfectly.

Similar plots can be made for normal and shear stresses, all components of which are linear in either R or Z.

Next I show principal stresses computed by the code. Extension is shown in the bottom half of the sphere and compression is shown in the top half. These match perfectly their analytical forms derived from the exact solution, which are nonlinear in R and Z with simple closed forms as roots of a quadratic equation.

These might appear to be weak tests with the notion that errors are masked by the apparently simple nature of the solution. But the equations were solved on the mesh shown below, the elements of which do not align with the coordinate directions. The equations are transformed to a local coordinate system on each element, and solution variables are interpolated in that system. Pressures and velocity gradients are discontinuous at element boundaries and must be smoothed by a least squares projection onto a continuous basis set before contouring. A myriad of things could go wrong that would show up very clearly in these plots.

We never tire of testing at Cats2D and I was gratified to verify all manner of derived variables in the post-processor against this exact solution. But Hill's spherical vortex is also great problem for working through basic concepts in fluid dynamics. Later I will post some instructive animations that illustrate the effect of vorticity in this flow, and I will also show some new vector plots of principal directions, or axes, of extension and compression that I've developed recently.

Goodwin introduced me to this solution, the story of which is funny enough to share. First let me say that he has been acting rather agitated about the web site lately. New entries aren't posted often enough, the content is unoriginal, there is too much social commentary, and the ideas are stale and thin. Plus some unstated concerns about my state of mind. When I started writing this entry I realized it was going to be the second one in a row to mention him. Then it hit me: I had posted eleven consecutive entries without including his name once. I had mentioned my father, my childhood surfing buddy, and Bob Ross, but not him, despite his passing resemblance to Bob Ross. I hope this entry reassures him that he remains the star of my web site.

The humorous part is that he had to turn to me for help setting up the Cats2D input files to solve this problem. Though I don't show it here, there is also a flow computed outside the sphere, and he couldn't figure out how to specify the boundary conditions at the surface of the sphere to obtain a convergent solution. I immediately recognized that he was making exactly the same mistake I had warned against in my *Just sayin' *entry found down this page. This was an entry he had been particularly critical about, the irony of which cannot be overlooked. He said it was mundane, that it lacked eye candy, and for goodness' sake it didn't have a single clever turn of phrase (I swear that is exactly how he put it). So I sent him straight down the page to re-read that entry until he got things straight: *know your methods.*

### Circling the drain

Goodwin. He's an awesome pest sometimes (lately he has taken to calling himself the QA/QC department), but once in awhile he comes up with a useful suggestion. This time it was animated particle rotation. Of course I had to do all the work.

Mouse over the images to see the action when a line of particles is released into the Lamb-Oseen vortex pair in a periodic domain. Particles rotate at one-half the vorticity of the flow. Below left shows the full periodic cell, below right shows a close-up in slow motion.

Ideas like this are why it is handy to keep him around.

### Feed your head

Suppose you are an alien humanoid from the planet Ceti Alpha IV and have just arrived on planet Earth after a long voyage. Naturally you want to unwind before conquering the planet, so you'll need to determine which intoxicants are available. Your parents raised you to make good choices, so you'll be looking for something that has low potential for both dependence and accidental overdose. On your way to the White House to demand fealty of all Earthlings you check in with the District of Columbia Department of Health, which provides you with this "safety profile" (their wording) of commonly available terrestrial intoxicants:

It turns out that your physiology is remarkably similar to humans,* so this chart should be valid for you. The closer you get to the upper right hand corner of this plot, the hairier things get. It looks like you'll never escape the pit if you wander up there and start fooling around. But look at the lower left corner. That "LSD" stuff looks like you couldn't overdose on it if you tried, and it isn't habit forming, either. With a safety profile like that, LSD could become your everyday thing, and a big money maker back on the home planet, too. So you figure it will be cool to load up on some good acid before you hit the White House to make that fealty demand. But, for some reason, when you finally get around to the White House you forget why you came in the first place.

Obviously something is missing from this picture, since it fails to consider any issues beyond dependence and lethal dosage. We could add a third axis here labeled "consequential side effects" and I'm sure that LSD is pretty far out on this axis compared to caffeine, for example. The problem here, and it is a common one, is a failure to consider all the important criteria. Once this number exceeds two or three, we cannot easily portray them all on a flat sheet of paper. Moreover, we tend to focus on those criteria we can most easily quantify, rather than grappling with less tangible things that are nevertheless important. Every visual representation of a complex problem creates a cognitive bias in how we interpret the problem. We shouldn't settle for just one view of a situation. We should always ask what is missing from this picture.

* The reason behind this is explained in the 146th episode of Star Trek: The Next Generation, *The Chase*

### The highways of my life

Morning

Afternoon

I enjoy a visit to the Scripps Pier Cam once in awhile.

### The golden rule

Conserving mass is essential. My intuition and experience tell me it is far more important to conserve mass accurately than it is to conserve momentum when solving the Navier-Stokes equations. Mass conservation is the zeroth moment of the Boltzmann transport equation, whereas momentum conservation is the first moment. Compromising the zeroth moment of an approximation seems like a bad way to start.

Cats2D uses a nine-noded element with biquadratic velocity interpolation and linear discontinuous pressure interpolation (Q_{2}/P_{-1}). This is a magic element that satisfies the LBB stability condition, meaning it is free of spurious pressure modes. It always delivers a solution, which fits the Cats2D "use it and forget it" philosophy. Another advantage is that mass is conserved exactly in an integral sense over each element. This is a strong statement of mass conservation compared to other typical LBB stable elements which lack this property, such as the biquadratic-bilinear (Q_{2}/P_{1}) element.

Very good pathlines can be produced using the biquadratic-linear discontinuous element. I show here pathlines for a lid-driven cavity flow at Re = 100 on a uniform mesh of 100x100 elements, after 1 time unit on the left and 5 time units on the right. After 5 time units some of the pathlines near the middle of the vortex have circled around more than 100 times without wandering at all. Some spreading has occurred that is most evident in the lower part of the cavity, particularly in one pathline that passes close to the corners of the lid where the velocity field cannot be accurately resolved. But performance is impressive overall.

The same test repeated with the biquadratic-bilinear element is shown below. Performance is very good in the middle of the cavity, but much worse in the lower part of the cavity. The bilinear pressure basis has three times fewer continuity constraints than the linear discontinuous basis, resulting in a less accurate pressure field and large local divergence in some parts of the mesh. In these results 1% of the elements have divergence > 0.001 and 0.1% have divergence > 1. I'm dissatisfied with its performance here, though I never use this element type anyway, simply because a continuous pressure basis is awkward to use in many of the problems that I solve.

This makes me wonder about stabilized finite element methods such as pressure-stabilized Petrov-Galerkin (PSPG) and Galerkin-Least Squares (GLS). Cats2D does not implement these methods currently so I cannot test them right now, but when I used them in the past I noticed they exhibited what seemed to me uncommonly large pointwise divergence within elements, often greater then 0.1. I conjecture this degrades pathline accuracy in these methods, but I need to carry out appropriate numerical tests to say for certain. If only there were more hours in the day.

### Alizarin crimson

"Color in sky, Prussian blue, scarlet fleece changes hue, crimson ball sinks from view" — Donovan, *Wear Your Love Like Heaven*

I've always enjoyed this song with its exotically named colors, but I learned only recently from The Joy of Painting hosted by the legendary Bob Ross that Prussian blue and alizarin crimson are common oil paint colors.

### Box of rain

Here is another visualization of the merger of a pair of Lamb-Oseen vortexes. Mouse over this image to animate particle streaklines, a new feature that I just added to Cats2D. Streaklines are identical to streamlines and pathlines in a steady flow, but differ whenever the flow is unsteady. A streakline is the locus of particles released at the same spatial point but at different times in a flow. It represents an experimental situation in which a flow is visualized by continuously generating small bubbles or releasing dye at fixed positions in a flow. In this visualization I have arranged their release along a horizontal line through the center of the vortexes.

I've also computed pathlines in which a single line of particles is released at the start and the paths they travel are traced. Here is how pathlines compare to streaklines after 0.001 seconds of vortex decay.

More comparisons of this kind are shown here along with an animation of the pathlines.

### Allegro non troppo

Cats2D solves problems in two spatial dimensions, but we inhabit a world of three. Can a problem computed in two dimensions represent a real flow? It is an important question. A great body of work has been written on two-dimensional flows, far more than on three-dimensional flows. There is good reason to believe that many two-dimensional solutions do not represent real flows, even though they may sometimes share qualitative features with one. I restrict my attention here to laminar flows, because turbulent flows are always three-dimensional.

The nature of the two-dimensional approximation depends on whether we assume translational symmetry in planar coordinates, or rotational symmetry in cylindrical coordinates. The planar approximation assumes that the system is open and infinite in extent in the third dimension. Neither is true in practice, and real systems introduce end effects neglected by the planar approximation. In the cylindrical approximation the azimuthal direction closes on itself, so that systems of finite extent can exactly satisfy the approximation without end effects, provided the system geometry is rotationally symmetric. I will examine each of these situations in turn.

Let's start with a planar example, the two-dimensional lid-driven cavity. It's a great problem, but how well does it match the flow in a three-dimensional box with a moving surface? Not very well, it turns out, not even in a long box of square cross-section far from its ends. In a cavity of aspect ratio 3:1:1, illustrated below, laminar time dependent flow ensues at moderate Reynolds number around 825, and longitudinal rolls are observed in experiments for Reynolds above 1000. The flow is strongly three-dimensional everywhere above this transition, yet remains laminar below a transition to turbulence at much higher Reynolds number around 6000 (see Aidun, Triantafillopoulos, and Benson, 1991, Phys. Fluids A 3:2081–91 for experimental visualization of these phenomena).

We might think that flow at low Reynolds number is purely two-dimensional far from the ends of the box but it isn't. No-slip boundary conditions cause the flow to decelerate at the ends of the box. The flow is necessarily three-dimensional there. Viscous drag will affect flow all the way to the center symmetry plane of the cavity, though its influence may be weak there.

Let's examine more closely how this impacts the three dimensional form of the U and V equations at the center plane, where we assume that flow in the X-Y plane is symmetric with respect to Z, namely dU/dZ and dV/dZ = 0 at Z = 0. These conditions remove an inertial term from the left side of each equation, but a viscous term that is not present in the two-dimensional form remains on the right side of the equations. These terms represent the drag effect that the end walls exert at the center plane.

Now let's examine the W component of the momentum balance and assume that W = 0 everywhere on the center plane, which implies that all derivatives of W with respect to X and Y are also zero in the plane. This assumption eliminates most of the terms in the equation.

Invoking symmetry again, we let dP/dZ = 0 which leaves d^{2} W/dZ^{2} = 0 at Z = 0. Since W and its second derivative both go to zero at the center plane under the stated assumptions, W appears to be an odd function, namely W(Z) = -W(-Z), which is intuitively obvious. Most importantly, dW/dZ *does not need to equal zero under the stated assumptions*. The final inertial term on the left side of the equation vanishes because W = 0, irrespective of the value of dW/dZ. Therefore we do not recover the two-dimensional continuity equation at the center plane.

This means that flow is not divergence free in the plane, making it fundamentally different than the two-dimensional cavity flow. An important manifestation of the difference is instability of particles to lateral perturbations on parts of the center plane, which causes them to spiral away from it. To conserve mass, particles spiral towards other parts of the center plane in compensation, setting up a recirculation in the longitudinal direction. This flow is illustrated below by path lines and transverse velocity profiles computed at Reynolds number equal to 500 (from Yeckel and Derby, 1997, *Parallel Computing)*

The effect is pronounced at this Reynolds number, but is also present and tangible at much lower Reynolds number. Critically, any transverse flow at all will disrupt the flow topology observed in the purely two-dimensional cavity flow. The corner vortexes for which the lid-driven cavity is well known are no longer entirely closed. Particles recirculating through the ends of the cavity can move from the corner vortexes to the central flow and vice versa, rather than being perpetually trapped there. Better visualization and a deeper discussion of this effect can be found in Shankar and Deshpande, 2000, Annual Review of Fluid Mechanics.

The two-dimensional idealization totally fails to capture this important characteristic of the flow because it forbids lateral motion of fluid out of the X-Y plane. This restriction also allows unrealistically large pressure gradients to form in the X-Y plane. In the three-dimensional system freedom of lateral motion mitigates the build up of pressure, altering the size and shape of the lower vortexes and delaying the onset of the upper vortex to higher Reynolds number.

I have used this example to illustrate two common reasons the planar two-dimensional equations fail to capture the essence of a real flow in three dimensions. One is the presence of end effects, the other is an instability of the two-dimensional solutions to three-dimensional disturbances. But are these things always a problem? Not necessarily. Rotationally symmetric systems are free of end effects, because the third spatial dimension, the azimuthal coordinate, closes on itself. Solutions computed in cylindrical coordinates represent realizable three-dimensional flows of practical significance, unlike the more idealized planar approximation.

Furthermore, it is natural to express the azimuthal coordinate in trigonometric functions, making it simple to formulate the linearized stability response to arbitrary three-dimensional disturbances. This gives a useful bound on the validity of a solution in a rotationally symmetric system without needing to perform an expensive simulation in three spatial coordinates. Cats2D does not presently have such a feature, but I am contemplating adding it. Many of the physical systems I study are designed to approximate rotational symmetry, particularly in melt crystal growth. It is reassuring to know that solutions computed in cylindrical coordinates represent real three-dimensional flows. It would be even more reassuring to ascertain their stability to arbitrary three-dimensional disturbances. Let's call it reality-based computing.

### Catherine wheel

Here is a sample image from my latest fun problem, the merger of a pair of Lamb-Oseen vortexes (see Brandt and Nomura, 2007, to learn more about this type of flow).

Watch this animation of a line of particles released into the merging vortexes.

### Blah blah blah, Ginger

Time to start ranting about some of my pet peeves as a scientist. This one stems from a long running discussion with my father about U.S. drug policy, in the course of which he showed me this report published by an anti-drug think tank. The report presents a flowchart, shown here, described as a model of substance abuse. Don't get caught up studying it too deeply, though. There is less here than it seems:

Chemical engineers use flowcharts like this all the time. Usually the boxes represent unit operations, e.g. a heat exchanger, a reactor vessel, a distillation column and so on. The arrows typically represent flows of materials and energy into or out of the boxes. There is no ambiguity about the relationship or connectivity of the different parts of the flowchart when it represents a chemical plant. We can invoke conservation of mass and energy to make specific statements about how the unit operations in a chart like this interact with one another. However, we can do nothing of the sort with this model of substance abuse. When I look at it, all I see is this:

The flowchart purports to identify cause and effect relationships among the boxes that are much more ambiguous than implied by those arrows connecting the boxes. Thinking in terms of the chemical plant analogy, this model of substance abuse has leakage all over the place.

I think it is fine to study the relationships between academic failure and substance abuse, and I'm sure that strong correlations exist. What bothers me is that we don't cope with ambiguity well, so we feel it necessary to impose a cause and effect based explanation on anything that troubles us. Realistically all we can do here is to identify a constellation of factors that might play a role in academic failure, one of which is substance abuse. There is no rationale for putting substance abuse at the center of the flowchart instead of the other boxes related to parent factors, school factors, etc. For some kids, the "impulsivity/attention problems" box belongs at the center. For other kids, the "deviant peer affiliation" box belongs at the center. And so on.

I doubt that any general guidelines exist for cause and effect in this situation. Every kid could have their own custom flowchart with arrows running back and forth between everything in it. Trying to use this one flowchart to understand how academic failure relates to substance abuse for an entire population of kids is like trying to use the same flowchart to understand every chemical plant, regardless of whether it refines oil, makes nylon, ferments biomass, or whatever. It is not a useful construct. At its worst, it is blatantly misleading.

This sort of problem plagues the social and political sciences, perhaps because they are trying too hard to act like the hard sciences. It particularly bothers me here, however, because this work is masquerading as medical science, which I think it plainly is not. I think a flowchart lacking a well defined set of connections among specific things or actions is a red flag. Sometimes I think we can do better in words than in pictures. This is one of those situations.

### Electric kool-aid acid test

It's interesting, whatever it is.

### Éminence grise

Let's flip through the Rolodex and see who we find there. Bunch of underachievers and malcontents, I bet. Here's one:

Nickname: *Brainy Smurf*

Institutions: *American Academy of Arts and Sciences, Harvard*

Observations: *Good editor. Responds to rewards. Resist his attempts to meddle with content or semantics.*

Other:* Knows the foreign minister. Occasionally refers to himself as Huck Finn.*

### Temba, his arms wide

I remember a day in graduate school more than thirty years ago when I came upon an eager puppy wagging his tail in Stan Middleman's lab at UCSD. It was Goodwin. He was fresh out of the Navy and he looked it. Either that or he was auditioning for a ZZ Top cover band. Apparently there was a bit of a competition in the sub fleet to grow these things, and Goodwin excelled at it.

We were from different walks of life, but my seven years spent working in beach area restaurants in a border town had prepared me for almost anything. For Goodwin it was the surfriding public he witnessed daily outside his Ocean Beach apartment that provided the Rosetta stone to my behavior.

Goodwin was the smartest undergraduate I ever met in the chemical engineering program at UCSD, besides myself of course. He was 29 years old, he had passed the Navy's notoriously difficult nuclear propulsion school, and he had served on a fast attack sub. I pity the other students in his class. Just kids! How could they possibly compete. It's a wonder they didn't assassinate him.

On the other hand, he was a teaching assistant's dream. Need a key to grade the homework? Just pull out Goodwin's work, knowing with confidence that it would be free of errors. He makes so few errors, even today, that I must strive to relish each one as I come upon it.

I find it strange that Goodwin and I have never coauthored a research publication. My published works list over sixty different coauthors yet he is not among them. This is unfortunate. Our intellectual and practical collaboration on Cats2D suffuses much of my professional work.

In each paper I've coauthored based on Cats2D simulations, Goodwin did not meet the customary standards for inclusion in the author list. His contributions merited a citation of the code in the reference section. These have piled up over the years, over fifty times now, so anyone who follows my work is likely to notice this relationship. But I think it is not enough.

The methods and algorithms that make Cats2D so special took us years to develop and master. The shared experience of doing it together planted ideas that continue to manifest in my research. I constantly draw on this well of knowledge to practice the art of simulation on complex, nonlinear problems. At any moment I might rely on something Goodwin did years, or even decades, ago that allows me to succeed at a simulation today.

This intellectual synergy deserves recognition, but it takes so long to complete the circuit that Goodwin's contributions end up buried by the passage of time. At the root of the problem is an outdated dissemination model in which we package our research into rigidly formatted articles suitable for the printing press. This totally linear exposition of our research is frightfully limited. It is like streaming data to a serial access tape, which is to say awkward, archaic, and unnecessary.

The entry titled "Turn on, tune in, drop out" further down this page discusses this issue at length. There you can learn about my preferred solution, which is exemplified by this web site. I can publish as much as I want, and I can organize it however I want. This allows me to place Cats2D alongside the research where it belongs, instead of beneath it where Goodwin does not get fair recognition. In a stream of conventional research articles Cats2D fades into the background when in fact it is central to the work every time it is used.

### Dueling jets

Here is a preliminary look at something interesting from the Goodwin archives:

This problem is a crazy quilt of turning points. Cats2D Artworks Division would like to see more.

### Cogito, ergo sum

Someone at Quora asked "How do you judge an experienced C programmer by only five questions?" Frankly, it's a dumb premise, but I played along and read the responses, many of which offered more than five questions (I was surprised at the number of respondents who casually violated this stipulation without comment). I found myself unable to answer most of these questions. I couldn't even recognize what was being asked in many of them. Apparently I wasn't a C programmer at all, just a clueless hack posing as one.

This didn't seem right to me, especially with all the bragging and boasting about Cats2D that goes on around here. It's fast, efficient, robust, versatile, extensible, and more. It is written in C, and only experienced programmers could have written it. So what is it about my C programming that is so different from the C programmers who wrote these questions?

I quickly recognized that many of these questions related to things that the C language was developed to abstract away from our view. It turns out that many people who gravitated to this question were embedded device programmers. This is a very distinct branch of programming in which a code interacts directly with hardware. All sorts of unusual requirements apply, demanding the most exotic elements of the C language, and a good understanding of the hardware too. These programmers do things and worry about things that have nothing to do with C programming per se.

But I also noticed in these questions some cultural characteristics common to the programming world at large, in that they were often painfully parochial, and not infrequently more concerned with parlor tricks than real world programming. In my world of scientific application programming, if you are using the longjmp command, fiddling with bits, or nesting unions, then you are doing something wrong.

Here are my hypothetical responses to some of the questions I saw:

Q: Can you write two functions in which one executes before main and the other executes after main?

A: No.

Q: Suggest techniques to accelerate IPC between two threads and guess the throughput per second.

A: No.

Q: Swap two variables without using a third variable.

A: Why?

Programming can mean so many different things. It is easy to find good advice on basic programming techniques. It is much harder to find good advice on software architecture or development strategies for large applications. Much of this advice suffers the same parochialism as the questions I talk about above, and does not apply well to scientific application programming.

An example of such advice is given by a popular author who recommends that functions be limited to twenty lines and do only one thing. Abiding by these restrictions would wreck a code like Cats2D. In a scientific application of its size and complexity, functions that are several hundred lines long are often needed to control execution flow among a large number of branches. For example, the function that controls assembly of the residuals in Cats2D is over 700 lines long because of large switch statements over the field equations and their boundary conditions, yet it is very easy to understand, maintain, and extend. This function is the right length for its purpose. Imposing a twenty line limit is arbitrary and inflexible.

Too much other rule-based advice is like this. Someone says that a "clean" code has no loop control statements in it (for, while, etc.) because these should always be handled by library functions (e.g. BLAS) or other constructs. Other people think looping is okay, but believe that nested loops are the devil's handiwork. Imposing rules like these on a code like Cats2D is unworkable.

Unfortunately the best I can come up with on my own are vague and generic guidelines. Make your functions short whenever doing so makes sense, and make them long whenever necessary or desirable. Use loop control constructs if needed, avoid deep nesting if possible, and make exit conditions easy to understand. Organize related data into structures. Handle pointers with care, never put anything big on the stack, and clean up memory when you're done using it. Above all else, don't write spaghetti code.

Nobody finds the sweet spot among these guidelines the first time they write a useful application. Once in awhile someone is lucky enough to start a new code from scratch that anticipates all the failures encountered on the previous attempt. Much of the time, however, the existing code is refactored rather than replaced, allowing major flaws of the original implementation to linger on. One workaround after another wastes time and makes spaghetti code worse.

I don't envision classroom experience doing much to teach programmers how to get it right and avoid spaghetti code when they create an application longer than a few thousand lines. It is so easy to paint yourself into a corner, and you will do it again and again before you learn how to anticipate and avoid it. Accumulating this experience is difficult, especially on your own. In my fourth decade of programming I am still struggling to learn how to write good code. Many a time I wished for a suitable mentor but none was to be had.

I think a lack of mentorship in software development for physics-based computing is a big problem. Application codes in this area are too complicated and computing intensive to approach casually. Mastering good programming technique is essential, but not enough. The step from the classroom to working on a complicated application code is too big for most people to make on their own. Mentorship from an expert application developer could make a world of difference here. A good mentor would have saved me years of grief learning this craft.

But I have never seen any attempt to mentor academic trainees this way. They're barely taught basic programming technique, if that. There seems to be little concern about it, or even recognition that a problem exists. Yet there is significant spending on computing education by the national agencies. Classes are taught, hardware is provided, research is funded, but something is lacking in the coherence and purpose of this spending.

Things have gone on this way so long now that I see no reason to expect change, especially when no one seems to be talking about it. A big part of the problem is that many people working in scientific computing are stuck on the left side of the peak in the Dunning-Kruger curve:

I spent my first ten years climbing this peak, often called "Mount Stupid" for good reason. I spent another fifteen years wallowing in the trough of disillusionment for want of a good mentor. I am clawing my way out of the trough now, and finding a mentor no longer seems so important. It is a protégé I want, but given the general state of computing education there is no one to be found.*

* I know Goodwin seems like an obvious candidate for this prestigious position, but he is ruled out by the "under 60" age restriction I've placed on it.

### Xanadu

Cats2D is shown here harvesting the fields of knowledge for our intellectual sustenance.

I'm in the driver's seat , but my wheelman Goodwin takes it out for a spin sometimes, too.

### Starfish fight

Ever seen one? Starfish really do fight, just very slowly. Speed it up and things start looking a bit more brutal. Here is a time lapse movie made by legendary jade hunter Don Wobber showing the action. Wobber's work was instrumental in convincing biologists that starfish could exhibit social interactions.

I learned about this from Wobber himself when Lisa and I crossed paths with him thirty years ago on the central coast. We were chatting beside our vehicles at a roadside turnout after helping him carry his gear and booty up a steep trail from the rocky cove below. Of course Lisa, who is in a competition with Goodwin for biggest know-it-all ever, already knew this stuff about starfish behavior and said so.

It was the dead of winter and a ferocious storm had just blown through. The tree shown here fell during the night just fifty yards from our tent. It was loud and scary. After breakfast and the usual lolling about the campsite, we headed out for some quality beachcombing. In view of the season and the ominous weather, we expected the cove to be deserted. Looking down from the top of the cliff it did appear to be deserted at first. Then we noticed a small figure in scuba gear being thrown about by the waves. It was struggling to drag something ashore.

By the time we reached the bottom of the trail the mysterious figure had finished pulling ashore what appeared to be a small boulder attached to a float. As we took in the bizarre scene we began to notice more details: other small boulders lined up at the foot of the cliff, additional diving gear, and more ropes. This otherworldly creature in his frogman gear approached us and said "hello, how are you folks today?" It was a sixty year old man.

Understand, these were not ordinary conditions. Big Sur is a cold, dark, spooky place when it storms in winter. The water is a churning mess. It is dangerous diving, especially alone. But it was the churning that brought Wobber out, because it disturbs the bottom enough to uncover boulders of jade that may have been buried there for eons. This day he had taken in quite a haul.

We carried as much as we could up the trail and left the big pieces below. He would return later that morning with hired hands to hoist these up the cliff. A few of the larger pieces might have weighed a hundred pounds or more, and they had been hauled out of the murky depths by this larger-than-life character we had just met by chance on a deserted stretch of rocks. For our meager help he offered us the smallest of the pieces we had carried up the cliff, shown here.

Anyway, back to the starfish story. Until Wobber made his film, nobody really grasped what was going on with starfish behavior because nobody was studying them at the right time scale. Only when Wobber compressed starfish fights into a time scale more familiar to us could we visualize what was going on. Seen that way it was clear: these fellas were slugging it out for the best spot on the reef.

I am planning to post more thoughts soon on the subject of time scale and how it affects our perceptions when visualizing dynamic phenomena.

### The Prestige

Every great magic trick consists of three parts or acts.

### Eating the seed corn

I have seen very little advancement in the quality and impact of physics-based computing over the past twenty years. Computing hardware has improved dramatically and people are solving larger, more complicated problems than ever before, but the lack of genuine intellectual advancement is appalling. So many things have gone wrong I hardly know where to begin.

Commercial software for structural and continuum mechanics has advanced significantly, but remains quite expensive. Single user, single processor licenses cost tens of thousands of dollars annually, and the extra charges for multiple users and multiple processors are steep. This kind of software remains difficult to use, and setting up problems is often labor intensive. CAD, mesh generation, and post-processing often are handled by separate applications. Workflow becomes a significant issue when moving data through a chain of applications to carry out these steps.

Personally I think the commercial products are too much of a Frankenstein. Nobody has been able to streamline this kind of analysis into an intuitive, low effort workflow that can be trusted to produce reliable results. It is possible, with some effort, to carry out impressive and useful simulations with these products. It is also easily possible to generate solutions of low reliability and low utility, and that is what often happens.

Let me be clear: I am not criticizing the people who create and sell these products. It took smart people many years of hard work to build this industry. They've identified a market and found a way to serve it. But, like so many other things, these products are creatures of their environment, which in this case is burdened by a poorly trained user base. Every company in this industry expends considerable effort training their customers, making user support a large part of the high cost of these products. But it isn't enough.

Effective use of these products requires bona fide skills. Fluid mechanics is hard. It requires formal study. It takes time to absorb. Often it is nonintuitive, and subtleties abound. Throw in convective heat and mass transfer and things start to get really complicated. Everything is coupled and nonlinear. The notion that code => visualization => understanding is a straightforward path is deeply misguided.

I have reviewed many papers, primarily written by academic researchers, in which commercial products have been used badly. Problems are formulated incorrectly, physical models are chosen inappropriately, algorithms are used with bad parameter settings, and so on. And always lurking in the background are space and time discretization error, which often are far worse than casual users of these products imagine.

Two things stand out to me from all the papers I've reviewed. Call them red flags. One of these is a failure to properly use the divergence and gradient operators when writing conservation equations. Example: writing the continuity equation as grad(**u**) instead of div(**u**). Seriously. I've seen this many times and as soon as I see it I know the rest of the paper is going to be full of mistakes. The other is a near complete absence of boundary conditions. Whenever I see this I proceed from the assumption that if they don't write them down, they don't know how.

In my view people who don't know the difference between grad(**u**) and div(**u**) are unlikely to carry out or understand the basic checks to which any CFD solution should be subjected. Does it satisfy the putative boundary conditions? Does it satisfy conservation of mass locally and globally? Does it violate any other obvious physical constraints? Do the time scales seem right? Novice users often fail to notice obvious defects of this nature in the solutions they compute, even when I spot them easily in the figures they've presented.

Nothing is more telling than the abysmal presentation of data generated by these codes. The ugly and useless plots I've seen indicate there is plenty of blame to go around. Velocity vectors show nothing. They're terrible. Yet judging from the frequency of their use in papers I've reviewed and conference talks I've attended, they are the de facto standard in CFD. Why? They're easy. High quality streamlines or path lines are far more informative, but they are much harder to execute well, so they are used far less often.

A more serious, better trained user base would demand better visualization, better solution analysis, and better solution diagnostics than are generally offered. Everyone concerned would benefit. This goes along with the allied need for better trained workers to develop the physics-based simulation tools of the future. Unfortunately these things are unlikely to come about without a sea change in how scientific computing is taught and incorporated into research within our universities.

The basic problem is that faculty principal investigators have neither the time, nor the inclination, to actively engage in software application development. Those who try are usually driven out or marginalized. These aspects are delegated to trainees who are expected to educate themselves in code development. In this research paradigm, the state of knowledge of the principal investigator decays while the state of computing marches forward. The result is that senior faculty who were anointed experts long ago based on a few years of serious computing are no longer experts at all. As a consequence the intellectual scaffolding for this kind of work is worse now than it was twenty years ago, though it should be much better.

Much of computing can be self-taught, or learned with modest instruction. In contrast, working at the forefront of physics-based computing requires years of experience, a sound mentorship, and ongoing professional development. Staying close to the software is essential. These skills aren't like riding a bike. Stop using them and your entire intellectual portfolio in computing will atrophy within a few years.

*Armchair generals.*... I realize what I'm saying here is harsh. But let me come at it a different way. Consider Taylor or Batchelor, for example. They used classical methods in mathematics to solve many interesting problems in fluid mechanics. Did they ever subsequently lose their ability to solve any of these problems after they had solved them once? Did Van Dyke forget how to apply perturbation methods, or did Acrivos forget how to match asymptotic expansions? I bet not. But it is perfectly normal that a faculty principle investigator is unable to reproduce computational solutions obtained by a trainee because of a lack of relevant computing skills. When that trainee departs, or soon after, the PI loses the ability to reproduce those solutions, at least not without great effort.

I submit that this situation is pathological. If someone purports to be a computational physics expert of some sort, that person ought to maintain the ability to reproduce previous analyses as well as conduct new, novel analyses. In my view of professional development, the acquisition of such abilities should be cumulative and progressive, not sporadic and ephemeral. Physics-based computing is still a new area, full of surprises and unforeseen challenges, carried out on a constantly evolving playfield of hardware advances. Mastery is never achieved, it is only pursued. Stopping the chase guarantees a rapid decline in ability to lead.

### A model of stability

Been married thirty years today. Same woman, too. But here is something to think about: I met my wife the *same month* that I met Goodwin in 1985. How weird is that? What a couple of characters they are, too. Here is a grainy surveillance photo of Goodwin lurking at our wedding in 1988.

### Divine provenance

Let's talk about who, what, where, when.

I have twenty five versions of Cats2D archived between 1994 and 2014. I've been trolling through the oldest ones. Wow did we pound the keyboard in those days. The 1994 version was already 80,000 lines long. We called it Charisma back then. In 1995 Goodwin put an online user manual of sorts on the internet, when he was a visiting professor at ESPCI Paris. It is no longer there, but I found it stashed on my machine and have resurrected it here (this stuff is so old it is hand coded in HTML 1.0). Compare the 1994 main menu to the 2014 version. They're nearly identical. In fact, *all *of the menus are nearly identical.

Before 1994 we just called it The Code, but we needed something more serious so Goodwin named it Charisma and I splattered copyright notices all over everything. The first paper I ever wrote on crystal growth modeling cites the Charisma user manual. But after writing that paper in 1994 I quit working on the code. Goodwin toiled away for another year before he quit working on it too. Charisma lay stagnant for the next five years, and, except for a few small consulting jobs and side projects, it remained unused while I developed parallel codes for experimental machines such as the CM-5 and T3E that are now obsolete.

In 2000 a few students began to use the code and in 2002 I started working on a new user manual. Unhappy with the Charisma name, I renamed it Cats2D, and from then on it was cited that way in all our publications. In 2003 I began distributing the new user manual online at a web site since taken down. But we can use the marvelous Wayback Machine to visit this page as it was cached by a web crawler on February 22, 2003.

The digital traces of Charisma/Cats2D can be found elsewhere, too.

### Just sayin'

Two years ago I was contacted by a Cats2D user who ran into convergence problems trying to simulate a two-phase flow with a flat gas-liquid interface. Since I'm starting to work on a similar problem myself I will explain here how it is done properly. This is a situation where imposing a simplifying assumption introduces a pathology in the problem formulation that is not immediately obvious. It is also a situation where inferior numerical methods are likely to mask the pathology, which is always a dangerous state of affairs in physics modeling. Fortunately the numerical methods used by Cats2D give a strong signal that something has gone wrong when this happens.

In general the shape of the gas-liquid interface at equilibrium is determined by a balance of forces acting on it. Surface tension and hydrostatic pressure are always important, but any effect that induces a force at the interface can be important, e.g. viscous stresses and electromagnetic forces. But to a reasonable approximation the surface of a liquid in a large container is held nearly level by gravity, except at the container wall where there may be a small capillary rise.

Imposing a flat interface in lieu of computing its shape greatly simplifies the problem, since all the difficulties associated with a deforming mesh go away. But we can only do this by neglecting the normal force balance at the interface. This is troublesome because the pressure field is not continuous at a gas-liquid interface. Within the model formulation, the pressure jump affects only the normal force balance at the interface, and becomes unspecified when we drop this equation from the model. This decouples the pressure between gas and liquid.

So what's the problem here? We imposed a flat interface and we know that the pressure jump is zero for a flat interface. It seems so obvious. But the model formulation does not "know" anything. When the normal force balance at the interface is neglected in the model, pressure in the gas may be shifted by any constant relative to the pressure in the liquid and the solution will remain valid. This means that the "simplified" model admits infinite solutions even though the solution to the original model is unique.

Expressed in mathematical terms, dropping the normal force balance introduces a rank deficiency in the stiffness matrix. The system of equations is consistent but underspecified. The behavior is just like that observed in a closed flow for which it is necessary to constrain the pressure level to select a unique solution. This is usually done by setting the pressure to zero at a single point, which is sometimes referred to as setting a pressure datum. Since pressure has become decoupled in our simplified model, it becomes necessary to specify a pressure datum in both the liquid and the gas. Imposing a second datum seems unnatural but is needed to correct the rank deficiency.

Cats2D automatically imposes a pressure datum under normal circumstances when it detects a closed flow, but it fails to consider this pathological scenario. I could and probably should fix the code to detect it, but it might be a bit trickier than it seems at first glance. The good news is that the code will fail to converge to a solution in a manner that is easy to recognize by the constant drift of the pressure level from one iteration to the next. There is of course some responsibility on the part of the user to understand these things and remain vigilant at all times.

### Code Talk with Foo and Bar

I've been thinking of running an occasional feature inspired by Car Talk in which Goodwin and I engage in some humorous banter about the many ways to terminate a linked list, when to use a register variable, when *not *to use the break statement, and especially *TV2++ = -T2* *TV3++ + T1* *TV4++. Hilarious, no? But there is never enough time in the day for everything.

Foo and Bar are known in programming circles as *metasyntactic variable*s, which are placeholders for actual names, in this case our surnames. I'm not sure who is Foo and who is Bar, though. We should probably ask our wives to sort it out for us, but to prevent a segfault we'll need to stipulate that only one of us can be Foo and only one of us can be Bar at any one time.

The picture on the left shows Goodwin's handiwork with needle and thread. Thorough, careful, functional, there is even an aesthetic of a sort there. Never watch him coil a rope or let him watch you coil a rope. You'll feel terrible either way. These qualities relate well to his style of programming. My work is better exemplified by the picture on the right, in which I am struggling to survive in a giant chaotic landscape. Whereas I merely avoid bad programming, Goodwin seeks good programming.

When Goodwin first started on Cats2D, long before it had a name, he decided it needed its own linear equation solver. This was unusual and I told him so. Why reinvent the wheel? This is 1991. No one writes their own solver any more. He explained that a linear equation solver was too important to leave to someone else. It had to be fast, reliable, and portable. He achieved these things by constructing it in the most meticulous fashion, like that needlework in the picture. His solver was better than the other solvers. It justified itself in the end. We used it for almost a quarter century before he wrote a better one.

Goodwin's not-invented-here syndrome has proven a good thing. We've ended up writing practically everything from scratch, but we always know what we are getting. The code remains highly portable because it depends on so few external resources. We can pop under the hood to fix a bug, or refactor the code whenever experience suggests a better way to do things. These things increase the utility of the code and decrease the effort to maintain and expand it. The initial high cost has been well amortized over the long lifetime of the code.

There are intangibles to consider, too. Because there are only two of us, we always know who to blame for the bugs: the other guy.

### Bread crumbs

A few years ago I was contacted by someone who insisted the code was unable to properly solve a particular problem. The user had let his support contract expire, but I explained to him I was certain the code did solve this problem correctly if the instructions in the user manual were followed. I don't know if he ever figured it out, but I am going to briefly explain it here.

The problem is to model the Gibbs-Thomson effect at a solid-liquid interface, which relates the local melting temperature to interface curvature. A write up of an original technique I developed to incorporate this phenomenon into a finite element discretization of the Stefan problem is found here. When solving this type of problem it might be desirable on physical grounds to impose a value on the geometric angle at which the melt-crystal interface intersects the ampoule wall. Typically this is intended to represent a wetting angle determined by surface energetics at the triple phase line. There is an analogy here to the static contact angle of a liquid wetting a solid surface.

So I ran the following tests. First I solved the problem without the Gibbs-Thomson effect. The thermal conductivities are 1, 2, and 3 W/m K in the crystal, melt, and ampoule, respectively, which are representative of cadmium telluride in a quartz ampoule. Results after some period of time integration are shown below. The mesh and interface shape are shown on the left and the planar melting point isotherm is shown on the right. The jump in conductivity at the melt-crystal interface causes the temperature isotherms to bend there. Without the Gibbs-Thomson effect the growth interface coincides with the isotherm. Ordinarily we do not specify the angle at which the interface meets the ampoule wall. Instead we assume that the interface simply follows the melting point isotherm all the way to the wall.

Next I added the Gibbs-Thomson effect, which lowers the melting temperature wherever the interface is convex, and raises it wherever the interface is concave. The results are shown below. Since the interface is concave everywhere in this problem, the melting temperature is raised everywhere, but more so near the wall than the center. The interface no longer coincides with the planar melting point isotherm. It has flattened somewhat to reduce its curvature. But it still obeys the Gibbs-Thomson relationship up to the ampoule wall.

This approach ignores the energetics at the triple phase line, which can cause the interface to bend sharply away near to the wall. In the language of differential equations, it constitutes an outer solution to the interface shape. The inner solution is determined by minimizing the surface energy at the triple phase line, which depends on the ampoule material and the condition of its surface. This inner solution is often represented by a phenomenological wetting angle found experimentally for a specific combination of ampoule and crystal materials. Setting this angle to 60 degrees gives the results shown below when the Gibbs-Thomson effect is neglected. The interface follows the isotherm except close to the wall where it turns sharply downward to satisfy the wetting angle condition.

Next are shown results for a 60 degree wetting angle with the Gibbs-Thompson effect added. The deviation between interface and isotherm grows larger to reduce the interface curvature.

The mesh is generally well behaved in all these results, even after time integration with substantial mesh motion. But the mesh has distorted slightly in the ampoule wall, which might bother a fussy analyst. Should this happen it is possible to add a control along the red line to force it to remain horizontal, as shown here:

Obviously the user made a mistake when he insisted that Cats2D could not solve this problem correctly, even after I assured him it could. This is where reading the user manual comes in. Page 130 would be a good place to look, although in this case a strong argument could be made that starting on page 1 is even better.

### Lying in the weeds

It has been all things temporal around here lately. It started when I was doing some routine calculations of segregation in a vertical Bridgman system and found that a time-dependent oscillatory flow (a limit cycle) was lurking behind what appeared to be a perfectly normal quasi-steady flow. Let me explain what I mean by quasi-steady. In a batch system like vertical Bridgman, the geometry constantly changes as the melt is depleted, so system behavior is inherently time-dependent. But the growth rate is typically very slow, less than 1 cm/hr, whereas the flow can adjust on a time scale of a few seconds and the temperature on a time scale of a few minutes. Because the other fields respond so rapidly compared to the change in geometry, the time derivatives of these fields remain small and the solution at each time step very nearly satisfies the steady-state equations.

When this situation prevails the size of the time step is determined by the rate of change of geometry, which happens so slowly and smoothly that step sizes up to 20 minutes give very accurate results. I've tested this using the variable time stepping algorithm of Cats2D with three different error targets. In all cases the initial time step was 300 seconds. The plot on the left shows the time step size versus integration time for targets of 10^{-3} (green curve), 10^{-4} (red curve), and 10^{-5} (blue curve). The plot on the right shows the evolution of the average flow velocity over the course of the integration. The time step remains large and the time integration completely fails to identify the limit cycle, yet the average velocity is accurately calculated, especially for the smaller targets.

Gresho, Lee, and Sani (1979) recommend a target of 10^{-3} (0.1%) for the relative norm of the truncation error at each time step. This measure might seem small, but it represents only the local accuracy of the time step, not the cumulative error of the time integration. It was recommended at a time when computers were several orders of magnitude slower and coarse solutions were accepted as the best available. The default value used by Cats2D is 10^{-5}, which reduces the time step by a factor of 4.6 compared to 10^{-3}. Yet even this value failed to detect the limit cycle, because the time integration steps over the unstable period in the first hour of growth before the instability can grow.

To prevent this from happening, a smaller initial time step is needed. Reducing it from 300 seconds to 30 seconds has the effect shown below. Now the instability develops for error targets of 10^{-4} (red curve) and 10^{-5} (blue curve), but it takes longer to develop for the larger value, as seen in the plot on the right. At 10^{-3} the time step still grows too fast to detect the instability and steps over it.

Here are results for initial step size reduced to 10 seconds. Now the instability develops for all three error targets, but at 10^{-3} it develops more slowly, and is also not very accurate. One obvious conclusion is that 10^{-3} is simply too large for accurate work. But it is also apparent that simply using a smaller error target is not enough to guarantee capturing the true dynamics of a problem.

The situation here is problematic. An initial step size of 300 seconds is perfectly reasonable, even a bit on the conservative side, for this problem, but to detect and accurately resolve the limit cycle requires a step size under 3 seconds. It demonstrates the constant need for care and vigilance when studying nonlinear systems.

### Time marches on

I have written some new notes that explain which time integration methods are used in Cats2D, and how these methods behave. This is a topic I never found the time to write about in the user manual, so eventually this material will end up there. The notes are an outgrowth of my recent work reported below on an oscillatory flow that emerges during time integration of a vertical Bridgman system. As part of that work I ran some tests comparing backward Euler integration to the trapezoid rule, which is the default method in Cats2D. This got me interested in the second order backward difference formula (BDF2), a popular alternative to the trapezoid rule for solving stiff equations. So I installed it in the code and ran some tests comparing it to the trapezoid rule.

My first task was to use Richardson extrapolation to estimate the exact solution. I did this using solutions computed by the trapezoid rule at step sizes of 0.1 and 0.05 seconds. The error estimated from the difference between this extrapolated solution and a computed solution is shown below for step size of 0.25 seconds (the blue curve) and 0.05 seconds (the red curve).

.

For convenience I have scaled the error by a flow magnitude of 1 mm/s. The time average of the actual magnitude is slightly smaller than this, around 0.9 mm/s, so the raw error is nearly equal to the percent relative error. The maximum value of the blue curve is 0.0072, about 0.8% relative error, and the maximum value of the red curve is 0.00027, about 0.03% relative error. The time steps differ by a factor of five, so the error should be 25 times smaller for the red curve, and these numbers bear this out, giving a ratio of 27.

This next plot shows how BDF2 (the blue curve) compares to the trapezoid rule (the red curve). A step size of 0.25 seconds was used in both cases:

The BDF2 error is four times larger, as predicted by theory. To achieve a comparable accuracy to TR it is necessary to halve the step size to 0.125 seconds for BDF2. When this is done the error is nearly identical, such that the blue curve cannot be seen beneath the red curve, at a cost of doubling the number of time steps.

In my work I want this kind of error to be small, no more than a percent or two. For this problem, the trapezoid rule with a step size of 0.25 seconds satisfies this desire. There isn't much value in reducing the error any further, so a smaller step size is overkill. Obviously there is not any advantage to using BDF2 for this problem, but I have found a completely different problem where BDF2 appears to give the correct result and TR does not. I will report on this result when I understand it better.

### Keep calm and carry on

Goodwin has been bugging me lately to add new material to the web site, or the "public face" of Cats2D, as he calls it. According to him the place is stale and neglected, verging on a ghost town. What a bunch of gaslighting. The site is just catching its breath after a prolonged run of new results, fruitful ideas, and insightful commentary. He should read the entire site from end to end before he complains again. In the meantime, everyone else can read this interesting page featuring my work from the early 1990s.

Thank goodness I'm in charge around here.

### Shaka, when the walls fell

Personal computers became widely available in the 1980s, around the time I started programming. These clunky boxes with their command line interface were mysterious and intimidating to most people. The learning curve was immediate: you couldn't do a thing without learning and memorizing all sorts of obscure commands. Using one was more likely to make someone feel dumb than smart. Even after Apple introduced a more approachable graphical interface, computers were not particularly easy to use.

Some interesting things were done with computers in engineering simulation in the 1980s. Several commercial CFD codes were launched, including Fidap and Fluent. Some researchers, many of them graduate students, succeeded at writing single use codes to solve significant problems. A superficial feeling spread in the community that problem complexity was limited more by available computing power than anything else. People were frequently saying things like "the extension to three dimensions is obvious" without having the slightest idea how difficult it would prove in practice to write, or even conceptualize, a framework for studying 3D problems of arbitrary geometry.

Steve Jobs' drive to bring personal computing to the masses finally succeeded with the introduction of the iMac. By then everyone in the industry had learned much about user behavior that had gone into improving interfaces for the OS and user applications. Hardware was cheap and reliable. Pretty soon Grandma and Uncle Joe were shooting off email and zipping around the internet like pros. Computers were now usable enough that you were more likely to feel smart than dumb using one. It took twenty years to reach this crossover point.

Then something happened that made a lot of sense. Engineering departments stopped teaching the workhorse programming languages. The emphasis shifted to spreadsheets, interpreted languages, toolkits, and simulation applications. This is how engineering is mostly done nowadays. Unfortunately for the small group of people interested in developing physics-based simulation applications, this curricular change de-emphasized the core languages at a time when architectural changes in computers, particularly the introduction of parallelism, was making application development for high performance computing much harder than ever before. The results have been devastating to the progress of physics-based computing.

One of the greatest impediments to reform is a near-universal cultural understanding of computer programming as the quick and easy province of the young. Legends abound of self-taught savants writing brilliant code by their teen years. Most of the time this is a Rubik's cube sort of brilliance, in which a few skills are honed to perfection on some narrow, easily comprehended target. Much of code development at a professional level is unlike this. Read what prolific Quora contributor and software engineer Terry Lambert has to say about computer science education nowadays. The situation is every bit as bad in engineering and other STEM disciplines.

In future entries I will continue to explore how things got so far off the rails and why I think the problem can't be fixed.

### Don't touch my eigenstuff

Eigen- is a great prefix. Usually it goes in front of *vector*, or *value*, but it can be put in front of just about anything. Take eigensolver, for example. It says exactly what we want it to say. A prolific commenter at a national news site has adopted the nom de plume Eigenvector. I wish I had thought of that.

Speaking of which, Goodwin, in one of his worst ideas ever, suggested that I adopt a nom de plume for this web site. I came up with one, and it was Andrew #!@$%^& Yeckel. He never mentioned it again.

I've posted four entries recently on the subject of eigenanalysis, including an interesting story about a Hopf bifurcation to an unstable, periodic flow (a limit cycle) that I found in a vertical Bridgman system operated in a destabilizing temperature gradient. I've put together some notes on linear stability analysis and the eigenproblem to help the non-expert understand these results better. Eventually I will collate these materials into an article for my research topics page. Things there are a mess now, but I'm just getting started.

### Olly olly oxen free

In my "Breaking bad" entry posted below I presented observations of an oscillatory flow that emerges during time integration of a vertical Bridgman system. My follow up entry "The usual suspects" analyzed this flow using linear stability analysis to confirm the occurrence of a Hopf bifurcation leading to a limit cycle. Finding the leading eigenmode was not particularly difficult, but only because I already knew it was there, and knew its frequency too. Thus I was able to quickly find a suitable shift far enough to the right of the imaginary axis to locate the desired eigenmode. Looking blindly would have been more difficult, and I might not have realized to look at all, had I not already stumbled onto the limit cycle by direct time integration of the nonlinear equations.

Generally it is impossible to compute the complete spectrum of eigenvalues for the large matrices we ordinarily use in CFD. In principle there are as many eigenvalues as degrees of freedom in the discretization, so we are computing only a small fraction of a percent of them. Thus it is vital to search in the right places by adjusting the shift thoughtfully.

Let's start striking the piñata. Eigenvalues found using four different shifts are shown in the diagram below. The shift is shown by the open symbol of a type, and the eigenvalues found using that shift are shown by the filled symbols of the same type. What we find depends entirely on where we look.

The blue circles are found using zero shift. These are clustered near the origin, but we aren't particularly interested in them. They are slow decaying non-oscillatory modes associated with latent heat released at the growth interface and are not the cause of any instabilities in this system. Even though they are close to the imaginary axis, they are not a threat to cross it as long as the temperature gradient at the growth interface remains large enough to maintain morphological stability. The decay time of these modes ranges from about 10 minutes to 2 hours.

The green triangles are found using a shift on the imaginary axis at 0.1i. This shift mostly picks up modes with a decay time around 10 seconds, a few of which oscillate slowly with periods of hundreds of seconds. These modes are associated with thermal transport. The blue diamonds and orange squares are found using higher frequency shifts of 0.2i and 0.5i, respectively. These shifts reveal numerous modes decaying in 3 to 10 seconds and oscillating with periods ranging from about 10 seconds to 2 minutes. These appear to be predominantly hydrodynamic modes.

Turning off latent heat eliminates the slowest decaying modes while leaving the rest of the spectrum undisturbed, showing these modes do not interact much with the thermal transport and hydrodynamic modes. When this is done the longest decay time is reduced from about 9000 seconds to 68 seconds, a time characteristic of thermal transport.

In all this exploration we have not found the unstable complex pair of eigenvalues because we have used a shift to the left of this pair on the real axis. Repeating these tests with the real part of the shift set to 0.05, just slightly to the right of the known position of the leading eigenvalue, shows that an imaginary shift of 0.1i finds it, but the other values tested do not.

The leading mode is a narrow target to find and I might have missed it altogether were it not for my discovery of a limit cycle during time integration of the nonlinear equations. And I missed that at first, too, until I started playing around with the time step as described in the "Breaking bad" entry below. This is somewhat disturbing, and it's a reminder of the hazards inherent to analyzing nonlinear equations.

Besides nonlinearity, what also makes this problem challenging to analyze is the large separation of time scales of the various physical phenomena that are involved. Interface motion occurs on a scale of hours, thermal transport occurs on a scale of minutes, and hydrodynamics occurs on a scale of seconds. A large number of eigenvalues that represent uninteresting modes are clustered near the origin, making it harder to find the unstable mode we seek.

This all reminds me of something.

Ultimately understanding the eigenspectrum is about understanding the time scales of a problem. But it seems like we need to have a good understanding of those time scales before we study the eigenspectrum. Otherwise we're just guessing where to search the waters of the complex plane.

### The problem in a nutshell

A screen shot from an online survey conducted by the world's largest scientific publisher:

It's not that I don't do any of those things that merit their own checkbox, but "Other" here feels like banishment to the Island of Misfit Toys.

### The usual suspects

My "Breaking bad" entry posted below shows an animation of an oscillatory flow that emerges during time integration of a vertical Bridgman system. This oscillation occurs when the temperature gradient is destabilizing near the top of the ampoule. Formally this oscillation is known as a limit cycle, and it occurs when an eigenvalue of the system has positive real part and non-zero imaginary part.

To study this situation further, I wheeled out our newly restored eigensolver and ran some tests. We are solving the generalized eigenproblem, and the mass matrix is indeterminate because of incompressibility, so we need to use a shift-and-invert strategy to find any useful eigenvalues. The shift is vital to the outcome of the method because it determines which eigenvalues are emphasized in the low order projection of the system onto the Arnoldi vectors.

I started by computing ten leading eigenvalues using zero shift, which are shown below left. The real part is the growth rate of the mode, and the imaginary part is its frequency. All modes have negative real part and zero imaginary part, indicating non-oscillating modes that decay with time. On the right below, results obtained using a complex shift of (0.1, 0.1i) are shown. This shift finds a complex mode with positive growth rate and frequency near 0.1 radian/s. The conjugate of this mode is found by using the conjugate shift (0.1, -0.1i). The x-axis has been expanded by a factor of sixty to show the unstable complex pair. Consequently all the decaying real modes are collapsed under a single symbol close to the origin in this plot.

For these results I instructed ARPACK to seek eigenvalues with the largest real part. Strictly speaking this means largest real part *less than the real component of the shift*. Used this way ARPACK will not detect this complex mode unless the real part of the shift is larger than the real part of the eigenvalue. The imaginary part of the shift must also be close to the imaginary part of the eigenvalue. Put another way, we need to hunt for these things if we want to find them. More on this later. For now let's consider what these results are telling us about the system.

If we consider decay time and oscillation period in seconds, essentially the inverse of the quantities plotted here, the slowest decaying real mode has a decay time of 8957 seconds (about 2.5 hours), whereas the unstable complex mode has a growth time of 21.2 seconds and a period of 55.7 seconds. This period is nearly matches the 53 second limit cycle in the animation, which was obtained by direct time integration of the nonlinear equations. Of course the mode does not grow indefinitely; nonlinear effects stabilize its amplitude. I am satisfied the unstable complex eigenmode identified in this linear analysis explains the temporal behavior exhibited by the nonlinear equations.

The slowest decaying real mode is associated with latent heat generated by motion of the growth interface. We can remove this mode by arbitrarily suppressing latent heat generation. The eigenvalues computed using zero shift are shown below left. The effect of this change is to reduce the decay time of the slowest decaying real mode from 8957 seconds to 68 seconds, which is the time scale of thermal transport. Changing the shift to (0.1, ±0.1i) finds the same unstable mode found above, unaffected by suppressing the latent heat.

We can also arbitrarily knock out contributions to the mass matrix, for example those due to the flow equations. In physical terms this implies the flow is held constant while the temperature field is perturbed. The unstable complex mode vanishes when we do this. If we knock out the energy contributions instead, leaving only the flow contributions, the unstable complex mode remains, though its period is shifted to 35 seconds. From this I conclude the instability is primarily hydrodynamic in nature, but strongly modulated by energy transport.

Next let's consider the behavior of the system when the equations are integrated holding the ampoule stationary to represent a true steady state. Below left shows what happens when integration starts from an unperturbed initial state. There is no change in the solution over the duration of the integration, which was much longer than shown in the plot. It might take a very long time for the instability to grow from numerical noise. Below right shows what happens when the initial state is perturbed by a multiple of the leading eigenmode. The solution trajectory immediately begins oscillating at the period of the mode.

Below left shows what happens when integration starts from an unperturbed initial state with the ampoule translating at 1 mm/hour. The simulation is now inherently time-dependent, essentially a batch process, but interface motion is slow enough to expect quasi-steady flow at each time step. Nevertheless, the slight deviation from a true steady state is enough to seed the instability quickly. Below right shows that perturbing the initial state by the leading eigenmode sets off the instability immediately.

The period is unaffected by interface motion over the five minutes of time integration plotted here, but over a longer period of time the ampoule moves into a more stable thermal environment in the lower part of the furnace, allowing the solution to stabilize. The plot of time step size shown in my "Breaking bad" entry below indicates that this happens after about an hour of growth. I've plotted the amplitude of the velocity versus time below to show how the oscillatory mode fades away after which the system exhibits quasi-steady behavior.

These integrations were performed using the trapezoid rule with a step size of 0.25 seconds. Here are some additional calculations I've done to ensure this step size is small enough to accurately capture the period and amplitude of the limit cycle.

When the flow stabilizes we can test whether the leading mode has crossed the imaginary axis back into the left half plane. The growth rate and period plotted below predict that the solution stabilizes after about 56 minutes, with a period of 43 seconds. The actual period obtained from the time integration, shown below right by the open circles, is 41.5 seconds at the time of stabilization. Strictly speaking the linear analysis is not valid when applied to a non-stationary solution, but the oscillation is small at 56 minutes, and has nearly vanished at 60 minutes, where the predicted period agrees nearly perfectly with the measured period.

A curvature of 2300 K/m^{2} in the parabolic temperature profile was used in these studies. We can adjust the curvature to find the point at which the initial steady state becomes unstable. The plots below show how the growth rate and period of the leading eigenmode vary as the curvature is increased from 2260 to 2330. The leading eigenmode crosses the imaginary axis at curvature equal to 2283, past which its growth rate is positive. Its period is 43.2 seconds at the crossing, rises to nearly 60 seconds, then falls slightly before solution convergence is lost entirely at curvature equal to 2332.

Beyond a curvature of 2332 only time dependent solutions exist. The "Sui generis" entry further down this page shows an animation of a strongly time dependent flow that I computed using a more sharply curved profile with curvature equal to 3000. Because no steady state exists for this profile, it was necessary to integrate the equations starting from a quiescent flow. Within a few minutes of simulated time a quasi-periodic flow emerges, but with much shorter period than here, around 12 seconds.

If we want to avoid a time dependent flow we should avoid a destabilizing thermal gradient anywhere in the melt, but what this means quantitatively depends strongly on the size of the system and the thermophysical properties of the material grown in it. The good news is we can model it using Cats2D.

### Can't stop the signal

Everything goes somewhere, and I go everywhere.

### Breaking bad

I've been putting together a study on mass transport in vertical Bridgman growth as a springboard to my forthcoming ACRT work (here at Atelier Cats2D we walk first, then run). During this study I made some interesting observations about integrating an unsteady vertical Bridgman problem that I share here.

These calculations were done using the parabolic "roll-off" type temperature profile discussed in my article on the rotating baffle vertical Bridgman method. On the left below I show two examples of this profile. The more gently curved red profile is the subject of this entry. A steady state solution computed using this profile is shown to the right. No steady state exists for the more sharply curved blue profile, which has a destabilizing gradient near the top of the melt. I will return to the blue profile later.

Below I show streamline plots after 10, 60, and 120 hours of simulated growth started from this solution. The thermophysical properties are those of cadmium telluride in a quartz ampoule of 5 cm diameter.

The results shown here were computed using a fixed time step of 60 seconds. The integration converged at every time step without any obvious growth of discretization error. I had every reason to believe these were good calculations. But here at Cats2D we sleep with one eye open, if we sleep at all, so I tried repeating the integration with a fixed time step of 30 seconds. After just a few minutes of simulated growth the computation failed to converge when the quasi-steady flow began exhibiting time-dependent behavior on a time scale much shorter than the time step. It appears that a large time step allows the integration to “step over” some fast dynamics that happen in the early part of growth.

To explore this further I repeated the integration using variable step size with error control. The plot below shows how step size adjusts to capture the flow dynamics. We learn that step size must be reduced to around 2 seconds to capture the time-dependent flow without losing convergence. But we also need to be careful about capping step size. Initializing step size to 1 second, but allowing it to grow without bound, results in the upper curve in the figure. Step size grows large enough fast enough to step over the early dynamics. Capping it at 60 seconds or less, shown by the two lower curves, is necessary to avoid doing so.

It appears the initial condition is unstable to perturbations even though it is a converged steady state. Within a few hundred time steps the flow evolves into a limit cycle with a period of 53 seconds, the topology of which is animated below.

These observations indicate that a Hopf bifurcation has occurred, which is characterized by a complex conjugate pair of eigenvalues with positive real part in the solution space.

Stay tuned to this frequency to learn more about this interesting topic.

### Honey badger

Looking over my recent entries I see that I've been neglecting Goodwin lately. It was especially thoughtless of me to not save him a slice of cherry pie. So I've put together a brief entry here to thank him for something useful he did recently.

Goodwin's areas of knowledge — knot tying, unusual tools, machines with lots of parts — are a bit offbeat. But also note the boring, utilitarian, work-like nature of these activities. Contrast that to the normal, recreational nature of my interests — classical music (go Minnesota Orchestra!), duplicate bridge, and interior decorating.

I was treasure hunting in my email archives recently when I came across this curiosity from Goodwin:

*To honor Jackson Pollock, you should rename your variables:*

if( Splat == 0 ) /* Vertical */ { Dribble = ... Drizzle = ... }

This art reference is clever and totally unexpected. But his knowledge is so idiosyncratic you never can be sure what he might or might not know. The other day he told me about an obscure 18th century English landscape architect named Lancelot "Capability" Brown. I told him this reminded me of another wonderful name, Increase Mather, to which Goodwin drolly replied that Capability was a *real* historical figure.

Well, Increase Mather was also a real historical figure, whether Goodwin believes it or not. A leading Puritan minister and political heavyweight in the colonies, Increase was also president of Harvard, and father to Cotton Mather of Salem Witch Trials fame. He wasn't just some fictional dude in a tricorn hat from Johnny Tremain.

But I digress. My point is to thank Goodwin for increasing the capability of Cats2D by repairing and modernizing its interface to ARPACK. This has allowed us to start wondering again whether and how eigen-analysis might be used in CFD to say something interesting.

Here are streamlines of the six slowest decaying eigenmodes in the lid driven cavity (Re = 1) computed by ARPACK. The second and third modes in the top row are the real and imaginary parts of complex conjugates, hence their similar appearance. The rest of the modes shown here have strictly real eigenvalues.

In principle we should be able to build a good approximation to the flow from a linear combination of these modes. Judging from their appearances I think we could do it using a small number of them. Whether it is worth doing so is unclear, however, since we already know the full solution to the flow. Computing these eigenmodes requires a factorized Jacobian, and one or more back substitutions per mode, making it easily as costly as computing the flow solution. I don't feel that I've learned anything from seeing the form of the modes that I don't already know from the flow solution, but this is a simple flow. Perhaps I will feel differently after studying some more complicated situations like melt crystal growth.

Arnoldi's method performs well for this problem, converging easily to the leading modes using a small Krylov space and without any need to shift the Jacobian before inverting it. We should expect this problem to be easy, however, since things are predominantly happening at one time and one length scale. Crystal growth and other multiphysics problems usually exhibit several time and length scales in their behavior.

Arnoldi's method does not converge so easily for such problems. A much larger vector space is needed to converge a small number of eigenvectors. To find the most relevant modes requires a certain amount of trial and error with the shift, and a complex shift is essential to find oscillatory modes associated with each time scale of the problem. Carefully exploring the space of eigenvalues requires some effort, and mostly tells us things we already know, but I'm finding it can also be thought provoking. I will report more soon on what I've learned from applying eigen-analysis to the vertical Bridgman problem.

### The Double R

I've decided this site needs more food photos. You can't go wrong with a good food picture, like this delicious cherry pie with its crisscross crust.

I showed this photo to Goodwin and he immediately started explaining to me how he wanted his portion served, with ice cream and all that. The pie had already been eaten the week before. Some people just love to jump to conclusions.

### Jackie Blue

In a recent post I made a casual comparison between my bicolor scale and Moreland's "warm-cold" scale. Here is my bicolor scale:

Here is my version of Moreland's warm-cold scale, obtained by fitting RGB data from Table 6 in his paper, and now a regular feature of Cats2D:

Earlier when I described these scales as the same type I was focused on their general red-blue aspect, without regard to how these hues vary through the middle of these scales. When I went about putting the warm-cold scale into Cats2D, I realized that it differed from my bicolor scale in a fundamental way. In the language commonly used to describe these things, my bicolor scale is a *diverging* scale, whereas the warm-cold scale is a *sequential* scale.

In this sense the warm-cold scale has more in common with the rainbow scale than it does with my diverging bicolor scale. The rainbow scale is a continuous progression from blue to green to red, whereas the warm-cold scale is a continuous progression from blue to *gray* to red (approximately). The bicolor scale, on the other hand, has a discontinuous change in hue at the midpoint, a jump from blue to red. Another way to look at it: Moreland replaces that ugly murky green with gray, whereas I just yank it out of there altogether.

Compare these lid-driven cavity stream function plots illustrated by these three color scales:

The rainbow scale shown on the left exhibits the usual nonlinear gradient artifacts in the yellow and cyan regions. The warm-cold scale shown in the center is obviously better in this regard, though the gray transition between hues remains somewhat suggestive of it. The diverging scale shown on the right suggests a steep gradient, almost a step change, making it wholly unsuitable for a plot of this type.

Besides avoiding gradient artifacts, the warm-cold scale also benefits from better color separation at the ends of the scale. This can be seen in the upper left corner of the lid driven cavity, where flow separation occurs at the wall, and also in the center of the main vortex. Adjacent color levels are much easier to distinguish in these areas in the warm-cold scale than the other scales. This can also be seen in the plots of temperature and flow topology in the vertical Bridgman system shown below. The difference between these color scales jumps out in these plots.

As with the rainbow scale, the warm-cold scale is worst at the center of the scale. This makes it problematic for situations in which a datum level such as the melt interface on the left, or separation streamlines on the right, is highly significant. The bicolor scale selectively emphasizes the datum level at the expense of color separation at the ends of the scale.

### I can see clearly now

Crystals grown from the melt are never chemically homogeneous. No matter how carefully they are grown, elemental crystals such as silicon and germanium always have impurities, and usually have intentional dopants as well. The same is true of stochiometric compounds and salts. Alloys often crystallize in a stochiometry that differs from the melt due to thermodynamic partitioning, and are usually doped as well. So we are always interested in mass transport in these systems. I show results here from my new tool that allows me to reconstruct segregation in the crystal without the need to include solid state mass transport in the model. I compare this zero diffusion approximation to non-zero values typical of small, fast diffusing dopants.

Here are some plots of concentration of a generic dopant within a crystal after 140 hours of growth at 1 mm/hr in a 2 inch diameter ampoule. I've used a partition coefficient of 0.5, which means that half of the dopant is rejected at the interface, causing it to accumulate in the melt. The leftmost plot shows the concentration recorded at the growth interface during solidification, and also the interface position recorded at different times.* This distribution will be preserved in the crystal if solid-state diffusion is sufficiently small, which is often assumed to be the case.

The other plots show the distribution computed at three non-zero diffusion coefficients: 10^{-12}, 10^{-11}, and 10^{-10} m^{2}/s. The concentration distribution is practically identical for D < 10^{-10}, which is a very fast solid state diffusion coefficient. So it should rarely be necessary to compute mass transport in the crystal for typical materials. This is good news because the numerics of solid-state diffusion are very stiff whenever concentration in the melt at the growth interface is forced out of equilibrium with concentration in the crystal, which is essentially the purpose of ACRT and other convective methods used to stir the melt.

It is notable that the interface shape does not match the concentration isocontours in any of the plots. Crystal growers often assume that these isocontours, which can be made visible in real crystals by experimental techniques, reveal the shape of the interface during growth. Obviously this assumption can be very misleading and should not be relied upon in these systems.

* The bottommost part of the crystal is not shown in the leftmost plot because it is already solid at the start of time integration and not part of the growth record.

### Leaving a mark

It's why we publish.

### Paint the sky with stars

I'm just having a little fun here. Mouse over this image to watch a line of 2043 particles flung into the heavens by a driven cavity flow. Color indicates velocity, which I've made parabolic rather than constant on the bottom surface to give a nicer picture. As always, it is based on a scrupulously accurate physical simulation.

Paths were integrated for 500 dimensionless time units, which measures the number of times the line wraps around the vortex center. The appearance at times of a reversal in direction of rotation is a stroboscopic effect caused by interaction of frame rate with motion of the particles.

The particles appear to grow disordered quickly except near the center of the vortex, but occasionally ordered features suddenly emerge. These displays of local order continue to occur after hundreds of time units of mixing. In fact the appearance of widespread disorder is deceiving. Close examination reveals that ordered lines of particles move slowly along the boundaries long after mixing has begun. Eventually these lines are sucked into the faster moving central area of the flow where they are bent around into the sort of chevron pattern observed in my earlier study on particle mixing that I discuss in the "Magic carpet ride" entry further down this page.

This is a steady flow in which particles simply circle around closed streamlines, but it is easy to watch this animation and conclude there is some sort of underlying periodic nature to the flow causing these structures to appear. But the temporal behavior observed here is an artifact of the discrete spatial distribution of particles initially released into the flow. It is especially acute because the particles were seeded in a highly inhomogeneous way, packed along a single line like that.

Seeding particles on a uniform grid throughout the flow reduces but does not eliminate this kind of artifact. Massive seeding, such that particles fill the entire picture initially, brings about another artifact, the appearance of areas in the flow that appear devoid of particles after some time of mixing. These can be seen in results of my earlier studies, for example this picture from the "Lost in the hours" entry below:

The accumulation of white space in places seems odd, but it is easy to overinterpret what we are seeing. These particles are infinitesimal objects to which we have given artificial form as visible dots to make this picture. In reality, at least as represented by this calculation, they have no mass and occupy no space. If represented precisely, this picture would be *entirely* devoid of visible particles. Because these dots cover up white space they do not occupy, and cover up each other too, white space is not conserved in these pictures, though our intuition would prefer it that way.

Perceptual difficulties seem to abound in flow visualization, lurking in spots we don't expect to find them.

### Visualizing spatial information effectively

Many possibilities exist by which to visualize spatial information generated by a CFD code such as Cats2D. Flow, for instance, can be displayed as a velocity vector plot, a streamline plot, a particle tracing, line plots on cross-sections, or inferentially by plots of its primitive scalar components (velocity and pressure) or derived quantities (vorticity, strain rate, etc.). Elsewhere at this web site I have shown several different ways velocity vectors can be presented, and how particle tracing can be used to reveal mixing patterns and display residence time in a flow.

Vector and contour plots suffer by using arrows and lines, which occupy space in the plot, to represent point quantities. Vectors can only be distinguished at low densities and are impractical for flows in which velocity varies much across the span of the domain, else the arrows will be too huge to display in some parts and too small to see in other parts. Streamline contours are better, yielding more information in less space than vectors, but also have practical limits on the density of lines that can be displayed effectively in a plot. Both kinds of plots often fail to reveal important details about the structure of the flow.

Streamline contours plotted at uniform intervals in stream function encode two of the most basic descriptors of a flow, which are path and speed. Path is indicated by the contour lines themselves, a perfect visual cue. Speed is inverse to the distance between contours, such that fast parts of the flow have a high density of contours and slow parts have a low density. This too is an intuitive cue, though not quite so strong. Something is missing, however, which is direction, namely whether fluid moves "forward" or "backward" on an open path, or recirculates clockwise or counterclockwise on a closed path.

Direction is usually indicated by arrowheads placed on the contours. This approach has the unappealing characteristic that information displayed by the arrowheads is non-local, in that our eye must seek this symbol on each contour to infer its direction. Cats2D provides these arrowheads as an option for streamline plots, but I usually leave them off because they are distracting and unaesthetic, especially during animation.

The bicolor scale allows us to use hue as a local indicator of direction that is particularly intuitive for closed flows with recirculations. I've used this scale to animate time-dependent buoyant flow in a Bridgman system shown in the entry below titled "Sui generis". Here I show a similar animation, this one of time-dependent flow forced by accelerated crucible rotation.

There is a lot going on here. The animation, 100 seconds long, is played in real time. The ampoule spins in one direction for 50 seconds, then reverses its direction and spins for 50 seconds in the opposite direction. This can be seen when the hue changes from red to blue at the boundary in the azimuthal velocity W shown on the right. We can also see how remnants of flow in the previous direction persist near the center for some tens of seconds after rotation is reversed. The Ekman time scale applies here.

The stream function shown on the left depends only on the acceleration of W and not its sign, so the final half of the animation appears periodic with the first half. Regions shaded red flow upward in the center and downward at the walls, which customarily happens just above the growth interface in Bridgman systems due to release of latent heat. The animation shows how acceleration of the crucible forces fluid outward along the growth interface against this tendency to rise at the center. The change of hue from red to blue signals this change in direction of vortex recirculation.

ACRT is challenging to understand, especially when it occurs in competition with thermal buoyancy. Few works have been published on this subject, and most of these are severely flawed in one respect or another. I don't know of any work of much quality or utility that has been published since my first study appeared in 2000. High quality computations are just the start to analyzing this fascinating and complex system. The key to unlocking its story lies in the effective visualization of its temporal behavior.

To do that I've been honing old capabilities and developing new ones in Cats2D. Using my new bicolor scale to indicate flow direction and demarcate lines of flow separation, as shown above, reveals the skeleton of a flow, so to speak. There is a clarity to these animations missing from conventional streamline or vector plots that I'm certain will be useful to understanding this system. But there is much else about this problem to be found in other representations of the data.

I've made some animations of particle mixing that are equally insightful in a different way. I will be unveiling some of these soon. And I've just completed a nice addition to the code that logs the temporal and spatial variation of chemical composition of the crystal formed at the growth interface. This allows me to reconstruct segregation in the crystal without the need to include solid state mass transport in the model, which is often unnecessary on physical grounds because the diffusion coefficient is so small, and also troublesome on numerical grounds for the same reason. More on this later too.

### Darmok and Jalad at Tanagra

Goodwin and me again. I forget who is who in this photograph. I'm showing it here to acknowledge that he had a genuinely good idea last week, to set up an I/O pipeline using Python to create a mask for seeding particle paths from an image.

Of course I had to actually implement the thing, just like I've done for every other vaguely conceived blue sky idea to come out of his enormous brain. He was so proud of this idea he felt it necessary to send emails nearly every hour to needle, cajole, insult or beg me to get it done.

But he was right and this new tool is a ton of fun. I've been rotating some new animations on the gateway page of this web site that were created using a mask to spell out Cats2D, like this one (mouse over the image to see it go):

It is worth emphasizing that all of the animations shown here are extraordinarily accurate calculations of particle motion in a flow, and all of them are played in forward time. So when you see these animations "unswirl" into a perfect image, the particle paths were first integrated from the unmixed state to the mixed state before reversing the flow and creating the animation frames. I wrote about this earlier in "The sands of time" entry found further down this page. There I showed some static images of the initial and final state, but now you can see how it unfolds during the integration.

I can imagine some practical applications of this tool, but I am far more excited to report that it has allowed me to create the perfect marriage of art and physics.*

* But see what Goodwin had to say in this inane but typical email exchange we had about it.

### Crystal blue persuasion

The rainbow color scale based on the visible light spectrum dominates scientific visualization despite its widely known flaws. Borland and Taylor describe the rainbow scale as "actively misleading" for the gradient artifacts that appear in the red to green, and green to blue transitions. The same field shown in gray scale and rainbow scale looks quite different because of these artifacts:

The field plotted here is linear, but the narrow bands of yellow and cyan in these transitions bias us to perceive large gradients in the field where none exist. This perception is encouraged by a poor separation of shades in the broad bands of the principal colors, red, green and blue. The greens are particularly murky and indistinct.

Most commercial simulation and visualization codes use some sort of modified rainbow scale to combat these artifacts. The scale can be made better by applying a nonlinear transform that compresses the greens and widens the yellow and cyan transitions. Here is something I cooked up by hand. It is far from perfect, but it is definitely an improvement:

Nevertheless, broad regions of red, green, and blue persist over which the variation of hue is slow and subtle. We can sharpen our perception of change by displaying discrete color levels instead of a continuously varying hue:

The gray scale also looks better broken into levels:

Whether it is instinctual, or conditioned by our frequent exposure to rainbow color temperature maps shown by weather forecasters, the rainbow scale works pretty well for displaying temperature contours. We commonly describe blue and green as "cool" colors and yellow and red as "warm" colors. Nevertheless, the perceptual ordering of hues is weak, and for many purposes the rainbow scale is simply wrong. Tufte shows a particularly ghastly example of its use on page 77 of his book *Visual Explanations*, where he converts a classic geographic map from its original shades of brown for land elevation and blue for ocean depth to a rainbow scale.

Whenever we are interested in displaying information relative to a datum, a two-hue diverging color scale is likely to work much better. I show here a simple bicolor scale obtained by varying saturation while holding hue and value constant in the HSV color model employed by Cats2D.

This is the scale I have been using lately to color streamline plots such as the animation of flow in a vertical Bridgman system shown in the entry just below. Compare that animation to this hideous attempt colored by the rainbow scale.

A more sophisticated approach to designing this type of color scale based on human perception of color separation is described by Moreland. His scale, copied below, doesn't look much different than mine, though I concede its superiority.

Moreland's scale clearly benefits by varying the color value as well as saturation to achieve a more gray appearance towards the middle of the scale and deeper saturation at its ends. It looks so good I just might adopt it for Cats2D. Nothing but the best for my baby!

Let's take a look at a situation where we might choose either the rainbow or bicolor scale, a plot of the temperature distribution in a Bridgman ampoule. The plot on the left below shows rainbow and bicolor scales applied to the full range of temperature in the ampoule, which is 81.5 degrees C. For the plot on the right I've reduced the range to 50 degrees C to increase the contrast in color levels near the middle of the ampoule, at the expense of saturating the color levels at the hot and cold ends.

The rainbow scale is good at emphasizing the extremes of temperature, which is helpful if we are looking for hot and cold spots. Intermediate temperatures are poorly distinguished, however, making it problematic to identify the growth interface. In fact there would be no obvious way to see the interface except that I've switched contour shading from black to white there. In contrast, the bicolor scale strongly emphasizes the interface, at the expense of muting the perception of hot and cold extremes. Reducing the color range, as shown in the plot on the right, works well with the bicolor scale, but with the rainbow scale it does little to emphasize contrast near the interface while having a decidedly unpleasant impact elsewhere.

Inertia is tremendous. Though an internet search reveals a wealth of information on this problem and what to do about it, for example this nice round-up by Eddins at MathWorks, use of the rainbow color scale remains ubiquitous. This is yet another area in scientific computing lacking development of meaningful norms for quality and significance.

### Sui generis

Time-dependent buoyant flow in a vertical Bridgman system, played at normal speed (left) and 1/10th speed (right). The flow is nearly periodic with a 12 second interval. In the regions shaded red, fluid rises near the center and sinks near the walls. In the regions shaded blue, fluid sinks near the center and rises nears the walls.

The flow evolves slowly over time as the growth interface moves upward, but only 3 microns over one cycle, so it takes many cycles to notice any evolution in the flow or change in the interval.

I'm going to have a lot more to say about this work soon, concerning both the physics of this flow and the visual presentation of it I've shown here.

### Jack Straw from Wichita

It's about how we choose to look at things.

This cropped image of an unstable rim coater flow reminds me of a desert sunrise over distant hills:

This one seems more like a sunset:

It's a Georgia O'Keefe landscape in Peter Max colors.

### Bridge over troubled water

I have a low aptitude for foreign languages. I worked in restaurant kitchens staffed with Spanish speakers where I learned the names of ingredients and dishes, and a few words of a saltier nature. I lived a short time in Germany, where I managed to learn what I call restaurant menu and street sign German. What these situations had in a common was a narrow vocabulary of a specific nature, used without grammar or composition to speak of.

In regard to spoken language, we use terms such as fluent, proficient, and conversational to rate skill. There is no precise meaning to these ratings, but they are usefully descriptive because we all know from personal experience in our native language what it means to be fluent. Someone who is merely proficient at a foreign language might well overestimate their ability and call themselves fluent, but no serious job seeker would try to pass off my rudimentary skills in Spanish or German as fluency or even proficiency. It isn't credible.

The situation with computer languages, unfortunately, seems totally unrestrained by comparison. Everyone is a programmer, nowadays, according to their resume. It is commonplace for graduate students and post-docs to list one or more of the major programming languages, C, C++, and Fortran 90 under skills, when their knowledge of these languages is hardly better than my restaurant menu German. Worse yet, the situation extends to methods commonly used to solve engineering problems. I've seen plenty of students claim experience with finite element methods, when the most they've done is run a few canned problems they don't understand on commercial software they don't understand.

Finding a good candidate to develop a complex CFD application such as Cats2D is like looking for a needle in a haystack. I've read dozens of résumés from candidates who appear nominally qualified to do this kind of work. In reality none of them could get anywhere at all without intensive training. Their graduate school education is remedial. This situation is harmful to physics-based computing.

### Never get out of the boat

Goodwin and I have a lot in common besides Cats2D. He's smart. I'm smart. He sails. I sail. And so on. We also have submarines in common. I like submarine movies, and he served on an actual submarine. It probably looked something like this:

I think we've both been feeling like this lately.

### Get off my quarterdeck

First, if you haven't seen my new article on interface shape control, please read the entry below about it. I will be adding a few more things soon to round out the story.

I'm down the rabbit hole again, this time working feverishly on a project that will be revealed in due course. When it's done I've got a few things on the agenda for Cats2D. I still need to write a few words about my new heat velocity plots, and I want to test some diverging color scales for streamline plots. The rainbow color scale, used by Cats2D, is widely criticized for representing continuous data in ways that are misleading. Goodwin installed an HSV color model in Cats2D long ago that is flexible, but I don't understand how to use it (come to think of it, this characterization applies to many things he's put into the code!). Cats2D images shouldn't just be pretty, they should maximize the information conveyed relative to the perceptual effort required to absorb it.

I also plan to return soon to modeling one of my favorite problems, the ACRT method in melt crystal growth. This problem is tough to simulate, complicated to analyze, and difficult to visualize. Simulating ACRT under realistic conditions demands fine resolution of temporal and spatial scales, particularly near the growth interface. To usefully address the influence of ACRT on morphological stability it will be necessary to accurately resolve mass transport in a thin boundary layer at the growth interface, the shape of which is intimately coupled to mass composition by the phase diagram. It's a frightening, nonlinear quagmire. In other words, it's perfect grist for Cats2D, and I'm looking forward to sinking my teeth into it.

Without any new technical accomplishments to crow about, I'm going to fill out this space with some blather about films. From its humble beginnings as a schlocky horror flick for teenagers, Jaws is now regarded as one of the greatest movies ever made. It was the highest grossing movie of all time until Star Wars came along. These summer blockbusters are still fun, and were innovative in their day, but neither would have endured were it not for the rich selection of metaphors they've given us. Here our anti-hero Quint eyes his reel for signs of movement:

A colleague of mine who happens to be a film buff once opined that everything in life could be explained by analogy to something from the Coen Brothers' films. He couldn't have made a better choice among filmmakers. But here on the Developments page I'm sticking with Jaws.

### Interface shape control in vertical Bridgman growth

As promised, I have posted the first in a series of articles I intend to write on various aspects of convective heat and mass transfer in crystal growth systems. In this article I analyze the characteristic features of Bridgman crystal growth that cause the growth interface to adopt a concave shape, then I propose some modifications intended to invert the interface to a more desirable convex shape. The modified system is a novel configuration based on a low conductivity baffle inserted into the melt close to the growth interface, illustrated here next to the conventional system. The baffle may also be rotated.

These modifications bring the system closer to the ideal scenario of an interface that is mildly convex everywhere. However, a locally concave region persists near the ampoule wall. In the article I explore how these modifications work, and discuss some ideas on how to eliminate that pesky concave region at the wall. Please read on...

### From feedstock to finished product

This is what chemical engineers do. Add value.

The zucchini weighs over 4 pounds. My wife sent me out to pick up a pair of these mutants left by someone on their front step. No street number was necessary, I could spot them half a block away. They baked just fine, although I had to remove some large seeds. Nine loaves will last us through the winter holidays.

That's my Rose Levy Beranbaum kitchen scale to the left of the zucchini. I don't usually endorse commercial products, but I am going to make an exception here. Order one now, and while you're at it order all her cookbooks too. The photographs alone are worth it.

*Addendum: *After seeing this entry Goodwin told me he wanted a loaf of zucchini bread brought over (I'm flattered), told me to "get to work" (he should have told me to "quit loafing"), and explained to me that I had actually destroyed value by mispricing my labor in the construction of these loaves (he uses a spreadsheet to determine this).

### Mystery achievement

Strictly speaking this isn't related to Cats2D, but it shows an early interest I took in flow visualization. Last summer I got a bulk email from the ACS announcing this vote on which T-shirt to print. Surprise! I made the picture shown on these shirts back in 1999 using Ensight, a commercial flow visualizer.

This flow visualization was made from a simulation of a KTP solution crystal growth system reported in Vartak, Kwon, Yeckel, and Derby (1999) Journal of Crystal Growth, vol. 210, pp. 704-718. The picture was selected for the cover of Crystal Growth & Design when it was launched in 2003, and was displayed there until 2005. A good picture can get you somewhere, it seems.

The journal's editorial staff actually wanted to use a different picture initially, one of those shown below, but copyright to these images was already owned by a different publisher.

These flow visualizations of a KDP solution crystal growth system are from Yeckel, Zhou, Dennis, and Derby (1998) Journal of Crystal Growth, vol. 191, pp. 206-224. I made them 20 years ago and I think they're still beautiful. Unfortunately the resolution is not very good, making them poorly suited to enlargement. Back then it took some serious hardware to make pictures like these, and image resolution was limited to a few megapixels.

These early 3D flow simulations were computed on the Connection Machine CM-5, using data parallel codes that were made obsolete by the disappearance of this short-lived computing paradigm. Doing anything with parallel computing back then took forever, and things haven't gotten much better today despite a lot of hype.

I have an MPI-based parallel code that I've named Cats3D in honor of Cats2D, that I wrote from scratch in 2007 and 2008. It lacks many of the bells and whistles of Cats2D, but it is quite capable nonetheless, or at least it would be if it had an effective linear equation solver. I'm way too committed to Cats2D to revisit Cats3D any time soon, but maybe in a few years Goodwin will slow down enough at his real job to meet this calling. Then we'll have to rename it T-Rex, because it's going to rule the earth.

### Touch of grey

In a few respects you might say that I'm an early adopter. I started using computers in 1979, back in the days of main frames and punch cards. By 1985 I was using a Cray supercomputer (running an early version of Fidap, no less!), and by 1994 I had my first web site. In 1999 I was the first person on my block to have a wireless home network, and in 2001 I bought a first generation iPod within weeks of its introduction. But I must be one of the last people alive to get a mobile phone. I've owned a flip phone for a few years, but I've barely used it. The edifice has cracked, however, since my wife bought me an inexpensive smart phone last week. I just sent my first text message. *Ever*.

Two things about me reliably surprise my friends and acquaintances. One is that I've never owned a mobile phone. The other is that I've never been to Yosemite.* *It doesn't matter that I've been to more than fifty national parks and monuments. *You've never been to Yosemite?!?* I guess everyone has their favorite place. I don't know if Goodwin has been to Yosemite or not. He's been to the North Pole, however. In a boat. Top that.

### Not fade away

I'm posting this picture in case anyone thinks I've forgotten about crystal growth.

Those are heat flux pathlines on the left side of the plot, expressed in terms of a *heat velocity* that I will write about soon.

### Turn on, tune in, drop out

Sixteen months have passed since I launched this developments page. I've accomplished much in that time with rarely a dull moment, which is a good thing. Unfortunately, the single page blogging approach that I've adopted here no longer is adequate to organize material at the rate I've been generating it. The page is already weighed down by more than 120 plots generated by Cats2D, despite the selective approach I've taken in presenting short vignettes intended to emphasize code development over scientific aspects of the work. There is plenty of science here, to be sure, but so far I have avoided in-depth studies of particular physical systems that interest me.

I would like to change that by occasionally posting longer pieces approximately the scope of a traditional journal article. I promise plenty of scientific depth, but in a more casual format free of the boilerplate typical of journal articles. I've grown disillusioned with the dissemination model of traditional science journals for a few reasons. I enjoy the flexibility and freedom this web site gives me to present my results on a multimedia platform. It allows unlimited use of high quality color images, animated visualizations, slide presentations and so on. Material can be edited at any time to fix errors, clarify points, add new results, and anything else that makes the story better. Best of all, it can be organized and reorganized any way I see fit.

This sort of continuous refinement can't be done in traditional publishing. If you want to add something new, you need to publish something new. Scientists end up churning out a lot of short articles based on a few results that are supplementary to work they've already published. The standard requirements of a journal article demand repetition of general background, methods, literature, etc. Some of this stuff can be off loaded by citing earlier works, but we must always say something in each of these areas for the article to stand alone as a published work. The flip side is that many things get left out of the story. The tools used by many of us, be it a sophisticated lab instrument or a code like Cats2D, have grown too complicated in this day and age to properly document in a journal article. We mostly deal with it by citing references from the literature, but sometimes the relevant information has never been published, and other times it is prohibitive to find and comb through it all.

I acknowledge that science publishers are aware of concerns like these, and now accommodate online supplementary material to articles, including multimedia. But the rigid formatting of the article itself, the process by which it is published, and the loss of copyright to the publisher, all deter me from continuing down that path. Since most of us work in one or a few topic areas for many years, gathering chapter after chapter as our knowledge of a subject grows, why not frame the story as a cohesive book? Surely many a great author has rewritten the first chapter after the last has been finished. Why lock us into each chapter of our story as it first becomes bound by publication? If we ever want to retell that part of the story better, we must write another chapter, sort of the same but sort of different, presumably without violating copyright. It can get awkward.

This dissemination model based on transferring copyright in our scientific works has created an oxymoron, the notion of self-plagiarism. Actual plagiarism is simple to understand: stealing someone else's words and ideas without crediting them for it. But how is it possible to steal our own words? In fact we can't, until we assign copyright to the publisher. We still own the idea but give up the right to publish it again using the same words to describe it. This might seem like no big deal to your typical freelance writer moving from assignment to assignment, but in science we study and write about our pet ideas for decades. If we want to modify one of our articles because we've gained a better perspective on it, it is forbidden to re-publish it because of copyright issues. There are legal ways to collate published materials by written permissions, but most of the time we craft a different article that repeats many of the same things in not quite the same way.

When we do this we are not just skirting copyright issues, we are also skirting the concern that we are duplicating publication of our research in order to gain professional advantage. This concern is taken very seriously. Publishing the same paper in two different journals (or even twice in the same journal!) is a significant ethical violation and has gotten people banned from journals. There is, in fact, a fair bit of self-plagiarism out there, because it is hard to avoid, the way things are done. Much of it is trivial (the methods section you've rewritten so many times), but a pattern of substantial self-plagiarism can raise questions, particularly if it appears calculated to pump your publication count.

So self-plagiarism is this odd umbrella that covers several different issues, none of which has anything to do with actual plagiarism. I think we should simply drop this term and call things by their true name, be it copyright violation against the publisher, professional misconduct against the employer, violation of ethical customs for scientists, or whatever. The act itself, copying our own words, only becomes an issue because there are pathologies in how we do things in science.

Transferring copyright is one of them. I realize the journals allow significant leeway in how authors may continue to share their work after publication, but the restrictions are onerous, nevertheless. Take for example the exclusive right to publish demanded by the journals. Syndicated columnists can publish their work in a hundred newspapers, reaching a vast audience, if they are popular and the papers are willing to pay for it. I'm not suggesting we encourage this in science, I know it's not customary, but there isn't anything unethical about it, either. Frankly, I see it as a business model concern for the science publishers more than anything else.

Likewise, duplicative publishing is not a problem until we strongly tie professional advancement to publishing traditional journal articles, to such an extent that metrics are often employed (basically counting). Everyone knows this encourages all sorts of minor evils, such as gift authorship, the "minimum publishing unit" approach to publishing, frivolous citations to pump impact factors, slice and dice, etc. Writing a "different" paper that isn't too different doesn't bug anyone too much anyway, because it's an easy publication. All these things lead to dilution of intellectual content in the literature, to the detriment of us all.

A useful analogy to writing code can be made here. Software applications of a non-trivial nature either get better or die. If you stop updating a code if will become obsolete as the demands of the computing world change around it. Often we do something called "refactoring", which simply means rewriting a piece of code with the intent to improve it somehow. Think of replacing the failing appliances in your kitchen and installing new cabinets. It is still a kitchen, but it works better, it looks better, and it makes you happier. We should treat our research story this way, as something to continually improve rather than merely accumulating more of it.

The best way to accomplish this is to get off the conventional publishing model and take control of the content ourselves. The internet makes this feasible. The personal research web site is common, but rarely carried out with much ambition. It should become the central expression of one's research, one stop shopping for anyone who follows your work. It should form the primary basis by which your work is evaluated for professional advancement. Cut out the fat and put in the content. Make it a multimedia freak show if possible. Issues with self-plagiarism and manipulation of publishing metrics simply would not exist under this dissemination model. A career based on duplication, repetition, and frequent padding will become much harder to disguise when everything important you've done is organized in one place.

I'm not saying we should stop publishing in the traditional format, but we should reduce how often we do it. Reserve it primarily for mature work that has already been well vetted in its internet ecosystem, and recent work worthy of a breaking news treatment. Periodicals could routinely include a roundup of hot research web sites selected by editors, with commentary added by invited peers. Standards and methods would evolve to enable bibliographic citation of material found at research web sites by persistent URL or DOI. A symbiotic relationship would emerge.

Stayed tuned.

### One by four by nine

Cats2D represents a monolithic approach to solving coupled problems in multiphysics. By this it is meant that all field equations, e.g. velocity, pressure, temperature, concentration, magnetic and charge fields, etc., are solved within Cats2D in a strongly coupled manner. This defies a modern trend towards splitting field equations between two or more specialty codes that are operated in a loosely coupled manner, which is referred to as a partitioned approach.

Writing a monolithic code requires a lot of effort, but I think it's worth it. Otherwise we give up too many things. A complete Jacobian, for example. This costs us the ability to do many useful things, or at least to do them well. Convergence of nonlinear problems will almost always be slower without it. Continuation and parameter tracking methods, which are vital to exploring the solution spaces of nonlinear problems effectively, will suffer degraded performance. Accurate and stable time integration can also be more difficult to attain. A loosely coupled code is likely to be an order of magnitude slower than a strongly coupled monolithic code that uses a complete Jacobian (see my paper on Newton-like block coupling methods for more on this topic).

### Lost in the hours

I'm intrigued by the possibility of using pathline integration to study strongly convected mass transport. The mixing studies I presented in "The sands of time" entry below correspond to a limiting case of mass Peclet number equal to infinity. In the absence of molecular diffusion there is no mass transfer between neighboring points, so each point traced by the pathline integration maintains its original concentration. There is plenty of mixing so that points of low and high concentration are brought close together, but this mixing does not extend to length scales smaller than the thickness of the striated layers formed by convective transport along the streamlines of the flow. Furthermore, points within closed recirculations do not mix with points outside.

Unless these points are actual particles with properties distinct from the liquid, molecular diffusion will eventually cause these layers to mix. Yet convective flows often yield Peclet numbers greater than 10^{5} for transport of ordinary molecules in liquid, and 10^{8} or larger for macromolecules. We expect the limiting behavior of pathline transport without diffusion to accurately describe mixing over some initial period of time in such systems. Continuum mass transport at large Peclet numbers results both in stiff time integration requiring small time steps, and sharply defined spatial oscillations requiring a fine mesh. As these requirements can become prohibitive, it would be useful to know under what circumstances it is feasible to apply pathline integration in lieu of continuum modeling.

To study this issue I compare mixing computed by pathline integration to continuum mass transport computed from a convective-diffusion model. Mixing starts from a linear concentration distribution shown on the left below. The model flow, shown on the right, is a steady state Stokes flow driven by motion of the upper and lower surfaces in opposite directions. This flow features two co-rotating recirculations enclosed by a saddle streamline in the form of a figure eight.

After two dimensionless time units, the continuum model with Peclet = 10^{5} shown below on the left strongly resembles pathline integration shown on the right. I've seeded paths at 56,481 node positions, matching the number of concentration unknowns in the continuum model. Diffusion has reduced the difference between global maximum and minimum concentration by 5% compared to the starting concentration field. The pathline integration never reduces this measure because each point retains its initial concentration, and mixing only occurs at larger length scales by the formation of ever thinner spiral arms.

The next results I show follow five time units of mixing. One time unit corresponds approximately to one recirculation of the vortexes, and each recirculation adds one spiral arm inside each vortex, and also outside them. In the pathline integration there clearly are five or so arms inside and outside of the vortexes. Spiral arms persist in the continuum model as well, but not so many as the pathline integration, nor so clearly defined. The maximum concentration difference has fallen by 10% in the continuum model. This is roughly the point at which I judge the models to start deviating significantly from each other.

The difference in models is much starker after ten time units, shown next. The pathline integration exhibits the expected number of spiral arms, but the continuum model shows only a small hint of them. The maximum concentration has fallen by 30%. The formation of the spiral arms progressively reduces the length scale of local concentration variations. The time needed to diffuse away these variations is reduced by the square of this length scale, namely a factor of one hundred for the ten arms formed here.

Initially the diffusion length is the entire span of the cavity and the rate of diffusion is small compared to convective mixing. After some number of spiral arms have been formed, the length scale over which concentration varies is small enough for diffusion to become comparatively large. We can estimate this crossover point by comparing the half width of an arm, which is inversely proportional to the integration time, to the diffusion length, which is proportional to the square root of time:

The term in parentheses in the denominator on the left is the rate of production of spiral arms, so the left side represents the cavity width divided by eight times the number of arms formed in time *t*. Two arms are formed each time unit, one inside and one outside the vortexes, and each of these introduces a concentration layer on both sides of the cavity, introducing an additional factor of four to the divisor in this expression. Scaling time by the rate at which arms are generated gives the dimensionless form:

Rearranging gives us an estimate of the dimensionless integration time, measured in number of spiral arms, at which concentration variations diffuse away faster than the arms are formed:

The simulations above show that the spiral arms have largely diffused away by the time ten of them have been formed, a bit faster than the estimate of thirteen predicted here. We expect a somewhat pessimistic prediction from this estimate anyway, because it takes no account of diffusion that happens *before* the predicted crossover time is reached. The exact number is less important than the scaling in terms of Peclet number, which is a one third power. For each factor of ten increase in Peclet, the validity of pathline integration extends further in mixing time by approximately a factor of two (ten raised to the one third power).

I have tried computing the continuum model at Pe = 10^{6}, but the mesh used here, 100x140 elements, fails to give a satisfactory result because of spurious spatial oscillations of concentration. I could probably capture the solution at this Peclet number reasonably well by doubling the mesh density, but things are getting expensive by then. Pathline integration makes a reasonable alternative for the first ten or so spiral arms formed at this Peclet number. These numbers suggest that we can reasonably begin to apply the zero diffusion limit around the same time we lose the ability to accurately resolve the continuum model at reasonable cost.

### Is that a Chihuly?

It's often useful to play around, but not everything we try works as expected. Here are some of Goodwin's crazier ideas:

Cats2D Scientific and Cats2D Artworks divisions were both unimpressed. The CEO had this to say:

*Well it proves one thing, Mr. Goodwin. It proves that you wealthy college boys don't have the education enough to admit when you're wrong.*

### The sands of time

In my previous entry on pathline integration, I showed how a massive number of particles seeded on a uniform lattice become spatially disordered over time in a recirculating cavity flow. Long integration times were studied, up to 100 time units where one time unit corresponds approximately to one recirculation around the vortex center. For such results to be meaningful, integration of the pathlines must be extremely accurate. Indeed they are, as I demonstrate at the end of this entry. First, let's look at some results of mixing a stratified liquid. I've colored 14,000 particles red or blue by initial position and subjected them to the cavity flow shown below, which features two co-rotating vortexes driven by translation of the upper and lower boundaries in opposite directions.

Below is shown the result after 50 time units of mixing. The saddle streamline makes things stand out a bit more clearly. Particles inside of the lobes of this streamline are trapped there and do not cross the horizontal midline. Outside these lobes there is mixing, but striae rich in blue or red persist. The number of striae increases linearly with time.

The same calculation is repeated using black and white particles plotted on a background colored by the stream function. This gives the mesmerizing plots shown below. The striae show up more clearly in these plots.

Here is a slightly different problem, mixing of a linear concentration profile with color ranging continuously from red to blue. If I was asked to come up with one word to describe the plots shown below, it would be *festive*. I'm not sure why, but that's what comes to mind.

It's time for a show stopper and I'm going to pull it off with a dull looking result. This is a bona fide *Goodwin challenge*, something really nasty he thought up while eating his breakfast of charcoal and sawdust one morning. He pointed out that particle mixing ought to be reversible in these flows. He didn't make a big deal out of it, he just said it quietly and left it hanging there. The gauntlet thrown down, I knew I had to take one of these mixing results at long integration time and integrate it back to the starting time after reversing the flow. The particles are seeded on a uniform lattice and even a slight jiggling of their positions is very obvious to the eye. Unwinding the mixing shown above without leaving a jumbled mess seemed like hitting a hole-in-one in a gusty breeze, fourteen thousand times in a row. But we do not ask for quarter here, so I ran this test with my fingers crossed. On the left below we see a handful of defects in the final lattice. Overlaying the saddle streamline shows that the defects occur only near the walls or on the saddle streamline, where particles lose integration accuracy turning the sharp corner near the saddle stagnation point.

Everything considered, I'm calling this test one huge success. The average distance traveled by a particle in this calculation is 1,333 distance units, nearly 100 times the long span of the cavity. The particles traverse an average of 15,700 elements at five time steps per element before returning to their starting positions.

A cultural reference seems apropos, sailing related naturally...

*It has to be more than 100 sea miles and he brings us up on his tail. That's seamanship, Mr. Goodwin. My God, that's seamanship. *

I've demonstrated fine accuracy but I haven't commented yet on timings for this tool. The pathline computation shown above required about 25 CPU minutes on my modest laptop, which seems reasonable for the large number of particles and long duration of mixing. Computing a few recirculations requires under a minute, and short paths of length equal to the element size require only 0.1 seconds. I bring up this last measure because I am thinking of a situation in which the particle trajectories are coupled to a time-dependent flow. The time step for integrating the flow is likely to be determined by a Courant condition based on element size. The numbers suggest that coupling pathline integration to a time-dependent flow will increase CPU time only slightly compared to solving the flow alone. For reference, 2.5 seconds was needed to compute the steady state Stokes flow computed here, and we can anticipate a comparable time would be needed per time step to integrate a time-dependent flow on this same mesh. This compares to a few tenths of a second for the pathline integration. Even quadrupling the pathlines to one per node is affordable to compute under these terms. This opens up some interesting possibilities.

### What the heck is this thing?

Look at what Goodwin has cooked up. It is a fixed roller submerged in a liquid film that is carried from left to right on a moving substrate, essentially the exit region of a slot coater. The animation shows what happens when a time-dependent rotation is applied to the roller that oscillates in direction from clockwise to counter-clockwise.

If you're wondering whether this odd configuration has any technological applications, who cares? It's cool, and that's what matters here at Cats2D World Headquarters. Surf's up.

### It's full of stars!

The title of this entry is a reference to one of those famous movie lines not actually spoken in the movie to which it's attributed, such as "Play it again, Sam" and the movie *Casablanca*. In this case it's *2001, A Space Odyssey*, where the line appears in the book but not the movie: “The thing’s hollow—it goes on forever—and—oh my God!—it’s full of stars!” Everyone associates this line with the movie because it marks such a dramatic scene in the book. Anyway, it's the first thing I thought of when I started seeing pictures like this coming out of the new pathline integration:

Cool. Salvador Dali meets Jackson Pollock. So I started making pictures. The gateway to this web site features some of my favorites. I hope you enjoy them. We're gonna do tee shirts next. Little Goodwin says there is a market for stuff like this among the college kids.

This picture was made by integrating pathlines for a massive number of particles and coloring them by distance traveled. What makes this picture odd is the apparent absence of pathlines for many of the particles. This happens because the particles are drawn only after all the pathlines are drawn. Pathlines are drawn over one another many times and the final background color depends on the order in which the pathlines are drawn. A red particle on a blue background happens when blue pathlines of other, slow moving particles are drawn over the pathline of the red particle. Drawn in a different order, the background might be red instead, and there are regions of red particles on a red background in the picture where this has happened.

There is nothing unphysical per se about this plot, but these algorithm-dependent artifacts make it too confusing to be much use for analyzing the flow. Reordering things doesn't help. If each particle was drawn at the same time as its pathline, we could not see any distinct particles in this picture, just a smooth variation of color that still depends on the order in which the pathlines are drawn. We've simply thrown too much paint at the canvas. Ordinarily it is best to omit drawing pathlines when massively seeding the flow with particles, as was done in the previous entry, for example. Unless we're trying to make fine art, that is.

### Tasty change of state

Goodwins brought us dessert the other evening. It looked like this:

It's a sphere of ice cream on a glazed raised torus. The form is wonderful. It brings together the two simplest three-dimensional non-homeomorphic surfaces. And there is process as well. You heat the torus first. Then you add the sphere. It's rotationally symmetric. You could almost model this thing with Cats2D.

### Magic carpet ride

This is the next entry in my series on pathline integration. The previous entry showed how a dense line of particles in an inertialess cavity flow spreads out to form a spiral arrangement that eventually transitions to a disordered distribution. When strong inertia is added to the flow, the initial line of particles persists and rotates about the center in the solid body core, but viscous effects of wall friction and flow separation in the corners eventually bring about the same general transition to disorder.

Here I take another look at this phenomenon by seeding a massive number of particles uniformly in the flow and integrating the pathlines for a long time. Shown below are the particles colored by distance traveled after 1 time unit (below left) and 10 time units (below right). Each time unit represents one recirculation of the solid body core of this flow. The appearance of order is readily evident within the primary vortex except at the outer edge of the red band, where a Chevron-like particle distribution has formed, outside of which there is disorder. As time proceeds this transition to disorder moves inward.

Shown below are the particles colored by distance traveled after 40 time units (below left) and 100 time units (below right). The order-disorder transition has progressed nearly to the center. These calculations were done on a 300x300 mesh of uniform elements, but I have done them on a much coarser 100x100 mesh and obtained very similar results. So I think this phenomenon is physical and not numerical in origin, but I defer to the mixing experts on what it actually means.

Ten thousand pathlines were integrated for up to one hundred recirculations each to make the lower right plot. When I started working on this feature I never imagined I would push it this hard, but I knew that Goodwin would. If I was to say to him "try a hundred paths" I guarantee he would try a hundred thousand. Then he would complain that smoke was coming out of his computer. Like the time he made a plot with 2,000 contour levels then complained that something minor went wrong. It's like whack-a-mole trying to prevent this stuff, so I just make it work right to begin with.

### Your brain on Cats2D

Stare into this picture for awhile.

Okay, you can return to whatever it was you were doing.

### More mesh generation

Yeah, I know, boring. I'll comment on that aspect of mesh generation at the end of this entry. In my previous entry I studied discretizing a semicircular domain in two different ways, both of which give rise to an irregular point at which three grid lines intersect. Here I look at the situation of a semicircular region embedded in a larger domain, which has this type of point too, but also introduces an irregular point at which five grid lines intersect.

The goal of these circle-in-a-square meshes is to solve problems in which the circle is a free boundary, perhaps a gas bubble in a liquid, or a liquid inclusion in a solid, that deforms under physics. We prefer a method that is robust to deformation without requiring a lot of user intervention. The tests performed here impose no internal geometric controls on the mesh region boundaries, allowing the initial circular shape to deform subject to the behavior of the EMG equations. Results shown below compare the initial mesh computed by transfinite interpolation, on the left, with the mesh computed by EMG without any residual swapping, on the the right:

The results show the same tendency to "square off" the circle at the merge boundaries observed in the previous entry for the semicircular domain, except that here the effect extends to the boundary of the circle, which is no longer constrained to retain a circular shape. At the points where five grid lines intersect, the equations give a reasonable if ugly result. The points at which three grid lines intersect present the same difficulties as before.

If residuals are swapped along the merge boundaries, but no swapping is done at the merge corners, the result shown below is acceptable, but just as before this approach is prone to failure if the mesh density changes much between regions:

The initial circular shape of the boundary is approximately maintained, which is a big improvement over the previous result, but a slight corner has bulged outwards where the circle intersects the merge boundary. This corner goes away if the residuals at the merge corners are swapped without suppressing any contributions, as seen below, but now one of the elements "folds" making the mapping singular:

This is the same problem observed in the previous entry and the preferred treatment identified there works here as well, which is to suppress one of the residual contributions at the merge corners. Doing this gives a good result that is robust to large changes in element size between regions:

Throughout these tests the points at which five grid lines intersect behave less problematically than the points at which three grid lines intersect. All three treatments give an acceptable result at these points, although the most pleasing result is produced by the preferred treatment.

Here is a test of another domain shape featuring irregular points of this type, showing an initial mesh by transfinite interpolation (below left), a mesh computed without residual swapping at the merge boundaries (below middle), and a mesh computed using residual swapping along the merge boundaries (below right):

Again, a better mesh is obtained when residuals are swapped at the merge boundaries. I think I have managed to cook up a simple scheme to handle these awkward mesh topologies without needing to impose ad hoc geometric constraints on the internal mesh boundaries. The results aren't particularly pretty but are usable. The next step is to test these schemes on a physical problem with an interface of some sort.

Putting myself through all this makes me question the utility of elliptic mesh generation. EMG is something we got locked into about 25 years ago when we started this project, and we never looked back. We have tended towards studying problems of fairly simple geometry, and we have accepted the occasional burden of manually laying out the mesh regions to suit the domain at hand. But EMG forces us to use structured grids, which greatly constrains the manner and degree to which the mesh can deform from its initial shape without becoming singular. If I could roll back the clock knowing what I do now, I would favor an unstructured method of some type, but I find the idea of retrofitting today's Cats2D too daunting to consider.

### Starship Cats2D, first boarding call...

In an entry below I mention how doing things right can lead to unforeseen payoffs. After getting the pathline integration algorithm working, I started playing around with different ways to seed them and color them and so on. Then I made a plot like the one shown below, except that it looked uneven and ragged. That did not sit right with me. The locus formed by the little balls at the ends of the pathlines should be visually smooth in a plot like this. This alerted me to two minor mistakes in the algorithm that I fixed before making this exemplary plot:

I was integrating the pathlines accurately, but I hadn't been careful enough about the termination of integration. In a variable time stepping algorithm, it is necessary to intentionally adjust the final step size to precisely hit the prescribed integration limit. I wasn't making this adjustment, so each path terminated at a slightly different time. I fixed this flaw, but the picture still did not look this nice. Then I remembered that the algorithm does not draw a path segment for every time step, only when a segment reaches a critical length linked to plot resolution, and it wasn't drawing the final partial segment. Only when I fixed this did the plot look so good as shown here.

These adjustments were small, just slightly moving the balls, but large enough to have a critical effect on visual perception. It would not have been evident to make these adjustments if the pathline integration was not highly accurate. The point I am trying to make here is that *any* careless or inaccurate aspect of the algorithm messes up the appearance of this picture. Integrating, transferring, terminating, and drawing must all be done precisely to achieve this visual perception.

Here is an example of what can be done with all this accuracy. I've computed 1000 pathlines seeded along a vertical line between the bottom of the lid-driven cavity and the center of the primary vortex. Below left shows the initial positions of the particles, and below right shows the positions of the particles after 5 dimensionless time units of integration. Were velocity to increase proportional to radius, as in solid body rotation, the particles would remain in a straight line that rotates about the vortex center. For this Stokes flow, velocity falls off with radius except towards the lid, causing the line to spread out such that particles form a spiral instead. The picture is perhaps deceiving, suggesting that individual particles follow a spiral path towards the center when in fact they remain on the streamlines in this steady flow.

The number of spiral arms increases with integration time. Below left shows the situation after 10 time units, and it appears to have approximately twice as many arms as the case shown above on the right, indicating a linear relationship between number of arms and integration time. Another interesting feature emerges, and becomes quite obvious after 100 time units, shown below right, which is a transition between order and disorder that begins in the outer region and gradually moves towards the center.

At first I thought this might be caused by accumulation of integration error, since the integration times are quite long. Tests showed, however, that this observation was quite insensitive to mesh refinement or time step size control. This is particularly true of the rate of progression of disorder towards the cavity center, strongly suggesting that the calculations have accurately captured the time scale of a physical rather than numerical phenomenon. So I think the effect is real. This is probably familiar ground to people who study vessel mixing.

In this next series I show the behavior in a strongly convected cavity flow computed at Re = 5000. This flow has a large central region nearly in solid body rotation, which tends to maintain coherence of the line of particles. Below left shows the situation after 0.5 time units and below right after 1 time units (1 time unit represents a single revolution of the core). The line of particles in the center slightly lags solid body rotation. In the outer part of the cavity viscous effects retard the flow and the particles spread out, eventually forming spiral arms and then disorder, but at a much slower rate.

Below left shows the distribution after 100 time units and below right after 400 time units. The region of order is still large after hundreds of rotations of the core.

Very interesting and it was just something I stumbled into when testing the pathline algorithm. More on this later when I push the algorithm to its limits. Until then, bon voyage.

### Meet the Cats2D team

That's me on the left and Goodwin on the right.

*"This shark, swallow you whole. Little shakin', little tenderizin', an' down you go...**it's not gonna be pleasant. I'll find him for three, but I'll catch him, and kill him, for ten.*"

### I don't like having to explain myself twice

Here is something you need to know about Goodwin, and I'm sure that Other Goodwin and Little Goodwin will agree with me about this: *Sometimes he just does not listen*. I'm talking here about the critical point finding algorithm I developed back in '92 for CFAL. There are some nice papers about this method in the documentation page of this web site. Goodwin, always a copycat, added a half wrong, partly crippled version of this algorithm to a proto version of Cats2D. I've hated it for 25 years, so I've finally fixed it.

So how does it work, and how did Goodwin screw it up? The CFAL version that I wrote used the downhill simplex method, a multi-dimensional minimization algorithm, to find the extrema of the stream function, which mark vortex centers. These are so-called critical points of the stream function, which are topological features of the flow. An incompressible flow in two dimensions has two other types of critical points, which are separation points at boundaries, and internal stagnation points, which occur at saddle points in the stream function. These can be found by seeking minima in the velocity that are also zeroes, i.e. stagnation points. By comparing these stagnation points to the vortex centers already found, we can identify the saddle points by inference.

Once these critical points are known, we can make a nice picture of the topology of a flow showing only the critical points and critical streamlines, free of extraneous detail. Here is an example of a cavity flow with left and right ends driven in opposite directions:

This approach is robust in the sense that it reliably finds and correctly identifies all the critical points across many orders of magnitude of velocity, on meshes both coarse and fine. The streamline values of the saddle and the vortexes it encloses are 4.2 X 10^{-12} and 6.8 X 10^{-12} respectively, more than ten orders of magnitude weaker than flow in the outermost vortexes. But the method is slow because the minimizations do not use any gradient information to accelerate convergence and must be repeated many times to find all the minima. We'll get to that in a minute. About now you're thinking "big deal, driven cavity blah blah, what good is this picture?" so we'll take a brief interlude to look at a critical point plot in the slot coater:

Here's the deal: without a very fine algorithm to find that interior stagnation point under the slot exit, you are likely to miss that region of trapped fluid behind the saddle. Ordinary streamline plots often fail to reveal much detail in regions of slow flow. Scriven's group studied this problem for several years without realizing this closed recirculation was present. Residence time in this region is set by the slow diffusion characteristic of liquids, and therefore is very long. This might pose problems for coating liquids that can "age" in some manner, so we would like to know of its existence.

Goodwin sent me an email one day saying he had installed a critical point plot capability in the code. I naively assumed he followed my paper on the subject, but instead of minimization he used Newton's method to directly find zeroes of the velocity field and the stream function gradient. This works great at identifying stagnation points, but terrible at finding zeroes in the stream function gradient, making it impossible to reliably determine which points are vortex centers and which are saddles.

This approach gives an unsatisfactory result because the finite element representation of the stream function has discontinuous derivatives at the element boundaries, allowing zeroes to "hide" between the elements (this is what I call a *know your methods *moment). Goodwin, unable to stay away from bad company, did something even worse, which is to use the Hessian of the stream function to identify the type of critical point. This requires evaluating *second* derivatives of the stream function, with further degradation of accuracy. It gives the wrong answer about half the time.

To make things better, I have started with Goodwin's code to find the zeroes of the velocity field, which does so fast and reliably. Then I determine the type of critical point by applying a local patch test to the stream function around each critical point to explore whether it is an extremum or a saddle. It is important to explore neighboring elements when a critical point lies on or near an element boundary, and careful numerics are needed to detect shallow extrema in slow moving areas of the flow. A particularly demanding example is this flow in a long narrow cavity with top and bottom sides driven in opposite directions:

The method is quite robust, and much faster than the CFAL version I wrote 25 years ago. The difference in stream function value between the center vortex and the saddles flanking it equals 1.5 X 10^{-11}, ten orders of magnitude smaller than the stream function itself.

Below I show an excellent demonstration of this tool. Goodwin, bored as usual, threw together this problem at the dining room table one Saturday and sent it to me. Depicted below is the cross section of a cylinder eccentrically placed and rotating within a deeply corrugated outer wall. A fluid fills the space between them. Goodwin calls this problem *Lisa Simpson's Hair*. Naturally he was writing to complain about some very minor flaw in the code, and it was apparent he was planning to make a mountain out of a molehill until I fixed it. But when I got that out of the way, it occurred to me that this problem would be a great test of the critical point finding algorithm.

Was it ever. I think there are 45 critical points in Lisa Simpson's Hair, and it did not look nearly this good at first. I used this uber-test problem to eliminate one weakness after another in the algorithm. The structure of the flow is rather intriguing, too. Saddle streamlines are nested seven deep, enclosing fifteen co-rotating vortex centers. Here are the three innermost saddle streamlines, enclosing seven vortex centers:

I'm not sure which is more amazing, these plots or Goodwin making a successful cultural reference.

### How to integrate pathlines of massless particles

I was having coffee at my morning hangout one day, thinking about a bunch of stuff I should finish (my mesh generation studies, for example), when I got pulled into the stream and carried away by a problem that has been troubling me for a long time. Why do time-integrated pathlines always look so terrible? Streamlines of a steady incompressible flow in two dimensions should either close on themselves or intersect inflow/outflow boundaries. They should not do things like spiral, which indicates failure to conserve mass, the grossest error one can commit when solving conservation laws (it is the zeroth moment, after all, and if you can't get that right, why bother).

Yet every time-integrated pathline plot I've seen of a closed, steady flow shows streamlines that spiral, or terminate unexpectedly at walls. I knew I could do better on pathlines if I put some care into it, particularly since the methods of Cats2D do a fine job of conserving mass. I started out with the modest goal of integrating a closed pathline once around the loop accurately enough that it looks like a streamline contour. Here is what I achieved. Streamline contours are shown on the left and integrated pathlines on the right for the driven cavity at Re = 1000. How did I do?

Pretty well, I think. There is a slight gap in the outermost pathline at the bottom, but otherwise the paths close nicely. The spacing is different because pathlines are computed at equal spatial intervals whereas contours are computed at equal intervals in stream function.

Whenever doing anything with the finite element method, you should always ask yourself "What is the Galerkin way?" This means doing everything consistent with your basis functions and the finite element paradigm. In this case the Galerkin way is to map the pathline equations to the parent element and carry out the integration there, rather than on the physical domain. The formalism is discussed in this brief note I've written.

Two major advantages are conferred by this mapping: (i) Adaptive time stepping becomes trivial because the solution is restricted to smooth behavior within the element by the low order basis set. Integration stiffness is taken care of simply by using an appropriate mesh to obtain an accurate solution to the flow. (ii) Element edges are defined by Xi or Eta equal to a constant, much simpler than the general expression for a line in a plane. This makes it easy to accurately determine the intersection of a path with the element edge and transfer its integration to the neighboring element. If this is not done very precisely, error in the path will rapidly accumulate, especially on fine grids.

I've put together an algorithm to compute pathlines based on fourth-order Runge Kutta time stepping within the elements. Initially I prototyped the algorithm with a forward Euler integration, but later tests revealed that RK4 was about five times as efficient at a given accuracy. The algorithm is arranged to slightly undershoot the element edge with the RK4 integration. A final small forward Euler step is taken to precisely reach the edge without round off error. Then the integration is transferred to the next element.

I've managed to do all this with sufficient care that pathlines can be integrated many times around a streamline with very little drift. To show this I've plotted pathlines colored by distance traveled, up to 40 cavity lengths, which is about 20 recirculations for all but a few outer pathlines. Below left shows the result on a 100x100 element mesh. The result is excellent except for a bit of drift in the outermost pathline. Refining the grid to 300x300 elements, shown below right, mostly eliminates this drift. The same thing can be accomplished by grading the elements on the 100x100 grid.

Plotting pathlines in lieu of streamlines doesn't make much sense in practice, although it might sometimes be advantageous to display contours at equal spatial intervals rather than at equal stream function intervals. When I started this work my immediate goal was to demonstrate that pathlines don't need to look terrible. My ultimate goal is to create a tool to animate motion of particles in a flow, which I will get to later. But in the course of this work it has became apparent that pathline integration can be exploited in several different ways to visualize a flow. Soon I will show how the ability to compute such accurate trajectories allows us to do some very interesting things that hadn't occurred to me before.

### Mesh generation is harder than it looks

The devil is in the details, as the saying goes. This is the second entry in my discussion of elliptic mesh generation on irregular topological domains. In the first entry, found down the page, I gave a brief introduction to EMG. Here I consider some common situations that arise when discretizing circular domains.

Shown here are two ways to divide a semicircular domain into quadrilateral regions for the purpose of mesh generation. The red lines show which boundaries in the computational space are merged in the physical space. The first topology shows how the the north sides of regions on the left and right can be merged with the east and west sides of the upper region. The second topology shows how the south sides of regions on the left and right can be merged with east and west sides of the lower region.

These merged boundaries don't pose any problems when the position of the nodes there are fixed by Dirichlet conditions. But this cannot be done for situations in which the domain deforms away from its initial circular shape, as will happen to a drop or bubble subjected to shear or body forces, or to an inclusion of liquid metal in a binary alloy. If we cannot rely on the EMG equations to control movement of these boundaries, we must impose ad hoc geometric constraints on them instead, an awkward situation we prefer to avoid.

Below on the left is shown an initial mesh computed by transfinite interpolation for the first topology illustrated above. Shown in the middle is the mesh computed by EMG without imposing any special treatment along the merged boundary. A close up of the intersection of three mesh regions is shown on the right. This mesh is reasonably good, but obtuse angles have formed in the elements at the intersection that suggest failure will occur should the domain deform significantly.

In this mesh topology, elements along one side of the merge have their computational coordinates Xi and Eta swapped with respect to the elements on the other side. Ordinarily contributions in the Xi computational direction are added to the X residual equation, and contributions to the Eta computational direction are added to the Y residual equation. Along the merge, however, Xi contributions on one side are added to Eta contributions from the other side for the X residual, and vice-versa for the Y-residual. This tends to maintain a square angle between lines of constant Xi and Eta on opposite sides of the merge.

We can improve the situation for this mesh topology by swapping residual contributions along the merge to pair up Xi and Eta contributions correctly. Below on the left is shown the mesh obtained by EMG with residual swapping. The Xi and Eta coordinates along the merge tend to align rather than square off, which gives a more pleasing result than seen above. Three close ups show the merge corner for three different treatments at this node. On the left is shown the preferred treatment, in which one of the residual contributions along the merge is suppressed. If residuals are swapped at this node without suppressing any contributions, one of the elements at the intersection "folds" making the mapping singular, which is shown in the middle close up. To the right is shown what happens if the swapping is not done at all. The result is acceptable here, but tests show that it tends to fail easily if the element size changes much around the merge corner.

We show next the same results repeated for the second topology. Below on the left is shown transfinite interpolation, which behaves identically on either topology. In the middle is shown EMG. The location of the merged boundaries is different in this topology, but the effect of mismatched Xi and Eta residual contributions along it is similar to before.

Swapping residuals along the merge with the same three corner treatments tested before is shown below. The preferred treatment gives a result very similar to the first topology (left close up), but the alternative treatments (middle and right close ups) behave opposite what was observed before, indicative of the lack of reliability of these treatments.

That's it for now. Next time I will introduce a more complicated situation of a circular domain embedded in a larger domain, the mesh for which has another type of irregular point in which five grid lines intersect.

### There is more than one way to look at a flow

This entry is the start of another one of my *doing things right* lecture series. The flow depicted here is a variant on the lid-driven cavity, in which the bottom surface translates to the right and the top surface translates to the left. The box is 2 cm wide and 3.6 cm deep, the fluid properties are those of water, and the translation velocity of the surfaces is 1 cm/s. The Reynolds number is 200, enough inertia to skew the vortex centers away from the vertical center line.

On the left is a typical vector plot, with the background colored by flow velocity. This is an easy way to graphically represent motion of the fluid, but is often inferior to streamlines, shown in the middle plot. It hardly matters, though. The literature abounds with low quality vector plots that are worthless and many people don't seem able to make streamline plots at all. That said, on the right I show you something new to wonder about, another kind of plot that is often done poorly to its discredit. More on this soon when I write about how *doing things right* leads to unforeseen payoffs.

### On a silver platter

Good times. We've just declared an extra beer ration for every man aboard the SS Cats2D. Internally dated tar archive on permanent read-only media.

Pretty soon we're gonna take a stroll down memory lane.

### Cats2D user manuals, past and present

It is to my disappointment that I have not yet begun to write the new Cats2D user manual. The old user manual can be found in the documentation section of this web site. But I have made up a new cover for it, shown here on the left alongside the old cover.

You can see that I've finally gotten up the courage to put Goodwin in his place.

### Coming up for air

I've been down the rabbit hole for a few months working on the mesh generation puzzle, with precious little to show for it yet. But I've been learning new things, and some improvements have come about that I am ready to share. But first I want to go through a quick primer on elliptic mesh generation (EMG).

A few entries down this page I show how a circular domain can be divided into five quadrilateral mesh regions. One of these regions is shown below on the left, divided into a 6x6 array of finite elements. The individual elements are mapped from the physical space (X,Y) to a reference space (Xi,Eta) that we call the parent element on which the equations are discretized. The region as a whole is mapped to a computational domain of parent elements, shown on the right.

Commonly the shape and size of the elements in the physical space is determined by an algebraic interpolation from the boundary discretization, such as transfinite interpolation. But for deforming boundary problems it can be desirable to apply a mapping *from* the computational space Xi,Eta to the physical space X,Y as a means to specify the deformation of elements, which makes it easy to couple mesh deformation with physics.

This can be done by defining the mesh coordinate lines (namely element boundaries) in the physical space as level curves of a pair of differential equations in the computational space. Second order elliptic equations are commonly used, hence the name EMG. The equations used in Cats2D are discussed in Numerical Grid Generation by Thompson, Warsi, and Mastin, and have been used by others, though certain aspects of our usage seem to be novel.

The equations take the form of weighted Laplacians, equivalent to diffusion equations with variable coefficients. A more formal discussion is found in this excerpt on crystal growth modeling (Yeckel and Derby, 2005).

Some examples illustrating the effect of these coefficients are shown below. Uniform coefficients give elements that are uniform in size, and variable coefficients give stretched meshes with small elements where the coefficients are small and vice-versa. Stretching can be varied in each coordinate direction independently.

These coefficients can be computed from the initial mesh regardless of how it was obtained.

The coefficients are also scaled to allow a jump in element density between neighboring regions, which preserves the spacing of the initial mesh, as shown on the left where the EMG mesh matches the initial mesh. If the coefficients are set to unity, the EMG equations will try to equalize the elements in size between the neighboring regions. If the boundary nodes are fixed by Dirichlet conditions, equalization cannot be fully attained and element shapes become distorted near the jump, as shown in the middle. If the nodes are allowed to slide along the boundary, the element size will equalize everywhere, as shown on the right.

If the elements are skewed, transfinite interpolation gives a desirable result (left) whereas EMG with natural boundary conditions over-emphasizes orthogonality (middle). Applying a shearing boundary condition can largely mitigate the problem (right), which is done automatically in Cats2D.

This is all pretty simple stuff until we try to patch together several of these quadrilateral mesh regions to form an irregular topology such as the mesh on a circular domain. I will post a new entry soon showing you some of the things I've been learning about it.

### Barrel of monkeys

Here is a fun time waster suggested by Goodwin. These are steady state simulations of a fixed volume of liquid inside a spinning cylinder, sometimes called the rim coater. The volume fraction is 0.19, rotation is counter-clockwise, and gravity points downward. Properties are approximately those of water: viscosity is 10^{-3} Pa-s, density is 10^{3} kg/m^{3}, and surface tension is 0.1 N/m. I really like round numbers, so I've set gravitational acceleration to 10 m/s^{2}. Cylinders of 1 and 2 mm radius, spinning at 10 cm/s (approximately 105 rpm), are shown first.

On the right hand side the liquid film thickens where it is carried upward against gravity, and on the left hand side the film thins where it accelerates downward with gravity. The effect is slight in the smaller cylinder, for which the Bond number is 0.1 and Reynolds number is 100. Doubling the cylinder radius increases the Bond number to 0.4 and Reynolds number to 200. At this larger size the drag induced by motion of the cylinder is not sufficient to haul all of the liquid to the top of the cylinder, therefore much of the liquid accumulates in a circulating pool near the bottom of the cylinder.

Next shown is a much larger cylinder of 1 cm radius spinning at 10 cm/s (105 rpm) and 40 cm/s (419 rpm). The Bond number is 10, sufficient to trap nearly all the liquid in the pool. The Reynolds number based on cylinder radius is 1000 and 4000 in these cases, large enough to suggest inertial instability. Indeed, at the higher rotation rate, the simple vortex in the pool has begun to spawn a second recirculation, and surface waves have developed where the film enters the pool, seen in the image on the right.

Around 574 rpm the solution becomes unstable at a turning point (saddle-node bifurcation), shown below. The surface waves have become pronounced and the film has nearly pinched off where it enters the pool. An interesting flow structure, a vortex pair connected by a saddle point in the stream function, has emerged in the pool.

Not surprisingly, the turning point is sensitive to mesh refinement. Meshes of 10x1600, 20x1600, and 10x3200 all agree closely at 574 rpm, which I am calling the converged value. A coarse mesh of 20x400 elements predicts 456 rpm, well short of this value, and meshes of 10x800 and 20x800 predict 536 rpm, still appreciably low. But note that what I call a coarse mesh here yields 108,000 unknowns, and the finest meshes have 432,000 unknowns, quite large problems in two dimensions. Fortunately it's Bambi meets Godzilla when Goodwin's solver dispenses a factorization in less than 5 seconds for the largest of them.

These simulations make a lot of sense intuitively. Before I ran them, Goodwin showed me some simulations he had obtained at high values of Reynolds number. His results looked weird because the liquid collects at the top of the cylinder, somehow suspended there against gravity. I'm showing a similar solution below that I computed for a 10 cm radius cylinder spinning at 1.4 m/s (1466 rpm), much larger and faster than anything shown above. I've reduced the surface tension by a factor of five but left the other properties the same, which is similar to a 1 centipoise silicone oil. The Bond number is 5000 and Reynolds number is 1.4 X 10^{5}.

Inertia is sufficient to hoist the liquid to the top and hold it there. Unlike the situation depicted earlier where it collects at the bottom, there is no recirculation in this pool. The liquid decelerates sharply but continues around in its entirety with the spinning cylinder. This solution is a converged steady state, but it is not stable. Time integration shows an unstable evolution of the free surface, easily set off by numerical noise, that manifests as dripping.

Capillarity and inertia both tend to keep the film uniform in thickness and curvature, and predominate in systems that are small or spinning rapidly, respectively. Either of these limits is convenient to obtain a first converged solution, since the position of the free surface is known to be a circle. I've shown here that these limits behave quite differently in terms of flow stability, however. A small system in which capillarity dominates inertia and gravity is always stable, whereas a rapidly spinning system in which inertia dominates capillarity and gravity is likely to be unstable. Completely different solution regimes can be accessed starting from these limits.

Another interesting flow regime pointed out by Goodwin resembles more a classic hydraulic jump induced by inertia than it does a stream entering a pool. To observe a jump I've reduced the volume fraction of liquid in the cylinder by a factor of four. The result shown below is for a 2 centipoise silicone oil in a 1 cm radius cylinder spinning at 0.35 m/s (366 rpm).

The jump observed here remains fixed in the laboratory reference frame, but viewed relative to the spinning cylinder it moves upstream, making it akin to a tidal bore in a river. Whether this phenomenon is closely related to the hydraulic jump of a stream flowing over a level surface is debatable. In the rim coater, gravity causes an adverse pressure gradient that results in run back downstream of the jump, and rotational forces strongly interact with the pressure field as well. There are no analogies to these effects in the classic hydraulic jump of a stream.

### The other Goodwin

Yep, there's more than one. Anyway, the other Goodwin made us dinner last night, in response to which I quote contestant Norman from the The Great British Bake Off: *I'm going to have to up my game.*

### Slot coater, part 2, plus general brag session

Here is a reproduction of a solution from a paper I wrote in 1992 working with Scriven:

In the pressure plot, white contours are positive and black contours are negative. Pressure is negative throughout the gap region and partway up the slot. Vacuum pressure must be applied to the rear free surface to hold it there. The streamline plot shows that a flow recirculation forms under the downstream die face. This recirculation appears whenever the flow rate from the slot is less than the fully developed Couette flow for a channel of width equal to the gap between slot and web. This critical flow rate yields a film thickness half that of the gap, as shown in the entry I've written above on the slot exit region.* Here the flow rate is constrained to yield a film thickness one-fifth of the gap. An adverse pressure gradient is needed in the gap that modifies the Couette flow to a mixed Poiseuille-Couette flow. This is nicely shown using the vector plot tool I report on in one of the entries below:

It would be easy to extend this problem to the non-isothermal case, including mass transfer with evaporation, to simulate drying of the film. This would require adding temperature-dependent transport properties to Cats2D, but this is a straightforward change. It would also be easy to simulate the external environment, for example the effect of blowing on drying of the film. Like I said, Cats2D is a great code for problems of this class. Tensioned web? No problem. Surfactant effects? It's been done. Electrostatic effects? On the way.

*Newtonian fluids only. Long ago I did a neat analysis showing how the critical flow rate is affected by a shear thinning fluid, but I never published it.

### Fitting a square peg into a round hole

Cats2D uses elliptic mesh generation on a structured mesh of quadrilateral regions. The user is responsible for decomposing the domain into these regions. Here is one way to decompose a circle, with a diagram showing the logical arrangement of regions in the computational space. Compass directions must be assigned to each region to define its orientation with respect to the other regions.

This aspect of Cats2D is one of its greatest weaknesses. It requires significant user effort. The best arrangement of regions is not necessarily unique or obvious, and domains with convoluted boundaries or sharp corners can be problematic.

Why am I posting this here? Why am I posting this now? Watch this space to find out.

### Hey, I just found a bug

A scary bug, too. It's been in the code since 2009 when I made a small change in the capillary-traction boundary condition for a problem I was studying at the time. The bug only affects problems in which the free boundary intersects an open flow boundary. Before 2009 the code handled this situation correctly for the planar case, but not the axisymmetric case. I fixed the code, tested the axisymmetric case, and moved on. Whoops! I didn't test the planar case, which I had inadvertently broken. Programming is that way: fix one thing, break another. I never noticed it before because this situation doesn't come up in the crystal growth problems we usually study.

Fast forward seven years. On a whim one morning I decided to put together a simulation of the slot coater. This is a coating flow problem I worked on 25 years ago in Scriven's group. Cats2D is a fantastic code for solving problems of this class, though I haven't used it that way to any great extent over the years. I figured I could throw together a simulation of the exit region of a slot coater in ten minutes or so, Cats2D being such a great code after all.

Ouch. I could not get the problem to converge. Panic set in. What is worse, the solver also started randomly failing because of (*incomprehensible technical reasons no one but Goodwin understands)*, which made it impossible to diagnose the bug in the capillary-traction condition. By evening I was able to work around the solver issue, but I still could not get the problem to converge. It wasn't until the following afternoon that I found and corrected the bug, after which the solution converged in the expected way. Below are shown the streamlines and pressure, in dimensionless units.

Upon emerging from the channel the flow rearranges from a Couette flow into a plug flow. To satisfy mass conservation, a film half the thickness of the channel width is drawn onto the moving substrate. Since the free surface must form a negative curvature to accommodate the reduction in film thickness, a negative pressure must develop in a region near the exit. This tends to reduce, or possibly eliminate altogether, the need to pump the liquid into the channel from the upstream slot. In fact these systems are often run with upstream vacuum, as I will show later when I do a simulation on a larger domain that includes the entire gap and slot region.

About that problem with the solver: I called it a solver bug, and I asked Goodwin if he could fix it. He did not like seeing the words "solver" and "bug" in the same sentence, and let me know it (I'm onboard with expressing pride in one's work, by the way). He told me it was a parallel problem, something about thread safety, semaphores, and mutexes. Then he wrote something so funny I'm still laughing about it: *You take care of it.* Then he gave me a solver update and told me it would erase my disk before crashing next time. Now I'm afraid to run my own code.

### I'm hearing voices again

Goodwin reports that he has achieved a 15% speedup of the solver. He's been bugging me to add a celebratory note about it to the web site. I could tell him to get lost and start his own web site, but I am going to indulge him to keep the peace. So just how big is a 15% speedup? Check the plot below to see for yourself. The blue line is the New Cats2D before the speedup, and the green line is the New Cats2D after the speedup.

Not too impressive, is it? The changes in timing are so small that the poor fellow must rely on statistics to discern meaningful differences in speed. These are the numbers he sent to me for the rightmost data point in the plot above:

circa Aug. 1: (Min, Max, Avg, StdDev, N) = (7.63, 9.15, 8.01, 0.22, 363)

current: (Min, Max, Avg, StdDev, N) = (6.32, 7.81, 6.54, 0.17, 920)

Here is the problem facing Goodwin. When he started with the Old Cats2D, making these improvements was like running across flat ground. But after reaching a factor of ten improvement, achieving further gains is like climbing a vertical wall of loose rock and ice: you're moving slowly, and sometimes you end up going down when you wanted to go up.

I'm thinking that Goodwin needs a new project. I've got just the thing for him: an MPI-based multifrontal solver for distributed memory computing. Then we'll be talking about the New Cats3D.

### Oh, the Thinks you can Think!

I have it on the bookshelf next to Fox in Socks. Goodwin says if you're gonna read Fox in Socks, you've got to read it real fast.

Here is a cool feature of Cats2D, periodic boundary conditions, demonstrated in a novel (and convincing) way. On the left is shown the tilted driven cavity at Re = 2500 computed using the normal geometric domain and no-slip boundary conditions around its boundaries. This solution is attained by traversing two turning points in Reynolds number, something I discuss in detail in the entry further down this page on automated arc length continuation.

On the right is shown the same solution computed on two discontiguous regions that represent left and right halves of the geometric domain. Here I have specified that the left boundary of the left region is periodic with the right boundary of the right region, and the other boundaries have no-slip conditions. This is spatially equivalent to swapping the left and right halves of the domain. This is not, strictly speaking, a periodic problem, but it shares the essential characteristic that the solution variables and their gradients must match at two boundaries that differ only by a spatial translation.

The natural organization of solution data in the finite element method makes it very easy to do this by equivalencing the degrees of freedom along the pair of boundaries. Provided the spatial positions of the nodes along these boundaries differ only by a constant, but otherwise arbitrary, translation, the finite element discretization will automatically satisfy the gradient matching condition for the variable. The outcome is a perfect match to the solution computed using the normal geometric domain.

### The perfect storm

Das Boot is the best submarine film ever made. I own the 2 1/2 hour theatrical release (don't watch), the 3 1/2 hour director's cut (do watch), and the 5 hour version originally broadcast as a six part miniseries on German television (serious fans only). I found this last version in Target not long after it became available in North America. I paid $30 for it, about the same price as two cocktails at a pretentious restaurant. The clerk was stunned that anyone would pay so much for a movie. I tried to explain its rarity and importance without much success.

The novel, by Lothar-Günther Buchheim, is even better. Five hundred pages of claustrophobic hell.

Nautical works are rich in parables and allegory. Books, film, paintings (the Dutch maritime artists are my favorites), and even the carvings that adorn the ships tell wonderful and timeless stories. I'll be making reference to some of them here on the developments page as we march towards our fate together.

### Cats2D in action

Here is a screen shot I put together illustrating some features of Cats2D (click on each element to see a magnified view). The blue terminal window in the center shows the command line menu-based interface, below it are some sample graphics generated by the post-processor for the EFG problem, and to its left is the flowctrl.txt file specifying the equations and boundary conditions of the model. Also shown are two new files I've recently added to the setup, flowcnfg.txt and flowpram.txt, which I focus on here.

At the far right is the flowcnfg.txt file, which shows 82 configuration parameters and their values. The code does not need this file to run; all its parameters can be changed in the interactive menus. The code automatically generates this file at startup if it is missing. The file provides a quick reference to each configuration parameter, showing its default option or value, and its range of options or values.

The code will not overwrite the flowcnfg.txt file at startup if it already exists. Instead the code will read the file and impose its values on the configuration parameters. This gives the user an easy way to customize the startup configuration however desired, simply by editing the values in this file. It is also possible to export the current state of the configuration to this file at any time during an interactive session, or to revert to the default configuration simply by removing and recreating the file.

At the far left is the flowpram.txt file, which shows the types and values of physical parameters active in the current problem. The code treats this file in a manner similar to the configuration file. If missing it will be created; if present it will be read. The idea is the same, to give the user an easy way to initialize the parameters at startup without having to manually enter everything via the menus. The file also provides the user a text-based summary of the parameters that is a bit more convenient to view than the LaTeX formatted tables generated by the code for publication.

### My graphics are better than your graphics

Don't take my word for it. Just have a look around this web site.

I think it is shame that many commercial engineering applications produce mediocre graphics. Even an ordinary laptop computer nowadays has enough power to display high resolution graphics files at reasonable speed. Applications that rely on screen captures to export images simply don't make good images for publishing.

We* believe that image resolution should be unhitched from screen resolution, something we accomplish by using the Adobe Postscript page description language. The device-independent nature of this language and its close relative the PDF format allows us to embed high resolution information into images with low impact on file size or rasterization speed. For example, zooming in closely reveals that vector plots are labelled with the velocity on each arrow. Here is a screen shot of a boundary layer near a wall.

The labels are generally too small to see under normal viewing of the plots, but are available should the desire to obtain quantitative information on the flow ever arise.

Another thing revealed by the high zoom level shown here is the smooth gradation of colors, free of the ugly pixelation found in plots from many commercial applications. Your Cats2D generated image can be printed at poster size and still look good from three feet away.

* the pronoun used here is a sure sign this was Goodwin's idea

### My thoughts on vector plots

Yep, I've got thoughts, and some of them pertain to vector plots.

Thought #1 – I don't like vector plots. So much so that I disabled them in the Cats2D interface on the path to deprecating them altogether. But Goodwin started complaining. Something about vector plots being "standard". And when Goodwin starts complaining about something, it is hard to get him to stop. But you learn to pick your battles. So I set off to fix up the existing vector plot feature in Cats2D, which was pretty terrible, and not coincidently written by Goodwin.

Here is a vector plot that illustrates the evolution of a flow profile at cross sections along a channel. The Reynolds number is 0, 200, and 400 from top to bottom. Cats2D makes it very easy to generate these plots, by the way. I think these plots do an excellent job illustrating the rearrangement of a plug flow as it enters a channel, and how adding inertia alters and slows this rearrangement.

This style of vector plot mostly pertains to simple flows that are nearly unidirectional, such as channel or free stream flow. There is a textbook feel to this representation, and it might be particularly useful for developing educational materials, but I don't think I will use it much in my day to day work. Here is one more example before we move on, flow in a diverging channel, just because it looks cool.

Here is a completely different situation, that of a recirculating flow in the lid-driven cavity at Reynolds number equal to 0 (top images) and 5000 (bottom images). The vector plots look pretty good, but they miss out on some of most interesting aspects of this flow. They show that the central vortex shifts downward to a more rotationally symmetric flow at high Reynolds number (closer to the inviscid limit), and that boundary layers develop near the enclosure walls. The extent of the inviscid core is fairly obvious. But they completely miss the flow separations in the lower corners, and on the upper left wall at Re = 5000. The vector plots are useful, but more as a supplement to a good streamline contour plot than in lieu of one.

I will have more to say soon about vector plots. I might even add Thought #2, because somebody who shall not be named was actually *bothered* that there was only a single enumerated thought in this section.

### Goodwin again

Recently he sent an email in which he described himself as a Border Collie and me as a sheep. Now you might think this is some sort of metaphorical description related to our respective roles in Cats2D, but you would be wrong. I've known this guy for 30 years, and I'm fairly certain he actually believes himself to be a Border Collie.

### More thoughts on vector plots

The examples I've shown above are somewhat contrived in that I've carefully selected cross sections on which to display arrows that are based on the simple structure of these flows. This approach is poorly suited to more general recirculating flows. The usual method in CFD is to distribute arrows more or less uniformly around the flow.

Probably the laziest and commonest way of doing it is to place the arrows at nodes/vertexes/centers of finite elements/volumes. Visually this is a disaster on fine meshes. The plots below show arrows at the element centers on meshes of 30x30, and 100x100, elements of uniform size distribution. Only the coarse mesh gives a reasonable picture of the flow. Selectively reducing the number of arrows on finer meshes to achieve a reasonable density is unappealing, particularly on unstructured or strongly graded meshes.

Next I compare a vector plot to a streamline plot for the EFG problem near the triple phase line. The nodal density in this mesh is about right to make a good vector plot, so I have placed the arrows at the nodes. But the kinks in the streamlines near the corners are a clear signal that the mesh is overly coarse in some areas (with Cats2D, we can rule out poor contouring as an explanation). The vector plot, on the other hand, gives little visual clue of this problem.

A better alternative is to place arrows on a uniform Cartesian grid. This assures a uniform distribution of arrows and uncouples arrow density from details of the discretization. But placement of the arrows is also uncoupled from the general nature of the flow, which can lead to undesirable visual biasing of the flow. The plot shown below appears quite different from the plot above, particularly near the free surface in the upper right corner where the mesh grading is strong. I've also shown the tilted cavity flow to emphasize the effect of a boundary that is misaligned to the Cartesian grid.

In both of these plots the artificial alignment of arrows on straight cross sections is visually distracting in regions of curved streamlines, and near boundaries misaligned to these cross sections. This got me thinking about breaking up the artificial regularity of arrow placement by somehow randomizing the distribution.

To do this I plotted arrows at the centroid of each mesh partition in the nested dissection used by the solver. Since there are several dissection levels, some control over arrow density is possible. Below I show the square cavity flow on a 100x100 element mesh. Also shown is the 10th level of the dissection used to place the arrows. The plot has an obvious visual bias suggesting that the flow meanders under the lid when it is actually quite straight there. The cure might be worse than the disease in this case.

Vector plots just don't show us very much. Only the gross characteristics of a flow can be seen, and these are represented in a way that tends to obscure signs of poor discretization. Also, arrow placement biases our perception of the flow, often to the detriment of our understanding.

On the face of it, I would appear to have wasted my time trying to please Goodwin with nothing to show for it but this Cook's Illustrated style exposition. But hope springs eternal, and it is my hope that vector plots are more useful for illustrating the flow of heat than the flow of mass. Heat flows tend to not have nearly as much fine detail as mass flows (in particular lacking the complex topologies of flow separation), and often we are only interested in the broad characteristics of these heat flows. More on this important topic later.

### Hidden feature alert

Cats2D plots are written as encapsulated postscript files that can be read and altered by any text editor. If you know a little bit of the postscript programming language you can edit these files to effect some desired change, such as adding or removing a label. Goodwin has even written a small script that automatically converts color contours to black contours in a Cats2D contour plot. Here's the tip: we take advantage of the ASCII text format of these files by echoing a complete set of parameter values, and the entire control file, to the file in the form of postscript comments. This can be quite useful whenever you cannot remember exactly which solution is shown in a plot, which happens plenty often in my experience.

### Automated arc length continuation at last

Recently I set out to automate arc length continuation in Cats2D. Manual continuation is tedious and time consuming. The code already has a fantastic algorithm for automated 1st order continuation (we'll ignore who wrote it), but a lousy algorithm for automated arc length continuation (which I am happy to point out was written by Goodwin). But first I needed a good test problem.

Naturally I chose the tilted lid driven cavity, my favorite computational test bed ever. Among its delightful features, it exhibits a series of turning points leading to multiple solutions. The solution branches plotted below were traversed completely without intervention through four turning points by the new scheme. Each symbol represents a steady state solution. The third and fourth turning points, shown in the inset, are particularly challenging to traverse without jumping solution branches.

Goodwin says this figure looks crowded and confusing. I say it looks science-y and beautiful. I *want* to puzzle over it. Just looking at it makes me feel smart. But to help him understand it better, the initial solution at Re = 0 is shown on the left, and the final solution at Re = 5000 is shown on the right. The three flow states shown above the plot coexist at Re = 2500 (left to right, on the blue, red, and green branches). The three flow states shown below the plot coexist at Re = 3250 (left to right, on the green, purple, and orange branches).

The code has always used Keller's method of pseudo-arc length continuation. To automate it I have adapted the same heuristic step size control scheme we use for 1st order continuation. In this simple method*, the step size is adjusted to keep the number of Newton iterations per steady state in a narrow range. If number of iterations is less than or equal to the minimum cutoff (default is 1), the step size is increased by a constant factor (default is 2). If it is greater than or equal to the maximum cutoff (default is 3), the step size is decreased by a constant factor (default is 4). This scheme works quite well applied to change in arc length in place of change in parameter value.

During the course of this work I tested another scheme for adjusting step size based on the change in tangent direction along the parameterized arc length. I thought it might be helpful to detect approaching turning points and preemptively reduce step size in order to minimize the number of convergence failures and recovery steps. It caused more harm than good. Turning points often appear too suddenly for this method of detection to succeed unless very small continuation steps are taken, which defeats the purpose of the test.

If you want to reproduce these calculations using your own code, the cavity vertexes are (0,0), (1,0), (1.3,1), (0.3,1) in the CCW direction. I have used a mesh of 100x100 Q2/P-1 finite elements. It took less than 6 minutes wall clock time to compute the entire arc of 206 solutions on my 3 GHz Core i7 MacBook Pro. Your code will need hours to do these calculations, if it can do them at all.

* The method is obvious, and surely reinvented many times, but we didn't pull it out of thin air. I consider it oral tradition from Scriven's group, but cannot say whether the idea originated there. We have used it on a wide range of challenging nonlinear problems, and it works really well under both full Newton and modified Newton iteration.

### Cats2D on a $5 computer is faster than your code on a $1000 laptop

For those who doubt that Goodwin exists, this entry proves it, because there is no way on earth I would ever do something like this. This is the $5 Raspberry Pi Zero computer, hooked up to a cell phone recharging battery and USB WiFi module. Goodwin has been benchmarking Cats2D on this machine, and on the more powerful Raspberry Pi 2, which costs the princely sum of $35. These machines run a full featured Linux OS.

Well, the truth of this depends on how slow your own solver happens to be, but to give you an idea, Goodwin compared performance of the new solver on a Raspberry Pi 3 (22 seconds per Newton iteration) to the old solver on his MacBook Air (32 seconds per Newton iteration). This is for a 150x150 element lid driven cavity with 248K unknowns, the largest problem that fits on the Raspberry Pi 3. For reference, the new solver computes a Newton iteration in 2.5 seconds on the same MacBook Air.

### Mirror mirror

Cats2D now rotates and mirrors plots if desired. These actions are particularly useful for axisymmetric problems in which the symmetry axis is oriented vertically, true of many crystal growth systems. Internally, Cats2D uses the X-axis as the symmetry axis, resulting in a horizontally oriented display that gives an unnatural view of these systems, like shown here for the EFG system:

Rotating and mirroring can be selected interactively in the new plot transformations menu in the post-processor, or by command in the control file, e.g. plot(rotate(right)) or plot(mirror(vertical)). Plot mirroring can be used in either an automatic mode, in which the same field is displayed on both sides of the symmetry axis, or in a manual mode in which the user chooses a different field to display on each side.

The plot transformations menu also has options to set plot size and scaling, to adjust page cropping, and to move plot origin, each of which can also be set in the flowcnfg.txt file. These controls make it possible to build figures that are composites of several plots.

When I began writing this entry, I didn't think the subject of Goodwin would come up, but then I noticed that the system appears to be slightly narrower in the plot on the right than in the plot on the left. It is an optical illusion. Saying it is an optical illusion is something that I call *a reason*, but that Goodwin calls *an excuse*, as though it were somehow my fault. I can hardly wait for him to suggest that I incorporate some special correction for this effect.

### Coloring contours with a second field

In the graphics options menu it is possible to set contour shading to black, white, or color. Typically black and/or white contours are plotted on a background color image of the field being contoured. Color contours, on the other hand, are best plotted on a black or white background. Examples of these styles can be seen here. The color contours in these examples are shaded by the field being contoured. Recently I've added the option to shade contours of one field by the color values of another field. For example, I can plot streamlines that are shaded by temperature, as shown here for these buoyancy driven flows in a cavity heated from below.

The plot on the left shows a flow with thermal Peclet = 0.01 (weak coupling of flow to heat transfer). The temperature field remains horizontally stratified, departing only slightly from pure conduction. The plot on the right shows a flow with thermal Peclet = 100 (strong coupling of flow to heat transfer). Convection sweeps hot fluid towards the left wall of the cavity and cold fluid towards the right wall.

Note that coloring of arrows in vector plots is controlled by the same contour shading options. Here are vector plots created under the same settings as the plots shown above. Although the convective effect on temperature is also easily seen here, the plot on the right gives a poor visualization of the flow, in keeping with my comments above about vector plots.

I don't have a particular use for this feature in mind yet, but I am thinking it might be useful to study highly convected species mass transport in melt crystal growth where local concentration variations have a large bearing on morphological stability.

### True detective

Everyone has a story to tell.

Mine is the story of Cats2D. Follow it here.