Trina Magna

I've been reflecting lately on the roles played by others in my work, particularly my past supervisors. There were three. It is a curious thing that none of them ever studied a single line of code written by me. My advisor in graduate school, Stan Middleman, did not use a computer at all. Neither did my post-doc supervisor, Skip Scriven. After that I worked for Jeff Derby. He used a computer but had little experience programming. Cats2D would be incomprehensible to him.

Stan Middleman did not pretend to be an expert at computing. Nevertheless I was surprised when he declined to look at a code I had written to obtain some results for my project. It was 1985. The task was simply to compute sums of Bessel functions to evaluate a series solution to the diffusion equation. The Bessel functions were computed by a library. My code merely added up the numbers. Hardly anything could go wrong. Still I thought my advisor might want to check my work.

Later on I wrote a finite difference code to solve Poisson's equation in two dimensions, and an orthogonal collocation code to solve the convective-diffusion equation in one dimension. I learned on my own what I needed to know from books on numerical methods. These codes weren't so simple. A lot can go wrong using computers to solve differential equations. But Middleman never looked at these codes, either. They were black boxes and he was okay with that.

Reflecting upon this today, I think that Middleman should have exercised greater diligence. But I can understand why he didn't. Dissertation advisors cannot supervise or verify every aspect of a student's work. But in those days, computers were new to everyone. Maybe it was a chicken-and-egg thing: which comes first? Is it an advisor who knows how to write code and teaches the student, or is it a student who is capable of learning to program on their own? Either way, computing was too impactful and too complicated to delegate so casually. A conservative attitude should have been taken towards validation and verification, with as many eyes on the ball as possible. Advisors should have been reading and running the codes their students were writing. Most were not.

Let's consider an analogy. Graduate students sometimes use laboratory techniques unfamiliar to their advisor. They learn how to use equipment from each other, or from an equipment vendor. A specialist might be consulted for certain difficult tasks, for example an expert microscopist, or a machinist. The advisor does not necessarily need to gain competency at these activities; the idea is the thing, and their role is to supervise the students intellectually. The advisor might help the student troubleshoot an experiment when things go wrong, but most of the time the underlying methods are not in doubt; it is their careful execution that matters.

Experiments make a poor analogy to software-based research, particularly when it involves complicated mathematics. Some experiments can be extraordinarily sophisticated and challenging to execute well, but the number of parts found in most experiments is tiny, the purpose of each part is generally obvious, and their interoperability is easy to understand. Most of all, experiments have well defined inputs and outputs. Prep a sample, feed it to the gas chromatograph, and wait for the peaks to emerge. Maybe your system is contaminated and the GC produces a confusing result that you didn't expect. It might be really hard to identify and eliminate the problem. But you know the output shows something about the contents of the sample. It's real.

None of this is true of software used in physics-based simulation. A typical CFD code has many thousands of parts that must interact to solve a problem. In a sense, every statement in a code constitutes a part. Even worse, many of these parts become interwoven into what we call spaghetti code. Any code longer than a few thousand lines is impossible to accurately analyze simply by reading it. Testing is necessary, but laborious. Designing and executing tests is difficult and error-prone. Behaviors that are mathematical in origin can be difficult to distinguish from issues with the numerical algorithms or their software implementation. Some parts of the code don't necessarily behave like we expect them to.

Every problem we solve requires us to modify or reconfigure our software somehow, to accommodate different solution methods, boundary conditions, constitutive laws, physical geometry, etc. We can do this by writing a series of one-off "research" codes, which usually means modifying an original base code each time we want to do something new. Alternatively we can use a more general code, probably written by others, that includes hundreds of different options from which we choose at runtime. Many options are strongly interrelated; experience and training are needed to mitigate the risk of specifying incompatible options. It is akin to rebuilding a machine every time we need to adapt it to a new purpose.

Experiments simply don't have this much complexity in terms of input, or subtlety in terms of output. No experiment requires us to choose dozens of critical settings from hundreds of options. Whether we write the code or let others do it for us, we must understand how the machine works to use it properly. Even with a well-tested commercial code of high quality, it is easy to specify an ill-posed problem. When that happens, the output isn't merely unusable; it is nonsensical. Effectively it is no output at all. Computing is fraught with peril, and eternal vigilance is needed to guard against bad output.

In the early days people wrote short codes that solved problems in physics that are simple by today's standards. The input was straightforward, the output was fairly easy to evaluate, and the methods were simple to program. Solving a partial differential equation using Crank-Nicolson with finite differences is not hard to program, well within the reach of a self-trained graduate student having some aptitude for computers. This is what people were doing in the 1980's in research universities and national laboratories. Back then it was both current and useful.
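
To give a sense of scale, here is a minimal sketch of the kind of short code described above: Crank-Nicolson with central finite differences for the one-dimensional diffusion equation u_t = alpha * u_xx, with u = 0 at both ends. It is an illustration only, not a reconstruction of any code from that era. The function names, the choice of boundary conditions, and the use of the Thomas algorithm for the tridiagonal solve are all assumptions made for the example.

```python
import math


def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system by the Thomas algorithm (no pivoting).

    lower[0] and upper[-1] are ignored. Hypothetical helper for this sketch.
    """
    n = len(diag)
    c = [0.0] * n  # modified upper diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def crank_nicolson_heat(u0, alpha, dx, dt, steps):
    """Advance u_t = alpha * u_xx by `steps` time steps.

    u0 holds the interior node values only; u = 0 is assumed at both ends.
    """
    n = len(u0)
    r = alpha * dt / (2.0 * dx * dx)
    # Implicit half of the scheme: (I - r * D2) u_new = (I + r * D2) u_old
    lower = [-r] * n
    diag = [1.0 + 2.0 * r] * n
    upper = [-r] * n
    u = list(u0)
    for _ in range(steps):
        rhs = []
        for i in range(n):
            left = u[i - 1] if i > 0 else 0.0       # boundary value u = 0
            right = u[i + 1] if i < n - 1 else 0.0  # boundary value u = 0
            rhs.append(r * left + (1.0 - 2.0 * r) * u[i] + r * right)
        u = thomas_solve(lower, diag, upper, rhs)
    return u


if __name__ == "__main__":
    # Compare against the exact solution exp(-pi^2 * alpha * t) * sin(pi * x).
    n = 49
    dx = 1.0 / (n + 1)
    xs = [(i + 1) * dx for i in range(n)]
    u0 = [math.sin(math.pi * x) for x in xs]
    u = crank_nicolson_heat(u0, alpha=1.0, dx=dx, dt=0.001, steps=100)
    exact = [math.exp(-math.pi ** 2 * 0.1) * math.sin(math.pi * x) for x in xs]
    print("max error:", max(abs(a - b) for a, b in zip(u, exact)))
```

Even here, the only reason the output can be trusted is that an exact solution exists to compare against. That luxury disappears quickly as problems become more complicated.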

The modern target is to solve complex problems in multiphysics, often using realistic geometries. All manner of physical phenomena might be included, operating simultaneously in different materials or chemical phases. This is how the commercial simulation market has evolved. Self-trained graduate students are not capable of writing codes nearly so complex. In response, principal investigators at research universities have all but thrown in the towel; either they use commercial codes, or they severely limit themselves to a narrow class of problems they learned to solve in their training years. They struggle to remain current, and they aren't very useful. Reliability of published results is a major concern.

Why did this happen? Programming is a difficult and time-consuming task that many people find unpleasant. Not everyone gets it. There are very smart people who know their singular expansions inside-out who lack the patience or aptitude to write a working code. If you are writing good code, then you aren't writing proposals and publishing papers. From the beginning, principal investigators have faced powerful disincentives to invest time in software development. Attitudes were set permanently in amber. The idea is the thing, and the principal investigator provides the ideas. Subordinates write the code. Computing was destined to remain the poor stepchild to physics and mathematics, and it happened more for convenience than it did for cause.

This way of compartmentalizing things failed after a few student generations. The synergy between physics, mathematics, and programming is strong. The best ideas take form when we think about them together. If you don't write code, you won't generate good ideas for software development. In fact you will develop bad ideas. Principal investigators weren't programming; they were bike shedding an itinerant labor force of graduate students, most of whom grew tired of programming under weak leadership and getting little credit for it. Programming was treated as a support activity; intellectual value did not reside there. This self-serving attitude fostered intellectual stasis and promoted resistance to genuine progress. Each new generation of students learned less about computing than the previous one.

Things would have gone much better if a framework had been devised to professionalize computing in research. Graduate students who demonstrated an aptitude for physics-based computing should have been hired into faculty positions specifically to develop a curriculum to train themselves and their students in software development. Such a curriculum would need to be progressive and nimble to keep up with the constant expansion of computing power. The need for this should have been apparent by the 1990's. The few attempts I've seen along these lines have been woefully inadequate. No one has been willing to divert enough resources to succeed at such a time-consuming effort. The mandate to publish traditional research articles remains the primary activity of the principal investigator.

The National Science Foundation, the agency most responsible for training in scientific computing, is hostile to professionalizing computing in research. Instead NSF has taken an "if you build it, they will come" attitude to high-performance computing, paying a large amount of money to support supercomputing facilities outfitted with expensive hardware. Users of these systems do not need to demonstrate competence to use them. Anyone who can achieve a bare-bones grasp of computing jargon can write a successful proposal for time on these systems. Actual skill and experience are irrelevant. I used supercomputers for many years, at four different supercomputer centers, starting in 1985. Incompetence is the norm on these systems.

My post-doc advisor, Skip Scriven, believed that it was purely procedural to cast equations written on a page into a computer program, akin to translating Spanish to English. He considered himself an expert in computing, and many of his colleagues agreed. He never used a computer for anything. I'm not sure he could figure out how to turn one on. Would John Q. Public think it sounds normal that an "expert" in computing, a "Fellow" of the Minnesota Supercomputer Institute, had never used any sort of computer in his entire life? It sounds odd to me. Are there surgeons who never cut a patient? Are there lawyers who never write a legal document?

My final supervisor, Jeff Derby, viewed programming much the same way as Scriven: procedural and beneath the station of a principal investigator. In his view, one piece of software could be replaced with another if needed. He once explained to me that it was inappropriate for a certain junior professor in the department to spend significant time working on a computer program. This particular professor was not recommended for tenure, and Derby brought up this issue as a contributing factor. The idea is the thing; the code is something you manage at arm's length. This attitude prevails among the faculty in Derby's department, and in most other departments as well. Derby's own advisor, Scriven protégé Bob Brown, followed this practice in his research group at MIT.

The three of them generated research, a lot of it, and published hundreds of papers about it. None of them developed software of any durability, or left much of a legacy in scientific computing. Most of their students ended up in careers that were, at most, peripherally connected to their graduate school work in computing. Considering that their research programs were largely built on physics-based computing, this outcome seems incongruous. The failure to develop an effective training regime, the failure to pass skills from one generation to the next, and the failure to recognize software as an end unto itself, greatly dulled the impact these groups might have had on physics-based computing.

Scriven once earnestly assured me with somber authority that he knew far more about computing than I did. He said that he had been supervising this kind of work for a long time. He said that he had a very good idea how much time it should take to complete the tasks he had assigned to me, and that I was taking far too long to do it. In his mind, a three month project should be done in a week. Scriven was a notorious gaslighter, but these remarks about programming had no basis in reality and were simply outlandish.

Is this example extreme? Somewhat, yes, but the general attitude is typical. Faculty principal investigators chronically overestimate their expertise and accomplishments in computing, often by a wide margin. Most of them rely too much on their students, sometimes blindly. They lack basic skills themselves and are unable to provide adequate training to their students. It is hardly a surprise that they often fail to credit their students adequately. Frankly, many of them are stuck in the 1980's, in a sort of perpetual adolescence.

Nowadays I see PIs repeating the same work of the past. They create a sense of progress by focusing on the latest hardware-based solutions, e.g. parallel computing and GPU computing, while failing to make any advances in usability, utility, robustness, or anything else that might be valuable in the long run. They write bad software for expensive hardware paid for by other people. The real setback comes when they fluff every positive result, and ignore every negative result, of their efforts. It makes things very hard on people who want to pursue sustainable development of reliable application software based on sound methods. They've killed the enterprise.