Mark Youngman's Website

Code is the tool, not the goal

26 February 2024

I heard a story about a health-sector project where a sole contractor created a database and frontend to hold patient data for some department. By all accounts, this contractor did a horrible job. Staff would frequently forget to input a patient's data into the system, and even when they did, the system didn't hold all the information that staff required.

One of the nurses, realising the new system's inadequacies, took it upon herself to create a spreadsheet to do the job instead. It turns out she could do a better job than the contractor, in less time, at zero cost. She had no coding ability, but she did have one thing the contractor never had: a focus on the actual problem. She kept the spreadsheet up-to-date, and ensured it contained all the information required by staff.

Why did the contractor fail? Because they never understood the problem. The problem required more than just code: it required a process to ensure that all patient's data got added and a better understanding of what data staff needed.

It seems so obvious when described like this, and we may dismiss this particular contractor as an idiot - they probably are, and rich too. But I think this lack of focus on the problem is an epidemic amongst programmers. I think most programmers are much more concerned with things like letting their favoured programming paradigm needlessly limit the design space they have to work with or finding a poor excuse to use an obscure language feature. They don't care about the problem. They are obsessive about the code.

While they focus on the code, they miss the obvious: users want solutions, not code. Code is the tool with which we produce and maintain a solution. Users don't care if your code impresses your peers. They only care that it works and does what they want.

And yet programmers spend inordinate amounts of time on minutia or creating elaborate, entirely unnecessary abstractions, or worse still getting caught in a groundhog-day doom loop of new approaches for unmeasured benefits, each approach introducing new unforeseen problems that require yet another new approach... discarding the expertise with the previous approach, and forcing them to learn and deal with the idiosyncratic issues that inevitably arise with the new approach - at least until the next approach appears.

To solve a problem like the nurse did requires humility. She found a simple elegant solution to a relatively simple problem. She didn't reach for a sonic screwdriver when a simple butter knife did the job.

Maybe one day, a programmer's skill won't be judged by the cleverness of their code, but by the only true metric for programming skill: the amount of user value they ship. For now, companies will continue to feel the need to keep programmers on a tight leash, and for good reason. That could be avoided if programmers didn't fundamentally misunderstand their role. For now, however, I suspect the previously-mentioned contractor is far from the last programmer who will be outshone by a nontechnical person.




Problems in a post "Hello world!" world

21 October 2023

When someone starts programming today, they often deploy and wrangle other people's code. As they begin that undertaking, they're also often met with a culture that values code over end-user value. These early experiences shape a person's understanding of what programming is and how it is done.

Beginners are provided instant gratification. They are rewarded for skipping the fundamentals of programming - most notably the ability to understand and follow program state during execution. There is little incentive to ever take a step back and learn those fundamentals.

This has led many programming subcultures to believe going low level and/or rolling your own is a mistake. It's easy to understand why: starting from scratch takes considerable skill and knowledge. Your first few attempts will inevitably be failures, and shine a light on a long path ahead. It's considered a fool's errand. Why continue on that path when using code someone else has written offers an easier route?

This lack of fundamentals has much longer term consequences than a Fizz Buzz test in a job interview. In my view, someone who primarily plumbs together unknown code will introduce unnecessary complexity, because they don't understand what the underlying computer is actually doing, and therefore will judge their solution on misguided metrics, such as the look or the cleverness of the code. The tool is no longer used to achieve an end, but becomes the primary consideration. The actual functionality of their programs, the end user experience, is secondary.

Code plumbing

Code plumbing rewards hacking over understanding. The code being used is unknown, complex, voluminous. Instead of understanding the code, which would take years, people are incentivised to use trial and error until they get the result they want. Even as a programmer improves, no matter how intelligent they may be, their instinctive habit will be to hack.

Another factor is that they won't develop good metrics for software quality. When plumbing, you don't need to learn to build good, maintainable abstractions, since any abstractions you do create will tend to be fairly shallow. This means the full catastrophy of a bad abstraction isn't revealed. And then the wider culture starts celebrating particular abstractions in books, and the authority held by those abstractions ends up trumping concerns about whether they suit the particular context in which they are being used.

One personal anecdote of this is a software architect that complained the developers were "ruining the purity of [their] vision". In reality, the architect's vision didn't survive contact with reality, but the architect refused to reassess whether their suggested abstraction was appropriate, despite being shown the problems with it.

This boils down to having a bad measure of software quality. And results in programmers chasing metrics that are counterproductive:

"Programmers are bright people who are (often justly) proud of their ability to handle complexity and juggle abstractions. Often they compete with their peers to see who can build the most intricate and beautiful complexities. Just as often, their ability to design outstrips their ability to implement and debug, and the result is expensive failure." - Douglas McIlroy quoted in The Art of Unix Programming

Back to life. Back to reality

This state of affairs is living on borrowed time, because ideas formed on the basis of a disconnection from what the code does or bad metrics for software quality will ultimately lose in the battle of ideas as the projects using those ideas fail.

How long it will take for them to lose that battle is another question. In medicine, bloodletting was misused for (according to Wikipedia) over 2,000 years. So for all that time, a practice widely considered beneficial was actually harming patients. I don't know what the solution is, but if we don't want the same for programming, we need to develop the programming equivalent of double-blind trials.




To framework or not to framework? That's the question

20 June 2023

For particular problems, you can hit the ground running by using a framework. They allow programmers who aren't domain experts to produce working solutions to common yet difficult problems. You can feel safe that many others are using the same code, providing some assurance around things like security and scalability.

When you first start a project using a framework, it can feel like it's giving you many things for free. And it is. But even with a good framework, there is a longer-term cost to be paid:

Building from scratch

I think the difficulty of building from scratch is generally overestimated and the on-going cost of using a framework is generally underestimated. If a good custom foundation is established, I think programmer productivity and the quality of the solution can skyrocket:

All that said, the early stages of a project will need much more care. The quality of the foundation will be determined by your skill and knowledge. It requires knowledge of the problem domain you're working in and/or the skill to quickly adapt to new problems you're unfamiliar with. It requires research of possible solutions, and then careful selection of the right approach for your project. Even if the difficulty is generally overestimated, it's still far from trivial.

If that gauntlet can be navigated, the rewards are potentially huge. But if things go wrong, it could be catastrophic. This risk is unacceptable for many projects.

When is a framework the right choice?

I think frameworks are the right tool for many projects, particularly those involving a team of a programmers. But even then, if your team lacks knowledge of the problem domain you're tackling or a deep knowledge of the framework, mistakes will be made. However, with a few experienced programmers, I think the chances of delivering a working solution are increased dramatically, although the quality of that solution may be compromised - the sluggish responsiveness of most mobile apps comes to mind.

Frameworks also provide a tried-and-tested solution as many issues will have already been ironed out. From scratch, unless you're deeply familiar with a particular problem space, you will need many iteratations to reach a similarly usable and reliable solution. And if you do introduce issues, they may only become apparent later, when it has become costly to fix them.

When is a framework the wrong choice?

There are a few reasons to not use a framework:

Final thoughts

We may be moving towards a future where a generalist programmer who can tackle any problem is increasingly rare or even unfeasible. With quantum programming, maybe it's already impossible.

But even in an x64/arm world, despite any commonalities, 3D rendering and creating a website, for example, are unique problems with their own quirks. Our understanding of how to best tackle such problems has been improved by past failures. And maybe in the future the best way to solve such problems won't be to study that history in detail yourself, but to use a framework that has collected proven solutions.

Maybe some frameworks will reach a level of maturity - not just be this decade's fashionable trend - such that the benefits of avoiding them, in most cases, won't be worth the cost.

That said, I currently think that the quintessential skill of a programmer is the ability to successfully navigate a new problem domain.

There are mistakes that can be made with frameworks, like not following a framework's established way of handling a particular problem. But putting such mistakes aside, even in a world where frameworks are used by most projects, all projects have their idiosyncracies. Programmers earn their pay when tackling a problem that the framework or a dependency doesn't solve. That's when months or even years of developer time is lost or won.

For me personally, I have no intention of using frameworks for personal projects, even if there is a framework for the problem I'm trying to solve. It does increase the difficulty, but I think this is the right decision for someone primarily interested in learning their craft, and who wants to reap the longer-term rewards in solution quality and productivity.




Who to trust when learning programming?

21 May 2020

My experiences online suggest two people from separate programming subcultures, each with multiple decades of experience, who articulately defend themselves, can disagree on fundamental issues. Both are convinced that their way is unquestionably correct. Their group's ideas are generally viewed as sacred, and yet they often contradict those of others groups.

Why do they disagree? And how do we decide who is right? I don't think there are any clear answers, but it's interesting to think about.

When experts disagree

I think experienced programmers disagree because programming is (a) still in its infancy and (b) hard.

A programmer can only work on a handful of large projects in their lifetime, and it can take years before they receive feedback from their choices. This means that even a veteran programmer has little evidence to determine what's best -- and ambiguious evidence at that. Further more, since programmers tend to follow a specific culture, it's likely that the projects they've worked on all followed a similar methodology, and so they have limited experience of other approaches.

I think in most instances, programmers don't know much beyond the culture they were raised in. An embedded systems programmer is unlikely to understand the problems of a web programmer and vice versa. One is working in a jungle, and the other is working in the desert. One gets incredulous that the other doesn't wear suncream.

How do we choose who to trust?

A better question might be "Who shouldn't we trust?" Teaching is a big industry: books, conferences, bootcamps, university degrees. Anyone can give a talk or write a book. "Those who can, do; those who can't, teach". And the teachers themselves disagree with each other.

If you dig into the CVs of well-regarded programming teachers, they often don't provide strong evidence for why you should trust them.

I have more respect for those that survey across projects, but that's still weak data and anecdotes. I think there are too many variables between different teams and projects to isolate what works. There are no double-blind studies here, and I think if we're going to reach sound conclusions, that's the rigour required.

What qualities should a trusted authority have? Maybe they've programmed much more than others, completed multiple large, complex projects to a high standard? How high is writing lots of books and giving lots of talks on that list?

If we trust people with a long, proven track record, that is no guarantee in itself. Even if correct, they will tell you what worked for them when solving their problems, when working within their environment.

The fundamentals of programming

An approach that has served me well when learning a new subject is to start at the beginning. When the foundations are dodgy, the elaborate structure on top is irrelevant.

Currently, my personal conclusion, rightly or wrongly, is that programming is primarily about solving data transformation problems. I find the more I focus on the data, the clearer things are. Code is easier to read and maintain if the transformation performed naturally arises out of the problem being solved, when self-expression is minimised.

Programming is an engineering discipline, not an art. Any solution should say more about the problem than it does about the person who solved it.

The power of "I don't know"

I like people who are ready to admit that they don't know. This subject is so large that nobody can know it all, and so varied that there are few absolutes. Like with anything else, an ego prevents learning, causes overconfidence, and leads to ostentation.

With so many experienced programmers disagreeing, I think the only sensible approach is to be slow to judge, to avoid dogma, to expand your horizons, and to always doubt your own approach.




Code readability and data flow

25 April 2020

While working on a test framework a while back, a particular piece of functionality inspired some thought about the relationship between the flow of data and code readability. I found that iterating towards less code meant the code, as written, no longer represented the flow of data. It was therefore left to the reader to work out the flow themselves.

The functionality in question

You can see code at the time of writing here.

The test framework allows users to create test suites by subclassing the TestSuite class. They can then add tests to that subclass by creating a new method and prefixing

test__
to its name.

To implement this in Python, we need to:

It turns out this can be achieved concisely:

for test_name in filter(lambda name: name.startswith('test__'), dir(self)):
  self.tests_to_run.append(getattr(self, test_name))

After starting with something much longer, it's satisfying code to iterate to, but is it good?

Clever code is dumb code?

If the reader has knowledge of Python -- understands filter, dir, etc. -- they may not need a search engine. But even so, they still have unpacking to do.

If we put the code aside and consider how the data is processed, it starts at the dir(self), which returns a list of method names for the current class. It's the first thing we need to progress towards what we want: a list of test methods to run. But this isn't the first thing a reader would see. Instead, dir(self) is at the end of the first line, nested inside a function call.

This means that to understand this code, the reader is forced to read the code multiple times, jump back and forwards, to identify the flow of data for themselves.

Target audience

Even for an experienced Python developer it would require some mental overhead, and that all adds to the overhead of the code's larger context. People can only hold so many things in their head at once.

This reaches to a larger topic: who is the target audience and what does the coder wish to communicate to them. I believe the same issue is found in writing.

Consider The School by Donald Barthelme. On the surface the writing seems simple, but if you try to emulate his style it become apparent that behind that simplicity is great skill.

Compare Barthelme with Irene Iddesleigh by Amanda McKittrick Ros. Ros is out to impress, to the comical detriment of the writing itself.

Writing often wants to be stylish, to entertain, but I think code should to be workman's prose. It shouldn't draw attention to itself.

The narrative of the data

For me, the problem with the above code is that it prioritises conciseness over accurately representing the flow of data. The reader wants to understand the narrative of the data -- its journey from start to end.

Here is the code I ended up with:

method_name_list = dir(self)
is_test = lambda name: name.startswith('test__')
test_name_list = filter(is_test, method_name_list)
for name in test_name_list:
  method = getattr(self, name)
  self.tests_to_run.append(method)

The transformation of data can be followed in order by reading from top to bottom -- it matches the steps I gave earlier. And with the additional variable names, it's possible for a reader unfamiliar with the in-built functions called to infer what the code does.

I like the idea that code should be accessible to the inexperienced. I'm going to try and stick to this when coding in the future.




Follow Mark on Nostr