12 Programming Mistakes to Avoid
A car magazine once declared that a car has "character" if it takes 15 minutes to explain its idiosyncrasies before it can be loaned to a friend. By that standard, every piece of software has character -- all too often, right of the box.
Most programming "peculiarities" are unique to a particular context, rendering them highly obscure. Websites that deliver XML data, for example, may not have been coded to tell the browser to expect XML data, causing all functions to fall apart until the correct value fills the field.
[ Also on InfoWorld: Find out which 7 programming languages are on the rise in today's enterprise. | Keep up on key application development insights with the Fatal Exception blog and Developer World newsletter. ]
But certain programming practices send the majority of developers reaching for their hair upon opening a file that has been exhibiting too much "character." Spend some time in a bar near any tech company, and you'll hear the howls: Why did the programmer use that antiquated structure? Where was the mechanism for defending against attacks from the Web? Wasn't any thought given to what a noob would do with the program?
Creatures of habit, we developers seem locked into certain failure modes that can't be avoided, such is the frequency with which we fall prey to a particular poor programming practice.
Below you will find the most common programming pitfalls, each of which is accompanied by its opposing pair, lending further proof that programming may in fact be transforming into an art -- one that requires a skilled hand and a creative mind to achieve a happy medium between problematic extremes.
Failing to shore up the basics is the easiest way to undercut your code. Often this means overlooking how arbitrary user behavior will affect your program. Will the input of a zero find its way into a division operation? Will submitted text be the right length? Have date formats been vetted? Is the username verified against the database? Mistakes in the smallest places cause software to fail.
The worst part about sloppy programming is that advances in language design aimed to fix these problems don't do their job. Take the latest version of Java, which tries to make null-pointer checking easier by offering shorthand syntax for the endless pointer testing. Just adding a question mark to each method invocation automatically includes a test for null pointers, replacing a rat's nest of if-then statements, such as:
In the end, however, such syntax improvements can only prevent code from crashing, not ensure that it's useful. After all, it doesn't eliminate the root of the problem: the proliferation of null values due to fast and loose programming.
On the flip side, overly buttoned-up software can slow to a crawl. Checking a few null pointers may not make much difference, but some software is written to be like an obsessive-compulsive who must check that the doors are locked again and again so that sleep never comes.
Relentless devotion to detail can even lock up software if the obsessive checking requires communicating with a distant website over the network. I have several packages that slow to a crawl if I fire them up on a laptop without a Wi-Fi connection because they're frantically trying to phone home to see if a new version might be available. The Wi-Fi LED flickers, and the software hangs, constantly looking for a hotspot that isn't there.
The challenge is to design the layers of code to check the data when it first appears, but this is much easier said than done. If multiple developers work on a library or even if only one does all of the coding, it's difficult to remember whether and when the pointer was checked.
Too often, developers invite disaster by not simplifying control over tasks in their code.
Mike Subelsky, one of the co-founders at OtherInBox.com, is a keen advocate of there being one and only one place in the code for each job. If there are two places, odds are someone will change one but not the other. If there are more than two, the odds get even worse that someone will fail to keep them all working in the same way.
"Having worked on one code base for three-plus years, my biggest regret is not making the code more modular," Subelsky says. "I've learned the hard way why the Single Responsibility Principle is so important. I adhere to it strongly in new code, and it's the first thing I attack when refactoring the old code."
Subelsky, as you may surmise, is a Ruby on Rails programmer. The framework encourages lean code by assuming most of the structure of the software will fall into well-known patterns, a philosophy that Rails programmers often summarize as "convention not configuration." The software assumes that if someone creates an object of type Name with two fields first and last, then it should immediately create a database table called Name with two columns, first and last. The names are specified in only one place, avoiding any problems that might come if someone fails to keep all of the layers of configuration in sync.
Sometimes the magic tools lead only to confusion. By abstracting functionality and assuming what we want, frameworks can all too often leave developers at a loss for what's gone wrong in their code.
G. Blake Meike, a programmer based near Seattle, is one of many developers who finds over-reliance on automated tools such as Ruby on Rails a hindrance when it comes to producing clean code.
"Convention, by definition, is something outside the code," Meike says. "Unless you know Ruby on Rails' rules for turning a URL into a method call, for instance, there is no way, at all, that you will ever figure out what actually happens in response to a query."
He finds that reading the code often means keeping a manual close by to decipher what the code is doing behind his back.
"The rules are, while quite reasonable, not entirely trivial. In order to work on a Ruby on Rails app, you just have to know them. As the app grows, it depends on more and more of these almost-trivial bits of external knowledge. Eventually, the sum of all the almost-trivial bits is decidedly not trivial. It's a whole ecosphere of things you have to learn to work on the app and remember while you are debugging it," he says.
Next page: How much trust to put in a client, and how not to reinvent the wheel
To make matters worse, the frameworks can often leave you, and any who come after you, stranded with pretty code that's difficult to understand, revise, or extend.
As Mike Morton, another programmer, explains, "They carry you 90 percent of the way up the mountain in a sedan chair, but that's all. If you want to do the last 10 percent, you'll need to have thought ahead and brought oxygen and pitons."
Many of the worst security bugs appear when developers assume the client device will do the right thing. For example, code written to run in a browser can be rewritten by the browser to execute any arbitrary action. If the developer doesn't double-check all of the data coming back, anything can go wrong.
One of the simplest attacks relies on the fact that some programmers just pass along the client's data to the database, a process that works well until the client decides to send along SQL instead of a valid answer. If a website asks for a user's name and adds the name to a query, the attacker might type in the name x; DROP TABLE users;. The database dutifully assumes the name is x and then moves on to the next command, deleting the table filled with all of the users.
There are many other ways that clever people can abuse the trust of the server. Web polls are invitations to inject bias. Buffer overruns continue to be one of the simplest ways to corrupt software.
To make matters worse, severe security holes can arise when three or four seemingly benign holes are chained together. One programmer may let the client write a file assuming that the directory permissions will be sufficient to stop any wayward writing. Another may open up the permissions just to fix some random bug. Alone there's no trouble, but together, these coding decisions can hand over arbitrary access to the client.
Sometimes too much security can lead paradoxically to gaping holes. Just a few days ago, I was told that the way to solve a problem with a particular piece of software was just to "chmod 777" the directory and everything inside it. Too much security ended up gumming up the works, leaving developers to loosen strictures just to keep processes running.
Web forms are another battleground where trust can save you in the long run. Not only do bank-level security, long personal data questionaires, and confirming email addresses discourage people from participating even on gossip-related sites, but having to protect that data once it is culled and stored can be far more trouble than it's worth.
Because of this, many Web developers are looking to reduce security as much as possible, not only to make it easy for people to engage with their products but also to save them the trouble of defending more than the minimum amount of data necessary to set up an account.
My book, "Translucent Databases," describes a number of ways that databases can store less information while providing the same services. In some cases, the solutions will work while storing nothing readable.
Worried about security? Just add some cryptography. Want your databases to be backed up? Just push the automated replication button. Don't worry. The salesman said, "It just works."
Computer programmers are a lucky lot. After all, computer scientists keep creating wonderful libraries filled with endless options to fix what ails our code. The only problem is that the ease with which we can leverage someone else's work can also hide complex issues that gloss over or, worse, introduce new pitfalls into our code.
Cryptography is a major source of weakness here, says John Viega, co-author of "24 Deadly Sins of Software Security: Programming Flaws and How to Fix Them." Far too many programmers assume they can link in the encryption library, push a button, and have iron-clad security.
But the reality is that many of these magic algorithms have subtle weaknesses, and avoiding these weaknesses requires learning more than what's in the Quick Start section of the manual. To make matters worse, simply knowing to look beyond the Quick Start section assumes a level of knowledge that goes beyond what's covered in the Quick Start section, which is likely why many programmers are entrusting the security of their code to the Quick Start section in the first place. As the philosophy professors say, "You can't know what you don't know."
Then again, making your own yogurt, slaughtering your own pigs, and writing your own libraries just because you think you know a better way to code can come back to haunt you.
"Grow-your-own cryptography is a welcome sight to attackers," Viega says, noting that even the experts make mistakes when trying to prevent others from finding and exploiting weaknesses in their systems.
So, whom do you trust: yourself or so-called experts who also make mistakes?
The answer falls in the realm of risk management. Many libraries don't need to be perfect, so grabbing a magic box is more likely to be better than the code you write yourself. The library includes routines written and optimized by a group. They may make mistakes, but the larger process can eliminate many of them.
Programmers love to be able to access variables and tweak many parts of a piece of software, but most users can't begin to even imagine how to do it.
Take Android. The last time I installed a software package for my Android phone, I was prompted to approve five or six ways that the software can access my information. Granted, the Android team has created a wonderfully fine-grained set of options that let me allow or disallow software based on whether it requires access to the camera, tracks my location, or any of a dozen other options. But placing the onus on users to customize funtionality they do not fully understand can invite disaster in the form of inadvertant security holes and privacy violations, not to mention software that can prove too frustrating or confusing to be of use to its intended market.
The irony is that, despite being obsessed with feature check lists when making purchasing decisions, most users can't handle the breadth of features offered by any given piece of software. Too often, extra features clutter the experience, rendering the software difficult to navigate and use.
Some developers decide to avoid the trouble of too many features by offering exactly one solution. Gmail is famous for offering only a few options that the developers love. You don't have folders, but you can tag or label mail with words, a feature that developers argue is even more powerful.
This may be true, but if users don't like the idea, they will look for ways to work around these limitations -- an outcome that could translate into security vulnerabilities or the rise of unwanted competition. Searching for this happy medium between simple and feature-rich is an endless challenge that can prove costly.
One of the trickiest challenges for any company is determining how much to share with the people who use the software.
John Gilmore, co-founder of one of the earliest open source software companies, Cygnus Solutions, says the decision to not distribute code works against the integrity of that code, being one of the easiest ways to discourage innovation and, more important, uncover and fix bugs.
"A practical result of opening your code is that people you've never heard of will contribute improvements to your software," Gilmore says. "They'll find bugs (and attempt to fix them); they'll add features; they'll improve the documentation. Even when their improvement has been amateurly done, a few minutes' reflection by you will often reveal a more harmonious way to accomplish a similar result."
The advantages run deeper. Often the code itself grows more modular and better structured as others recompile the program and move it to other platforms. Just opening up the code forces you to make the info more accessible, understandable, and thus better. As we make the small tweaks to share the code, they feed the results back into the code base.
Millions of open source projects have been launched, and only a tiny fraction ever attract more than a few people to help maintain, revise, or extend the code. In other words, W.P. Kinsella's "if you build it, they will come" doesn't always produces practical results.
While openness can make it possible for others to pitch in and, thus, improve your code, the mere fact that it's open won't do much unless there's another incentive for outside contributors to put in the work. Passions among open source proponents can blind some developers to the reality that openness alone doesn't prevent security holes, eliminate crashing, or make a pile of unfinished code inherently useful. People have other things to do, and an open pile of code must compete with hiking, family, bars, and paying jobs.
Opening up a project can also add new overhead for communications and documentation. A closed source project requires solid documentation for the users, but a good open source project comes with extensive documentation of the API and road maps for future development. This extra work pays off for large projects, but it can weigh down smaller ones.
Too often, code that works some of the time is thrown up on SourceForge in hopes that the magic elves will stop making shoes and rush to start up the compiler -- a decision that can derail a project's momentum before it truly gets started.
Opening up the project can also strip away financial support and encourage a kind of mob rule. Many open source companies try to keep some proprietary feature in their control; this gives them some leverage to get people to pay to support the core development team. The projects that rely more on volunteers than paid programmers often find that volunteers are unpredictable. While wide-open competitiveness and creativity can yield great results, some flee back to where structure, hierarchy, and authority support methodical development.
Read more about developer world in InfoWorld's Developer World Channel.