Defensive Programming
Defensive programming is a set of techniques/guidance on best practices to ensure your applications are secure, as bug free as possible, and respond gracefully to unexpected events (i.e. software exceptions).

There is a good article on Wikipedia about defensive programming, which can be viewed as an initial explainer, but this article aims to combine the purist’s view of defensive programming with wider programming best practice to help develop secure, well tested, and intuitive apps and solutions.

Defensive Programming for Security

When writing code, think about the potential attack vectors an actor could exploit to compromise the security of your applications or the data they manage. Actions may not always be malicious – user error is the most common cause of problems in IT.

Secure programming isn’t just about ensuring your app has a login, or that the user needs to be authenticated. It is also about how code might be compromised, such as by overflowing memory buffers (writing more data than the app is expecting), injecting rogue commands (e.g. SQL injection attacks), what happens when large amounts of data is sent to a receiver (e.g. Web API), how a drop-in replacement service might behave differently, etc.

Refer to government and industry experts (e.g. UK’s NCSC, USA’s NIST, OWASP, etc.) and try to include some security-based testing on projects wherever possible.

Secure programming focuses on both avoiding software defects and also on actively considering how a solution might be misused to breach security, steal data, etc.

Consider a simple form on a web application for submitting data to a back-end server via an API. The UI displays the form and a user fills it out and submits it. The data is sent to the back-end API which then processes it.

Immediately, there are multiple attack vectors potentially in play here, some of which might be:

  • If the communications channel between the client and server (API) endpoints is not properly secured, a nefarious actor could gain direct access to the back-end API.
  • A lack of field size limits on the form would allow very large datasets to be entered and sent. This could lead to website or API performance issues, or even communications timeouts, for a start.
  • Even if the web app is secured/authenticated and the Web API requires authentication, the end-to-end solution might not be using best practices for securing access tokens, protecting against token forgery, or using secure communications channels. If a malicious actor manages to hijack a token they can gain unauthorised access to the API.
  • Assuming an attacker manages to connect directly to the API, they might send large amounts of data. If the API doesn’t have sensible received packet size limits in place then this could tie up the server for long periods of time, leading to denial of service (DoS) or distributed DoS (DDoS) attacks, or even crash the server if it runs out of resources processing the requests.
  • If the field data is not checked for text injection style attacks then a malicious actor could craft field data that terminates a typical database query and injects its own queries (i.e. SQL injection attacks).
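The last point can be illustrated with a minimal sketch using Python's built-in sqlite3 module and a hypothetical users table: the first query splices attacker-controlled text into the SQL string, while the second uses a parameterised query so the input is treated purely as data.

```python
import sqlite3

# In-memory database with an illustrative users table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

def find_role_unsafe(name):
    # Vulnerable: attacker-controlled text is spliced straight into the SQL.
    return conn.execute(
        f"SELECT role FROM users WHERE name = '{name}'").fetchall()

def find_role_safe(name):
    # Defensive: a parameterised query treats the input purely as data.
    return conn.execute(
        "SELECT role FROM users WHERE name = ?", (name,)).fetchall()

malicious = "' OR '1'='1"
print(find_role_unsafe(malicious))  # the injected OR clause matches every row
print(find_role_safe(malicious))    # no user is literally named "' OR '1'='1"
```

The same principle applies to any database access layer: never build query text by concatenating user input.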

The above examples are all at application/system level, but the same considerations apply at code level too.

Managed programming environments such as .NET and Java help reduce the chances of specific attack vectors such as memory buffer overflow. This is done by the runtime handling memory management and enforcing type safety, performing bounds checking, and disposing of old memory scopes automatically.

There are still plenty of cases where these runtimes may be vulnerable to buffer overflow attacks, though. Anywhere unsafe code is explicitly defined, or where unsafe program APIs are accessed via interops (e.g. P/Invoke and .NET’s Marshal class), there is scope for buffer overflows to occur. Bugs and security vulnerabilities in the runtime itself are also potential attack vectors.

Accidentally leaking information is also a potential point of attack for hackers. For example, many default installations of WordPress support a URL query-string option where adding ?author=1 to the URL can return the login username of the author of the page. WordPress lets you set a different display name to your username to make it harder to do brute force password attacks, but this simple trick would leak your actual username to a potential hacker without them having to work very hard at all. (Incidentally, this issue can be fixed very easily by adding a redirect to the site’s htaccess file.)
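By way of illustration, the htaccess fix mentioned above is commonly implemented with a mod_rewrite rule along these lines (a sketch only; the redirect target and flags should be adapted to the site):

```apache
# Redirect author-enumeration queries (?author=N) back to the home page.
RewriteEngine On
RewriteCond %{QUERY_STRING} ^author=\d+ [NC]
RewriteRule .* /? [R=301,L]
```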

Actively coding against, and testing for, potential weakness in your solutions will make them much more secure and resistant to defects.

Expect Code to Fail

It is very tempting to write code to work the way you want it to, and then write tests that verify it works the way it should. While this might seem perfectly reasonable, no consideration is being made for what happens if the code isn’t used the way it should be.

For example, consider a method that accepts two numeric values, speed and compass bearing.

The vehicle being controlled by the software has a top speed of 15mph going forward and 5mph in reverse and the speed parameter should be set to zero to stop the vehicle, to a positive value to drive forwards, or to a negative value to reverse. The compass bearing should be in degrees (so in the range 0 to 359) based on the direction the vehicle is facing.

A programmer develops a program to control the vehicle, and then adds automated unit tests that check speeds of -5 and 15, and bearings between 0 and 359°. The software works fine.

A user runs the software to control their vehicle and enters a speed of 155 by mistake. What will the software do? What will happen if a bearing of 1800 or -180 is entered?

In the above scenario, considering best practice techniques such as out-of-bounds input range testing would provide better test coverage and give confidence that the software would behave as expected under invalid or out of bounds input values.
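For the vehicle scenario above, defensive guard clauses might look like the following sketch (the limits come from the scenario, but the function name and return value are illustrative, not from any real API):

```python
MAX_FORWARD_MPH = 15   # top speed going forward
MAX_REVERSE_MPH = -5   # top speed in reverse (negative speed = reversing)

def set_motion(speed_mph, bearing_deg):
    """Reject out-of-range inputs before they reach the vehicle controls."""
    if not MAX_REVERSE_MPH <= speed_mph <= MAX_FORWARD_MPH:
        raise ValueError(
            f"speed {speed_mph} outside {MAX_REVERSE_MPH}..{MAX_FORWARD_MPH} mph")
    if not 0 <= bearing_deg <= 359:
        raise ValueError(f"bearing {bearing_deg} outside 0..359 degrees")
    return speed_mph, bearing_deg  # safe to pass on to the controller

set_motion(15, 90)     # accepted: full speed ahead, heading east
# set_motion(155, 90)  # would raise ValueError rather than over-driving the motor
```

With guards like these in place, a speed of 155 or a bearing of 1800 produces a clear, testable error instead of undefined behaviour.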

The next section covers recommendations for unit testing in more detail.

Ensure Robust Unit Tests

Avoid writing only ‘light-bulb’ tests (e.g. a single happy-path test, within specification and expected to pass) to confirm that code meets specifications. Consider the following when writing tests:

  • Edge cases. What happens if a null or blank value is supplied to a text parameter? What is the range of acceptable values that numeric input parameters support? And what values within that range have you tested? What happens when the actual limits of a parameter range are supplied?
  • Out-of-bounds conditions. What happens if values outside the expected range are supplied?
  • Use realistic test data. Attempt to use realistic data for testing, instead of random text and numeric values where possible. This exercises the unit under test (UUT) more thoroughly in the type of scenario(s) it is going to be used in.
  • Size of datasets. Don’t test with small datasets if those in production are going to be significantly larger as buffer size limits, transmission time, etc. will not be properly evaluated.
  • Number of users. How many users are going to concurrently use the solution? If this is a large number, consider doing extended load testing on the solution to observe how it scales.
  • Anticipate errors. Are there likely error scenarios that you can explicitly test for, and help the solution protect itself against? For example, what happens if a Web API you are accessing is unavailable?
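As a small sketch of the first two points, the tests below exercise a hypothetical bearing parser at its exact boundaries (0 and 359), at its edges (missing or blank input), and out of bounds, rather than only on the happy path:

```python
def parse_bearing(text):
    """Parse a compass bearing string into an int in 0..359 (illustrative)."""
    if text is None or not text.strip():
        raise ValueError("bearing is required")
    value = int(text)
    if not 0 <= value <= 359:
        raise ValueError(f"bearing {value} outside 0..359")
    return value

# Boundary values: the exact limits of the accepted range must pass.
assert parse_bearing("0") == 0
assert parse_bearing("359") == 359

# Edge cases and out-of-bounds values must be rejected, not silently accepted.
for bad in (None, "", "   ", "-1", "360", "1800"):
    try:
        parse_bearing(bad)
        raise AssertionError(f"{bad!r} should have been rejected")
    except ValueError:
        pass
```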

Implement Comprehensive Logging (if Possible)

If the project and ongoing operational costs support it, attempt to log errors and other notable events comprehensively, to simplify debugging activities and reduce their cost overhead.

Consider the following:

  • Log different message types. Don’t only log errors. Consider integrating multiple log message categories/levels into your solution, such as: debug data, information, warning, non-critical error, and critical error.
  • Configurable logging options. Make logging levels configurable, typically just allowing errors to be logged in production, but with the option to enable more verbose logging via configuration settings if the need arises (e.g. for debugging purposes).
  • Multiple logging stores. This can help ensure log data is available as comprehensively as possible. For example, in a distributed application if a database fails, and that is the only (distributed) log store being used, then very limited data may be available for debugging system issues. If the data was logged directly to the database and also to a web service with separately hosted infrastructure, then debugging data may still be accessible in the event of main database failure.
  • Local and centralised stores. Instead of relying solely on centralised logging, consider per-machine logging too. For example, each web service in a microservices solution might have its own log but they each also write to a centralised log store as well. This differs from the previous point in that the above recommends multiple centralised log stores while this point is specifically about endpoint-specific or machine-specific log stores.
  • Don’t rely on system error messages. Attempt to add value to error data if possible. For example, a system error might be something like ‘null object detected’. Without context, this isn’t very useful for debugging purposes. Having more details about the machine, app, module, method name and parameter name would add a lot of value to debugging data. For example, “Machine=UserProfiles01, App=MyApp, Module=MyLibrary, Method=MyClass.CheckNameAndBirthday, Parameters: name=[null], dob=12/11/1986. Error received: ‘null object detected’”.
  • Redact or protect confidential and personal data. When logging event related data (especially for debugging), consider the content being saved. If the data is valuable, confidential or is personal data that shouldn’t be accessible without further authorisation, then consider redacting, encrypting, or even omitting that data from the dataset being logged so it cannot be read at rest.
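The first, second, and last points can be sketched with Python's standard logging module (the logger name and the masking rule are illustrative assumptions, not a prescribed scheme):

```python
import logging

def redact(value):
    # Mask all but the last two characters so personal data isn't stored
    # in the log at rest (adapt the rule to your data protection needs).
    return "*" * max(len(value) - 2, 0) + value[-2:]

# Production default: warnings and errors only. Switch to logging.DEBUG via
# a configuration setting when more verbose diagnostics are needed.
logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger("myapp.userprofiles")

logger.debug("cache miss for user %s", redact("alice"))     # suppressed at WARNING
logger.error("lookup failed for user %s", redact("alice"))  # written to the log
```

Multiple log stores would be added by attaching additional handlers (e.g. a file handler plus a network handler) to the same logger.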

Implement Structured Error/Event Management

Complementary to logging errors and other events is managing them. Logging errors and events is one thing, but thinking about how your app/solution should respond to them is just as important.

  • Expect errors. Attempt to identify error scenarios that are likely, and that the solution should easily recover from (e.g. a web service is temporarily unavailable). Consider how logging should be performed in this scenario (e.g. if a service is unavailable, log that it is offline and then wait until the service is available again before logging a ‘service is back online’ message).
  • Handle errors gracefully if possible. This could be as simple as capturing the error and displaying an appropriate message/notification to the user, or it could involve rolling back transactions. Try to avoid letting software exceptions bubble up to the app-level as this could lead to the app terminating unexpectedly.
  • Don’t trap, log, and re-raise errors unnecessarily. Typically, you only need to capture and log errors at a logical app or service boundary. Logging an error within each method in a command chain and then re-throwing it to its parent is wasteful and can flood logs with lots of duplicate messages.
  • Keep error messages contextual. Think about how errors propagate through your code. A low level database error of something like ‘column type mismatch’ bears no relevance at the service level in a method called CheckNameAndBirthday. Mutate error messages to retain context as they travel back up the call stack (preferably retaining lower level error data too). For example, .NET’s Exception class includes an InnerException property, so a new Exception object with a more contextual error message can be thrown with the previous error as the inner exception data.
  • Consider a user friendly exception type. If the programming environment supports it, consider defining a custom exception type explicitly for use in displaying user friendly error messages. This can be useful to help display the most contextual error message to the user, instead of a more generalised error message when an exception is detected.
  • Message throttling. If a fault leads to many errors being logged in a short timeframe, consider throttling or discarding messages until the fault is resolved. For example, if a dependent service is unavailable but about 1000 requests per second are coming in on a web channel that requires that service, then it makes no sense to log 1000 errors every second. Instead, consider throttling the maximum number of errors within a time period from any single source, or even log the service being unavailable and then suppress further error messages until it has recovered again.
  • Error message IDs. Although not strictly necessary, assigning each error message displayed to a user a unique ID that they can quote when contacting their application support team can make support ticket investigations more efficient. This assumes the unique ID assigned is also written to the error log store.
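Several of these points can be sketched together in Python, whose `raise ... from ...` chaining parallels .NET's InnerException. The exception class, method names, and messages below are hypothetical:

```python
import uuid

class UserFacingError(Exception):
    """Friendly message plus a unique ID the user can quote to support."""
    def __init__(self, message):
        self.error_id = uuid.uuid4().hex[:8]  # also written to the log store
        super().__init__(f"{message} (support ref: {self.error_id})")

def query_database(name, dob):
    # Stand-in for a low-level call that fails with a context-free message.
    raise RuntimeError("column type mismatch")

def check_name_and_birthday(name, dob):
    try:
        return query_database(name, dob)
    except RuntimeError as exc:
        # Re-raise with service-level context, keeping the low-level error
        # attached as the cause (the equivalent of an inner exception).
        raise UserFacingError("Could not verify your details") from exc
```

The user sees the contextual message and support reference; the chained cause preserves the low-level detail for the error log.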

Take Care When Re-Using Sample Code and Using 3rd Party APIs

Most developers use the Internet to help them solve programming tasks these days. It is often quicker to search online for a solution someone else has posted to the same task/problem than to write your own code from scratch.

The proliferation of globally shared code libraries and programming frameworks can also introduce reams of unknown/unreviewed code into your solutions.

When using sample or example code from the Internet, or when adopting a 3rd party library or framework, it is worth considering:

  • What is the source of the code? Is it suitable for inclusion in the codebase? Is the code really fit for purpose, and does it do exactly what the task requires? Often, code found online doesn’t work quite the way you need it to, so always make sure you have tested it against design specifications.
  • Don’t import large swathes of code. Instead of blindly importing large passages of code that you may not fully understand (and that may not work properly at all), consider reading, interpreting, and re-writing the code yourself to ensure it is completely fit for purpose and optimised for your task.
  • Only use well-known, well supported libraries/frameworks. When identifying potential 3rd party libraries or programming frameworks, consider the security threat of using such packages (they could have been infected with malware, spyware, etc.) and the source of them. Also, consider the longevity of the library and how well supported it is. A relatively new library on GitHub written and supported by a single person is much less likely to have long term support and regular updates and bug fixes than an established package that is being actively supported by a number of developers.
  • Check 3rd party libraries/frameworks for updates regularly. If you are integrating 3rd party code into your solution, proactively check for updates to the packages. Those 3rd party developers may find, report, and fix software defects in the packages regularly – including security issues – so having old versions may make your software more susceptible to malicious attacks.
  • Restrict 3rd party dependency sprawl. Don’t use 3rd party libraries unnecessarily. Developers often like what they know, and as such will often want to bring what they’ve used before from their old company to their new one. For example, if the organisation is using a certain ORM (object relational mapping) product for database interaction then it probably doesn’t make sense to introduce secondary or tertiary ones just because individual developers prefer them. Apart from the confusion over which one to use on any given project, you’ve also increased the training burden for new hires from one ORM tool to three.
  • Perform proof of concepts on new technology. Before using new technology, it is often valuable and cost effective to perform small, targeted, proof of concept (PoC) developments with the technology before using it on commercial projects. Always write a Terms of Reference document for the PoC to clearly set out the scope of the work, the deliverables, and the definition of success.