When is Optimization a Security Feature?

It is an axiom of the software security field that “Security comes at a Cost.” That cost usually takes the form of larger and slower applications: the stronger the protections you want, the more dramatic the hit to your application’s size and responsiveness can be. However, there is a class of protection techniques that actually improves size and performance, because these techniques are fundamentally optimizations.

My background is in compilers; I spent part of my career building compiler back-ends (the code-generation phase) and optimizers, including a stint at IBM where our team was at the forefront of the challenging arms race to build the fastest supercomputers on the planet, machines used to model everything from the weather to nuclear fusion reactions. Of course, (expensive) hardware was a key part of this, but hardware is no use without software to run on it, and compilers, particularly the optimizer phase, were essential to extracting the maximum potential from these powerful CPUs. The IBM compiler technology was very good at optimization, but one downside was that the resulting application code, while very small and fast, was very challenging to debug, often even crashing the debuggers of the time. This is because the optimizers were taking human-readable (because human-written!) code, carefully organized by its authors into modules and functions for understandability and maintainability, and essentially melting it into a giant sea of spaghetti code that could no longer be understood by a human.

Optimizers don’t set out to make code hard to understand, but they will do whatever they can to make code faster or smaller. As a side effect of optimizing, they effectively “destroy” the patterns and information that humans usually rely on when they study code. A good example is function inlining, which removes the overhead of parameter marshalling and CALL instructions, but also eliminates the organization of the code into functions.
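To make the inlining example concrete, here is a minimal Java sketch (the class, method, and variable names are my own hypothetical illustration). The second method shows, at the source level, what the optimizer effectively produces:

```java
class InlineDemo {
    // A small helper: every use pays for a call (frame setup,
    // argument passing, return).
    static int square(int x) {
        return x * x;
    }

    // The source as written: one call per element.
    static int sumSquares(int[] values) {
        int sum = 0;
        for (int n : values) {
            sum += square(n);
        }
        return sum;
    }

    // What the optimizer effectively produces: the helper's body is
    // substituted at the call site, so both the call overhead and the
    // function boundary a human analyst would navigate by are gone.
    static int sumSquaresInlined(int[] values) {
        int sum = 0;
        for (int n : values) {
            sum += n * n;
        }
        return sum;
    }
}
```

Both methods compute the same result; multiplied across an entire program, the inlined form is faster and leaves far fewer functional landmarks for a reader to orient by.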

And it is not only humans who can’t understand highly optimized code; the compiler itself can fall victim to this effect! A thought experiment that seemingly everyone in the field considered at some point was: “Could we achieve more optimization by running the optimizer again on the already optimized code?” As it turns out, this is not a good idea. In order to do its job (optimize the code without changing its behavior), the optimizer has to understand precisely what the input code is doing. As a result of the information-destroying side effect I mentioned, the optimizer often can’t “understand” optimized input code and either finds no further optimizations to perform or, in the worst case, simply crashes.

Moving on to mobile application security, one of the primary objectives is to make it very hard for an attacker to understand the application code. It’s a simple idea: if you can’t understand the code, you can’t locate sensitive code and data, and therefore you can’t attack it. There are various tools in the hacker’s tool-belt for analyzing and understanding application code, ranging from code dumpers and decompilers to debuggers, hooking frameworks, and more. All of these tools have a tough time with highly optimized code. This suggests that a valuable approach to protecting an application is to optimize the crap out of the code, relying on the information-destroying side effect to make the attacker’s job that much harder by denying them the use of some of their analysis tools.

In the Java world, ProGuard was built as a Bytecode optimizer to solve the problem that the Java compilers of the time naively translated Java source to Bytecode, which often resulted in unnecessarily bloated applications. One simple optimization that ProGuard provides is name minimization: the names of classes and methods actually show up in the compiled class files, and the length of those names contributes to the size of the file, so shortening the names reduces the file size. It turns out that “compressing” these names (so that they are both short and don’t accidentally collide with each other) produces seemingly random characters, translating the original human-readable names into gibberish. Such names make human analysis of the resulting application code much more challenging, so this is clearly a security feature.
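A minimal sketch of what that renaming looks like (the class and method names here are hypothetical, and real ProGuard output depends on its configuration):

```java
// Before: descriptive class and method names are stored verbatim in
// the class file and contribute to its size.
class LicenseChecker {
    boolean isLicenseValid(String key) {
        return key != null && key.length() == 16;
    }
}

// After ProGuard-style name minimization: same behavior, a smaller
// constant pool, and nothing meaningful left for a human to read.
class a {
    boolean a(String key) {
        return key != null && key.length() == 16;
    }
}
```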

In the native-code world (C/C++, Swift, etc.) we also have simple and effective optimization techniques that provide security benefits, with symbol stripping and function inlining being two obvious examples. There are refinements of these techniques that provide additional security without compromising the optimization benefits: for function inlining, for example, we can diversify the instruction layout of the inlined code at each call site, which increases the reverse-engineering burden on the attacker without any cost in code size.
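Diversified inlining happens at the instruction level, so it is hard to show literally in source code, but here is a rough source-level sketch of the idea (the names, constants, and rewriting trick are my own illustration): the same inlined helper body is given a different but equivalent shape at each call site, so an attacker can no longer pattern-match one recognized snippet against all the others.

```java
class DiversifyDemo {
    // The helper a protecting optimizer would inline at every call site.
    static int decode(int x) {
        return (x ^ 0x5A) + 7;
    }

    static int callSiteA(int input) {
        // Call site A: a straightforward inlined shape.
        return (input ^ 0x5A) + 7;
    }

    static int callSiteB(int input) {
        // Call site B: same semantics, different shape, using the
        // two's-complement identity x + 7 == 7 - ~x - 1.
        return 7 - ~(input ^ 0x5A) - 1;
    }
}
```

Each variant compiles to a comparable handful of instructions, so the diversification adds essentially no size or speed penalty.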

So, in conclusion, the well-founded belief that application security always comes at a cost in size and speed does not, in fact, always hold. There are app-hardening techniques that are a win/win: both secure and optimized. In some cases, you really can have your (security) cake and eat it too!
