Wednesday, January 28, 2015

Required Reading: Iron Clad Java

They didn't teach appsec in Comp Sci or in engineering or MIS or however you learned how to program. And they probably still don't. So how could you be expected to know about XSS filter evasion or clickjacking attacks, or how to really store passwords safely?

Your company can’t afford to send you on expensive appsec training, and you’re too busy coding anyways. Read a book? There hasn't been a good book that explains how to write secure Java in, well… ever.

But all that's changed. Now you can learn how to build a secure Java app at your desk, or on the train, or on the toilet.

Iron Clad Java, by Jim Manico and August Detlefsen, has arrived. This is a master class in secure Java design and coding, written for developers by guys who truly know their shit.

While it is focused on web apps, a lot of the book applies equally to mobile, Cloud, real-time and back-end systems – any kind of online system written in Java.

There's no time wasted on theory. Iron Clad Java explains the most common and most dangerous attacks and how to defend against them, using straightforward patterns, Open Source libraries and free tools from OWASP.

Each chapter is short and easy to read, with practical, up-to-date (as of Java 8) information and sample code:

  1. Fundamentals of web app security: HTTP/S, validating input
  2. Access control: common anti-patterns and mistakes, how to design access control for single-company or multi-tenant apps, how to use Apache Shiro and Spring Security
  3. Authentication and session management: you shouldn’t be writing this code on your own (this is what frameworks are for), but if you have to, here’s how to do it, as well as how to handle “remember me” and “forgot password” features, multi-factor authentication and more
  4. XSS defense: how to use the OWASP Java Encoder, HTML Sanitizer and JSON Sanitizer libraries, and jQuery encoding
  5. CSRF defense and clickjacking: random tokens and framebusting
  6. Protecting sensitive data: how to do signing and crypto correctly, using Google Keyczar and Bouncy Castle
  7. SQL injection and other kinds of injection: prepare your statements (see the sketch after this list)
  8. Safe file upload and file I/O
  9. Logging and error handling: what to log, what not to log, logging frameworks, safe error handling, using logging for intrusion detection
  10. Security in the SDLC
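
For example, the chapter 7 and chapter 4 defenses come down to a few lines of plain Java. Here is a minimal sketch, assuming a JDBC Connection and the OWASP Java Encoder library on the classpath – the table, columns and method names are made up for illustration:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    import org.owasp.encoder.Encode;

    public class AccountLookup {

        // SQL injection defense: bind untrusted input as a parameter
        // instead of concatenating it into the SQL string.
        static String findEmail(Connection conn, String userName) throws SQLException {
            String sql = "SELECT email FROM accounts WHERE username = ?";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setString(1, userName);                 // the driver handles escaping
                try (ResultSet rs = ps.executeQuery()) {
                    return rs.next() ? rs.getString("email") : null;
                }
            }
        }

        // XSS defense: encode untrusted data for the context it is rendered in.
        static String greetingHtml(String userName) {
            return "<p>Hello, " + Encode.forHtml(userName) + "</p>";
        }
    }

The idea in both cases is the same: keep untrusted input as data, and let the driver or the encoder deal with the dangerous characters.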

So no more excuses.

Monday, January 26, 2015

If you got bugs, you’ll get pwned

The SEI recently published some fascinating research which shows a clear relationship between software quality and software security.

The consensus of researchers is that at least half – and maybe as many as 70% – of common software vulnerabilities are fundamental code quality problems that could be prevented by writing better software. Sloppy coding. Not checking input data. Bad – or no – error handling. Brackets in the wrong spot... Better code is more secure.

Using Bug Counts to Predict Security Vulnerabilities – and vice versa

The more bugs you have, the more security problems you have.

Somewhere between 1% and 5% of software defects cause security vulnerabilities. Which means you can get a good idea of how secure an application is based on how many bugs it has.

If you do everything right:

  1. Developers are trained in secure development so that they can prevent – or at least find and fix – security problems
  2. The system is designed and built with a deliberate focus on quality and security
  3. You collect/measure defect data and use it to assess and improve your development practices
Then you should expect to find only 1 security vulnerability for every 100 (give or take) bugs. If you're not paying attention to security and quality, then the number of security vulnerabilities in the code will obviously be much higher. The more bugs that you find, the more security vulnerabilities you have somewhere in the code, still waiting to be found.
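
To make the arithmetic concrete, here is a trivial sketch using that 1%–5% ratio and a made-up defect count:

    public class VulnerabilityEstimate {

        // SEI rule of thumb: roughly 1% to 5% of defects are security vulnerabilities.
        static final double LOW_RATIO = 0.01;
        static final double HIGH_RATIO = 0.05;

        public static void main(String[] args) {
            int defectsFound = 1_000;   // hypothetical defect count for a mid-sized system
            System.out.printf("Expect roughly %.0f to %.0f security vulnerabilities%n",
                    defectsFound * LOW_RATIO, defectsFound * HIGH_RATIO);
            // Prints: Expect roughly 10 to 50 security vulnerabilities
        }
    }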

Heartbleed and Goto Fail = Bad Coding

The SEI looked at recent high profile security vulnerabilities including Heartbleed and the Apple “goto fail” SSL bug, both of which were caused by coding mistakes that could have and should have been caught in code reviews or thorough unit testing (read Martin Fowler’s exhaustive analysis here). No black hat security magic here. Just standard, accepted good development practices.
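
To make the class of mistake concrete, here is a hypothetical Java analogue of the “goto fail” bug – a duplicated line after an unbraced if – the kind of thing a careful reviewer, or a single unit test with a bad signature, catches immediately (the signature check is stubbed out):

    public class SignatureCheck {

        // Stub standing in for a real signature verification call.
        static boolean verifySignature(byte[] cert, byte[] signature) {
            return false;   // pretend the signature is bad
        }

        static boolean isTrusted(byte[] cert, byte[] signature) {
            boolean ok = false;
            if (verifySignature(cert, signature))
                ok = true;
                ok = true;   // BUG: the indentation lies - this line is outside the if and always runs
            return ok;       // certificates with bad signatures are now trusted
        }

        public static void main(String[] args) {
            // A one-line test with a bad signature exposes the bug: this prints "true".
            System.out.println(isTrusted(new byte[0], new byte[0]));
        }
    }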

This research also points out the limits of static analysis tools in ensuring safe and secure code. Bugs that could have been found by people working carefully could not be found by tools:
“Heartbleed created a significant challenge for current software assurance tools, and we are not aware of any such tools that were able to discover the Heartbleed vulnerability at the time of announcement”.
The only way to find the Heartbleed bug with today’s leading tools is to write custom rules or overrides, which means that you have to anticipate that this code is bad in the first place. You’d be better off spending your time reviewing or testing the code more carefully instead.

If you got bugs, you’ll get pwned

If you have a quality problem, then you have a security problem.

Security and reliability have to be designed and engineered in. You can’t test this in:

Medium- and large-scale systems typically contain many defects and these defects do not always cause problems when the software systems are used precisely as tested…

Even a small system might require an enormous number of tests to confirm correct operations under expected conditions. As systems grow, the number of possible conditions may be infinite. For any non-trivial system, the tested area is small. Test, by necessity, focuses on the conditions most likely to be encountered and most likely to trigger a fault in the system. Test, therefore, can only find a fraction of the defects in the system.

Functional testing proves that the system works as expected. This kind of testing, even at high levels of coverage, can’t prove that the system is reliable or secure. Pen testing, fuzzing, DAST and destructive testing stress the system in unexpected ways to see how the system behaves. But pen testing can’t prove that the system is secure either – for a big system, you would need an infinite number of pen testers on an infinite number of keyboards working for an infinite number of hours to maybe find all of the bugs.

Like any other kind of testing, pen testing gives you information about the quality and completeness of the system’s design and implementation – where you made mistakes, where you missed something. The results tell you where to look deeper for other problems in the design or code, or problems in how you design or how you code. Pen testing is wasted if you don’t use this information to get to the root cause and make things better.

The SEI’s research makes a few things clear:

  1. Security and reliability go hand in hand. Security-critical systems need to be built like safety-critical systems – with the same careful attention to quality.
  2. You can predict how secure your system is based on the total number of bugs that have been found in the code.
  3. Design reviews and code reviews (including desk checking your own code) are the most effective ways to find security and reliability problems. The amount of time spent in reviews is a key indicator of system reliability and security: top performers spent 2/3 as much time in reviews as in development. For security-critical or safety-critical code, you need to get experts involved in doing reviews.
  4. Static analysis testing should be part of everyone’s development program. But don’t lean too heavily on it. Run static analysis before code reviews to catch basic mistakes and clean them up, or to identify problem areas in the code that need to be reviewed carefully. Run static analysis after code reviews to verify that the code looks good. But don’t try to use static analysis as a substitute for code reviews.
  5. Focus on writing good, clean code. Most Level 1 (high severity) defects are caused by coding mistakes.
  6. Train developers in secure design and coding so they know what not to do, and what to look for when reviewing code, and so that they know how to fix security bugs properly.

Building reliable and secure systems isn't cheap and it isn't easy, especially at scale. The SEI says that you must assume that complex systems are never error-free. Which means that they will never be completely secure. Our job is to do the best that we can, and hope that it is enough.

Tuesday, January 20, 2015

ThoughtWorks Takes Security Sandwiches off the Menu

Most people in software development have heard about ThoughtWorks.

ThoughtWorks' Chief Scientist, Martin Fowler, is one of the original Agile thought leaders, and they continue to drive new ideas in Agile development and devops, including Continuous Delivery.

At least once a year the thought leaders of ThoughtWorks get together and publish a Technology Radar – a map of the techniques and tools and ideas that they are having success with and recommend to other developers, or that they are trying out in their projects and think other people should know more about, or that they have seen fail and want to warn other people about.

I always look forward to reading the Radar when it comes out. It’s a good way to learn about cool tools and new ideas, especially in devops, web and mobile development, Cloudy stuff and IoT, and other things that developers should know about.

But until recently, security has been conspicuously absent from the Radar – which means that security wasn't something that ThoughtWorks developers thought was important or interesting enough to share. Over the last year this has changed, and ThoughtWorks has started to include application security and data privacy concerns in design, development and delivery, including privacy vs. big data, forward secrecy, two-factor authentication, OpenID Connect, and the OWASP Top 10.

The first Radar of 2015 recommends that organizations avoid the “Security Sandwich” approach to implementing appsec in development projects, and instead look for ways to build security into Agile development:

Traditional approaches to security have relied on up-front specification followed by validation at the end. This “Security Sandwich” approach is hard to integrate into Agile teams, since much of the design happens throughout the process, and it does not leverage the automation opportunities provided by continuous delivery. Organizations should look at how they can inject security practices throughout the agile development cycle.

This includes: evaluating the right level of Threat Modeling to do up-front; when to classify security concerns as their own stories, acceptance criteria, or cross-cutting non-functional requirements; including automatic static and dynamic security testing into your build pipeline; and how to include deeper testing, such as penetration testing, into releases in a continuous delivery model. In much the same way that DevOps has recast how historically adversarial groups can work together, the same is happening for security and development professionals.

The sandwich – policies upfront, and pen testing at the end to “catch all the security bugs” – doesn't work, especially for Agile teams and teams working in devops environments. Teams who use lightweight, iterative, incremental development practices and release working software often need tools and practices to match. Instead of scan-at-the-end-then-try-to-fix, we need simple, efficient checks and guides that can be embedded into Agile development, and faster, more efficient tools that provide immediate feedback in Continuous Integration and Continuous Delivery. And we need development and security working together more closely and more often.
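
As one concrete example of the kind of fast, automated check that can run on every build, here is a minimal JUnit sketch – assuming JUnit 4 and the OWASP Java HTML Sanitizer are on the classpath; the policy and the test are an illustration, not something prescribed by the Radar:

    import static org.junit.Assert.assertFalse;
    import static org.junit.Assert.assertTrue;

    import org.junit.Test;
    import org.owasp.html.PolicyFactory;
    import org.owasp.html.Sanitizers;

    public class CommentSanitizerTest {

        // A simple whitelist policy: basic formatting and links only.
        private final PolicyFactory policy = Sanitizers.FORMATTING.and(Sanitizers.LINKS);

        @Test
        public void scriptTagsAreStripped() {
            String attack = "<b>hi</b><script>alert('xss')</script>";
            String safe = policy.sanitize(attack);
            assertFalse(safe.contains("<script"));   // injected script must not survive
            assertTrue(safe.contains("hi"));         // legitimate content is preserved
        }
    }

A test like this runs in milliseconds, fails the build when someone weakens the sanitization policy, and gives developers feedback long before a pen test would.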

It’s good to see pragmatic application security on the ThoughtWorks Radar. I hope it’s on your radar too.

Tuesday, January 13, 2015

We can’t measure Programmer Productivity… or can we?

If you go to Google and search for "measuring software developer productivity" you will find a whole lot of nothing. Seriously -- nothing.
Nick Hodges, Measuring Developer Productivity

By now we should all know that we don’t know how to measure programmer productivity.

There is no clear-cut way to measure which programmers are doing a better or faster job, or to compare productivity across teams. We “know” who the stars on a team are, who we can depend on to deliver, and who is struggling. And we know if a team is kicking ass – or dragging their asses. But how do we prove it? How can we quantify it?

All sorts of stupid and evil things can happen when you try to measure programmer productivity.

But let’s do it anyways.

We’re writing more code, so we must be more productive

Developers are paid to write code. So why not measure how much code they write – how many lines of code get delivered?

Because we've known since the 1980s that this is a lousy way to measure productivity.

Lines of code can’t be compared across languages (of course), or even between programmers using the same language working in different frameworks or following different styles. Which is why Function Points were invented – an attempt to standardize and compare the size of work in different environments. Sounds good, but Function Points haven’t made it into the mainstream, and probably never will – very few people know how Function Points work, how to calculate them and how they should be used.

The more fundamental problem is that measuring productivity by lines typed (or Function Points or other size derivatives) doesn't make any sense. A lot of important work in software development – the most important work – involves thinking and learning, not typing.

The best programmers spend a lot of time understanding and solving hard problems, or helping other people understand and solve hard problems, instead of typing. They find ways to simplify code and eliminate duplication. And a lot of the code that they do write won’t count anyways, as they iterate through experiments and build prototypes and throw all of it away in order to get to an optimal solution.

The flaws in these measures are obvious if we consider the ideal outcomes: the fewest lines of code possible in order to solve a problem, and the creation of simplified, common processes and customer interactions that reduce complexity in IT systems. Our most productive people are those that find ingenious ways to avoid writing any code at all.
Jez Humble, The Lean Enterprise

This is clearly one of those cases where size doesn’t matter.

We’re making (or saving) more money, so we must be working better

We could try to measure productivity at a high level using profitability or financial return on what each team is delivering, or some other business measure such as how many customers are using the system – if developers are making more money for the business (or saving more money), they must be doing something right.

Using financial measures seems like a good idea at the executive level, especially now that “every company is a software company”. These are organizational measures that developers should share in. But they are not effective – or fair – measures of developer productivity. Too many business factors are outside of the development team's control. Some products or services succeed even if the people delivering them are doing a lousy job, or fail even if the team did a great job. Focusing on cost savings in particular leads many managers to cut people and try “to do more with less” instead of investing in real productivity improvements.

And as Martin Fowler points out, there is a time lag, especially in large organizations – it can sometimes take months or years to see real financial results from an IT project, or from productivity improvements.

We need to look somewhere else to find meaningful productivity metrics.

We’re going faster, so we must be getting more productive

Measuring speed of development – velocity in Agile – looks like another way to measure productivity at the team level. After all, the point of software development is to deliver working software. The faster that a team delivers, the better.

But velocity (how much work, measured in story points or feature points or ideal days, that the team delivers in a period of time) is really a measure of predictability, not productivity. Velocity is intended to be used by a team to measure how much work they can take on, to calibrate their estimates and plan their work forward.

Once a team’s velocity has stabilized, you can measure changes in velocity within the team as a relative measure of productivity. If the team’s velocity is decelerating, it could be an indicator of problems in the team or the project or the system. Or you can use velocity to measure the impact of process improvements, to see if training or new tools or new practices actually make the team’s work measurably faster.

But you will have to account for changes in the team, as people join or leave. And you will have to remember that velocity is a measure that only makes sense within a team – that you can’t compare velocity between teams.

Although this doesn't stop people from trying. Some shops use the idea of a well-known reference story that all teams in a program understand and use to base their story point estimates on. As long as teams aren't given much freedom in how they come up with estimates, and as long as the teams are working in the same project or program with the same constraints and assumptions, you might be able to do a rough comparison of velocity between teams. But Mike Cohn warns that

If teams feel the slightest indication that velocities will be compared between teams there will be gradual but consistent “point inflation.”

ThoughtWorks explains that velocity ≠ productivity in their latest Technology Radar:

We continue to see teams and organizations equating velocity with productivity. When properly used, velocity allows the incorporation of “yesterday's weather” into a team’s internal iteration planning process. The key here is that velocity is an internal measure for a team, it is just a capacity estimate for that given team at that given time. Organizations and managers who equate internal velocity with external productivity start to set targets for velocity, forgetting that what actually matters is working software in production. Treating velocity as productivity leads to unproductive team behaviors that optimize this metric at the expense of actual working software.

Just stay busy

One manager I know says that instead of trying to measure productivity

“We just stay busy. If we’re busy working away like maniacs, we can look out for problems and bottlenecks and fix them and keep going”.

In this case you would measure – and optimize for – cycle time, like in Lean manufacturing.

Cycle time – turnaround time or change lead time, from when the business asks for something to when they get it in their hands and see it working – is something that the business cares about, and something that everyone can see and measure. And once you start looking closely, waste and delays will show up as you measure waiting/idle time, value-add vs. non-value-add work, and process cycle efficiency (total value-add time / total cycle time).
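
For example, a back-of-the-envelope process cycle efficiency calculation (all of the numbers here are invented):

    public class CycleTimeMetrics {

        public static void main(String[] args) {
            // Hypothetical numbers for one change request, in working days.
            double valueAddDays = 3.0;    // time actually spent analyzing, coding, testing, deploying
            double totalCycleDays = 15.0; // elapsed time from request to running in production

            double processCycleEfficiency = valueAddDays / totalCycleDays;
            System.out.printf("Process cycle efficiency: %.0f%%%n", processCycleEfficiency * 100);
            // Prints: Process cycle efficiency: 20%
            // The other 12 days are waiting, hand-offs and rework - the waste to drive down to zero.
        }
    }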

“It’s not important to define productivity, or to measure it. It’s much more important to identify non-productive activities and drive them down to zero.”
Erik Simmons, Intel

Teams can use Kanban to monitor – and limit – work in progress and identify delays and bottlenecks. And Value Stream Mapping to understand the steps, queues, delays and information flows which need to be optimized. To be effective, you have to look at the end-to-end process from when requests are first made to when they are delivered and running, and optimize all along the path, not just the work in development. This may mean changing how the business prioritizes, how decisions are made and who makes the decisions.

In almost every case we have seen, making one process block more efficient will have a minimal effect on the overall value stream. Since rework and wait times are some of the biggest contributors to overall delivery time, adopting “agile” processes within a single function (such as development) generally has little impact on the overall value stream, and hence on customer outcomes.
Jez Humble, The Lean Enterprise

The downside of equating delivery speed with productivity? Optimizing for cycle time / speed of delivery by itself could lead to problems over the long term, because it incents people to think short term, to cut corners, and to take on technical debt.

We’re writing better software, so we must be more productive

“The paradox is that when managers focus on productivity, long-term improvements are rarely made. On the other hand, when managers focus on quality, productivity improves continuously.”
John Seddon, quoted in The Lean Enterprise

We know that fixing bugs later costs more. Whether it's 10x or 100+x, it doesn't really matter. And we know that projects with fewer bugs are delivered faster – at least up to a point of diminishing returns for safety-critical and life-critical systems.

And we know that the costs of bugs and mistakes in software to the business can be significant. Not just development rework costs and maintenance and support costs. But direct costs to the business. Downtime. Security breaches. Lost IP. Lost customers. Fines. Lawsuits. Business failure.

It's easy to measure whether you are writing good – or bad – software. Defect density. Defect escape rates (especially defects – including security vulnerabilities – that escape to production). Static analysis metrics on the code base, using tools like SonarQube.
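
Two of these are simple ratios. For example (the counts and code size below are made up):

    public class QualityMetrics {

        public static void main(String[] args) {
            // Hypothetical counts for one release.
            int defectsFoundBeforeRelease = 120;
            int defectsFoundInProduction = 9;       // the ones that escaped
            double codeSizeInKloc = 80.0;           // thousands of lines changed or added

            double defectDensity = defectsFoundBeforeRelease / codeSizeInKloc;
            double escapeRate = (double) defectsFoundInProduction
                    / (defectsFoundInProduction + defectsFoundBeforeRelease);

            System.out.printf("Defect density: %.1f defects/KLOC%n", defectDensity);
            System.out.printf("Defect escape rate: %.1f%%%n", escapeRate * 100);
        }
    }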

And we know how to write good software - or we should know by now. But is software quality enough to define productivity?

Devops – Measuring and Improving IT Performance

Devops teams who build/maintain and operate/support systems extend productivity from dev into ops. They measure productivity across two dimensions that we have already looked at: speed of delivery, and quality.

But devops isn't limited to just building and delivering code – instead it looks at performance metrics for end-to-end IT service delivery:

  1. Delivery Throughput: deployment frequency and lead time, maximizing the flow of work into production
  2. Service Quality: change failure rate and MTTR (a small calculation sketch follows this list)
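
Here is a minimal sketch of how these four numbers might be pulled together from deployment and incident records – all of the data below is invented:

    import java.time.Duration;
    import java.util.Arrays;
    import java.util.List;

    public class DeliveryMetrics {

        public static void main(String[] args) {
            // Hypothetical data for one service over a four-week period.
            int deployments = 24;
            int failedDeployments = 2;                  // deployments that caused an incident
            List<Duration> leadTimes = Arrays.asList(   // commit-to-production time, per change
                    Duration.ofHours(20), Duration.ofHours(36), Duration.ofHours(8));
            List<Duration> restoreTimes = Arrays.asList(// incident start to service restored
                    Duration.ofMinutes(45), Duration.ofMinutes(90));

            double deploysPerWeek = deployments / 4.0;
            double changeFailureRate = (double) failedDeployments / deployments;
            double avgLeadTimeHours = leadTimes.stream()
                    .mapToLong(Duration::toMinutes).average().orElse(0) / 60.0;
            double mttrMinutes = restoreTimes.stream()
                    .mapToLong(Duration::toMinutes).average().orElse(0);

            System.out.printf("Deployment frequency: %.1f per week%n", deploysPerWeek);
            System.out.printf("Change failure rate:  %.0f%%%n", changeFailureRate * 100);
            System.out.printf("Average lead time:    %.1f hours%n", avgLeadTimeHours);
            System.out.printf("MTTR:                 %.0f minutes%n", mttrMinutes);
        }
    }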

It’s not a matter of just delivering software faster or better. It’s dev and ops working together to deliver services better and faster, striking a balance between moving too fast or trying to do too much at a time, and excessive bureaucracy and over-caution resulting in waste and delays. Dev and ops need to share responsibility and accountability for the outcome, and for measuring and improving productivity and quality.

As I pointed out in an earlier post, this makes operational metrics more important than developer metrics. According to recent studies, success in achieving these goals leads to improvements in business performance: not just productivity, but market share and profitability.

Measure Outcomes, not Output

In The Lean Enterprise (which you can tell I just finished reading), Jez Humble talks about the importance of measuring productivity by outcome – measuring things that matter to the organization – not output.

“It doesn't matter how many stories we complete if we don’t achieve the business outcomes we set out to achieve in the form of program-level target conditions”.

Stop trying to measure individual developer productivity.

It’s a waste of time.

Everyone knows who the top performers are. Point them in the right direction, and keep them happy.

Everyone knows the people who are struggling. Get them the help that they need to succeed.

Everyone knows who doesn't fit in. Move them out.

Measuring and improving productivity at the team or (better) organization level will give you much more meaningful returns.

When it comes to productivity:

  1. Measure things that matter – things that will make a difference to the team or to the organization. Measures that are clear, important, and that aren't easy to game.
  2. Use metrics for good, not for evil – to drive learning and improvement, not to compare output between teams or to rank people.

I can see why measuring productivity is so seductive. If we could do it we could assess software much more easily and objectively than we can now. But false measures only make things worse.
Martin Fowler, CannotMeasureProductivity
