
Collaborative Test Strategy

I’ve always disliked writing test strategies and plans. Reviewing them was even worse: tedious, long documents that told me very little, usually near copy-pastes, as projects tend to be pretty similar. I did play with the one-pager, but it still felt like a pointless exercise. We had a ways of working that incorporated testing.

In fact, inspired by Robbie Falck, I did our test strategy as a ways of working. That was well received by the teams, but there was a push from the business to have documented test strategies per epic.

Not the WoW I used, but we did map the various stages of a story and the activities performed.

I ended up taking inspiration from the one-pager: I organised a meeting with the team and we filled it in. I then carried on and mixed up the format. Eventually I started seeing the value. It wasn’t the document; that was still as pointless as ever. The value was in the conversations we had, the risks identified and the outcomes of the discussions about what we’d need to do.

I liked doing this in phases. In our first session we typically started from a diagram of the system. What are we changing? What is impacted? What is the technology? (Although later in my time I started by asking… what is the problem we’re solving?) I’d also try to get a feel for what we knew and didn’t know. We’d ask about API changes: do we need to do threat modelling? Finally, if there were barriers to testing the feature (kit, environment, etc.), we’d highlight those early.

Deliberately blurry – but there’s a model of the system, discussion points and then some notes on key questions we want to ask.

I could then catch up with the team, or a couple of folk, again and ask: what new things have we learnt? What possible risks are new, and how have we progressed on the potential risks from our first chat? This is again run in a collaborative way. By this point we should know even more about the architecture, so I can tap into performance and load testing as well.

Whilst I evolved my templates for facilitating, I also explored different methods, depending on the feature and our knowledge. I loved a diagram, but sometimes it was a series of prompts for asking questions, or a mind map of SFDIPOT (Structure, Function, Data, Interfaces, Platform, Operations, Time); it varied. This was to get us asking slightly different questions and to keep things fresh. The point is the discussion, not filling in a form, which is what leads to copy-paste strategies.

A mix of approaches

In terms of planning *how* to test everything, we focus on that per story. If we identify dedicated testing activities, they become their own stories. We shouldn’t need a document saying that we’ll do performance tests and unit tests; they are part of the definition of done or acceptance criteria.

So I’m happy to do away with test strategy documents; they are still worthless in my view. What works is facilitating discussions involving the various team members to identify the risks and challenges we’ll face, then documenting the testing needs through the usual tickets.

At the end of the day, if we’re trying to shift left, why have distinct documents about testing? Instead, let’s talk about the testing and intertwine that with what is required to close a story.


Meaningful RCAs: Documenting the results

So far I’ve written a few blog posts around conducting RCAs, focusing on the people and the questions. However, what I’ve yet to touch upon is the documentation side.

Just as the activity of coming up with a test plan is more important than the document itself, I have similar thoughts about the RCA. With this in mind, the most detailed document I’d have is the collaboration board that I used to facilitate the discussion. It captures our thoughts, discussion and key points.

A screenshot of a board created in Mural containing several sections loosely related to the SDLC and a number of different sticky notes
Example of an RCA, although this is all lorem ipsum text as I obviously can’t share a real one!

After the session, I will then (as soon as possible) write up the overview. This captures the key findings from the RCA, explaining the nature of the problem, what we’ve learnt, any actions and so forth. It is shared with the team(s) on Slack for a first look before I share it more widely.

I did like keeping a spreadsheet of my RCA findings. It would include the summary, a link to the board and tickets, and an overly simplified “category” (missed requirement, domain knowledge, coding error, etc.).

This category is useful for metrics that help us understand patterns. That came in handy when I was pushing to drive new initiatives, because I could say “if we’d been using examples in refinement, we wouldn’t have had these massively complicated bugs”. If I’d had more time with my former employer, I’d have loved to explore a means of saving RCA summaries where I could tag the RCAs with different attributes to help demonstrate patterns.
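As a rough sketch of what that pattern-spotting could look like in practice, here’s a minimal example that tallies categories from a CSV export of such a spreadsheet. The file name and column names here are hypothetical, purely for illustration:

```python
import csv
from collections import Counter

def category_counts(path: str) -> Counter:
    """Tally RCA categories from a CSV export of the findings spreadsheet.

    Assumes a hypothetical "category" column (e.g. "missed requirement",
    "domain knowledge", "coding error").
    """
    with open(path, newline="", encoding="utf-8") as f:
        return Counter(row["category"] for row in csv.DictReader(f))

if __name__ == "__main__":
    # "rca_findings.csv" is an illustrative file name, not a real export.
    for category, count in category_counts("rca_findings.csv").most_common():
        print(f"{category}: {count}")
```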

I had also dabbled with feeding this data to an AI agent (one where we’d got legal protections ensuring it wouldn’t feed back into the main models). This was quite neat… but a topic for another day…

One final note: I am aware that most people would still prefer a more formal and structured documentation approach than mine. I get that. Some of the things recorded could, I guess, be useful. However, I’ve yet to experience a time when a two-page document was useful. I have found these RCA discussions really useful, and consequently my documentation approach is similar to my retro approach: it is collaborating on and capturing a conversation.

If you’d like to read more on RCAs, check out the collection of my posts on the Meaningful RCAs page!


Meaningful RCAs: Structuring questions

I’ve already talked about how we need to tap into unleashing our inner toddler by asking “why”. But what questions do we ask?

Background

Before getting into the guts of the RCA, I like to go through the background. This partly acts as a refresher for everyone, as it may have been a few weeks since the work, but it also helps guide me in my questioning.

This usually means sharing:

  • Links to the defect we’re RCAing & the original ticket
  • Links to the PRs that fixed the issue and, where possible, the original (“offending”) PR.

Then asking:

  • Can you describe the problematic behaviour? (i.e. what was actually wrong from a user’s point of view)
  • Can you describe the nature of the code fix?
  • What do you remember from working on the story?
    • How long did it take?
    • How many people were involved?

The Fix

Before learning more about why the issue came to be, let’s make sure that we’re confident in the fix. I like to ask two questions here:

  1. How resilient is the fix?
  2. Will we know if the behaviour regresses again? (i.e. did you add automated tests)

Quality Engineering Throughout The SDLC

Now we get into the really important questions. This is where we go through the software development life cycle, thinking about what we did and whether there were realistic opportunities to catch the issue at each stage.

First of all, if this was an escape, let’s ask whether we could have caught it in production (e.g. through monitoring), in release testing or in epic close-off testing. I wouldn’t advocate just asking “could we have caught it here?” but rather asking what the process is, what testing was performed, and whether this is something in the scope of what we’d usually test.

We then move on to the story within the sprint, starting with the testing of the original story/bug. We’re trying to understand whether this was a brain fart (it happens) or something we wouldn’t usually consider testing. If the latter, why not?

Then we get more technical. We’re looking at the PR, starting with the code review. I’ll ask about the nature of the bug and whether it’s something we’d look for. I’d want to understand whether SMEs were involved and, if not, why not. Did reviewers check the testing notes and automated tests in the code review? Code reviews aren’t ever going to catch everything, but it is good to discuss the process. It’s a nice chance for people to talk about the value and role of a code review too.

I then concentrate on the developer’s testing. What had they covered through automated and hands-on tests? How much was iterative? As a former dev, I know all too well how even a well-intentioned developer who tests their work can let things slip through here (see dev BLISS).

We’re then back to technical discussions on the code. This is where I hope the architect can ask a few questions, although other team members regularly chip in. This discussion is a great way for the team to learn from each other.

You might think that, now we’ve talked about the types of testing and the development challenges, we’d stop there, but no, we don’t!

The teams will have had planning and refinement when breaking down the story. We do test strategy and planning at epic, and sometimes user story, level. We think about the complexity of the code with architectural studies before starting an epic. Let’s continue diving into these.

Again, we’re asking about what was done, whether this is a scenario that could have been caught, either behaviour-wise or in the code, and tapping into what more we could have done. This helps us shift left.

A Parting Question

Near the start I asked about our confidence in catching this issue again. Unless we’re running out of time (which, unfortunately, is often), I like to ask a similar but slightly wider question: how confident are we that we won’t see a repeat of the issue? Not necessarily the same issue, but a similar one.

Summary Section

Finally, I’ll have a summary section with actions, learnings and a summary of the RCA. It’s often written up afterwards because, unsurprisingly, the hour I book for RCAs isn’t always enough to cover everything in this post! I’ll explain a little more on this in a separate post.

So in short…

We start off by discussing the background of the story to refresh ourselves and get an idea of which threads are best to pull on as we go into things. We’ll also check we’re confident in the fix.

We then take our time going through the SDLC. We’re not just asking “could we have caught it?” or “why didn’t we catch it?” but looking at the actions, steps and processes to understand the answer.

I switched the ordering from starting with the first stages of the story to starting in prod, after advice from a great chap called Stu Ashman. I found this got us much more engagement in the testing and activities around post-release. You’ll also see how, through the different stages, we ask slightly different questions to consider more than “why didn’t we catch it?”.

We’re using every stage as a learning opportunity.

… and that makes for a meaningful RCA!


Meaningful RCAs: Involving the right people

I love collaboration and making exercises something that people can engage with. It is usually the discussion that matters more than what gets written on paper. For this to be successful, you need the right people in the (virtual) room.

As we’ve already covered, the RCA should touch upon all areas of the lifecycle of the defect’s source. Consequently, I’d invite:

  • At least one person involved in refinement
  • The developer for the original story/defect
  • The code reviewer for the original story/defect
  • The tester for the original story/defect
  • The developer who fixed the defect that we’re doing the RCA for
  • An architect, even if they had no prior involvement (arguably better). Failing that, a team lead.
  • Optionally any other team members.

I would have liked to invite a PO to some but I never got quite that bold.

There are two things to highlight here.

The first is that we’re focusing on who was involved when the defect was introduced. We have insight from the person who understands the fix, but it is the processes, decisions and challenges around the original issue that we want to understand.

The second is that, between the architect and myself, we have a cracking blend of insight: someone who can analyse the code, design and technical side and ask meaningful questions, while I can look at testing, process and ways of working.

For this to be successful, you need all participants bought into the idea of it being a safe place with no blame placed. I’ve written about this previously.