Nice deck from Dan McKinley of Stripe: Choose Boring Technology

As you grow as a developer (and development leader) and work with more and more technologies across different projects, you start to realize how easy it is for a team to focus more on challenging technical problems than on the actual product issues. Ignoring the product issues will kill the product (and possibly the company). Attention is limited (he calls the budget innovation tokens), so it is best to put your effort into innovations that can differentiate your product. All too often, teams fixate on the next cool technology and, as the old saying goes, turn everything into a nail.

Dan McKinley does a great job explaining this in his talk below.

Video of my talk “Apportioning Monoliths”

This was my talk at the Daho.am conference. Listening back to it now, I am struck by how often I said “many, many.” And I cursed! I usually try not to do that. So, it’s a bit of a looser take on this presentation. Luckily the audience had beer (this was in Bavaria, after all), so all were fine with it. I had flown in from Stockholm that morning, so I might have been a bit more tired than I thought…

I was really impressed by the lineup of speakers and the content of the presentations. A really good day. The Stylight engineering and event teams did a great job.

The Spotify Tribe: My talk from Spark The Change last week

The organizers of Spark the Change in London asked me to speak about the Spotify matrix model. I was only too happy to comply. It was a great conference, and I met a ton of good people. As usual, I tend to talk to my slides, as opposed to putting a ton of text on them. Hopefully, you can still get something useful from it.

The Spotify model: how to create, dissolve, and remix teams to be more dynamic and more innovative

One of the most challenging parts of managing a traditional, hierarchical organization is being responsive to new opportunities, especially those that require leveraging skill sets outside your own team. At Spotify, our organizational model allows us to create, dissolve, and remix teams with minimal disruption to individuals or managers. This gives us a tremendous ability to address both temporary and long-term opportunities.

This post was originally written by request and posted on popforms.com. Special thanks to Kate Stull for requesting the article, and helping me with editing.

 


How it used to be

As a manager at Microsoft and Adobe, I was always challenged when a problem or opportunity required repurposing a team or adding scope to an existing team.

This kind of thing comes up all the time: a business development opportunity, or integration with another product. Often, this would require small efforts from multiple specialized teams.

It would cause disruption as those teams had to change their current plans and had to coordinate around a new challenge while still making progress towards their existing goals. Given that people and resources were managed within the team, and managers were still responsible for delivery of their existing commitments, often it would be hard to motivate them towards supporting this new effort.

Creating a new “tiger” team is often the solution in these situations, but that isn’t always an adequate solution for long-term or permanent projects since it essentially punishes the managers of the existing teams and requires finding a new temporary manager for the new team.

Another problem in existing organizations is figuring out what to do with a team whose project has been cancelled.

If the team is high-performing, you may try to turn it onto a new problem, which may or may not be a good fit for its skills and experience. You may instead dissolve the team, assigning the members to new teams based on the needs of those teams rather than the preferences of the individuals. Or you may leave it up to the individuals to find new roles in the company, or face layoffs if they are unsuccessful.

These solutions end up punishing both the individuals on the teams and their managers, often for reasons beyond their control. In an organization seeking to innovate (which requires some amount of failure), this sends a message that runs counter to a culture of experimenting and taking chances.

How we remix teams at Spotify

At Spotify, we wanted to create an organization that allowed us to be dynamic around our staffing, and adaptable in our teams.

We embrace failure as being important to learning and innovation, so we didn’t want dissolving a team to be a punishment. We put this new organizational model into effect over two years ago and have been working with it since. In that time, the technical organization has grown from 250 to over 600 people. We went from having three engineering offices to five, and from having 30 teams to over 70.

We focused on building full-stack, autonomous teams, each built around a single, clear mission. The expectation is that once a team’s mission has been fulfilled, the team will dissolve.

To this end, new teams are constantly being created and old teams dissolving, with their members building new teams or moving into existing teams if they need additional staffing. Rather than create a formal manager role for these teams, we decided instead to make the teams collectively responsible for fulfilling their mission.

With this model, changing teams does not mean changing your manager, and dissolving a team doesn’t leave a manager looking for a new role.

We do have a strong belief in the role of the manager as mentor to their reports, so we have a strong managerial culture; it is just manifested in a matrix rather than a hierarchical model.

Why Chapter Leads work better than traditional managers

Our technical managers are called Chapter Leads. They are usually responsible for managing a narrow range of developer disciplines within their larger organizations, for example: mobile developers, or backend developers. A Chapter Lead usually has direct reports in multiple teams in the organization.

For an individual, it is common to change teams, but it is less common to change managers. As each team is responsible for their full stack and all platforms, a team may include members from several chapters.

An example is the search team in my organization. Its members come from five different chapters: the backend chapter, the mobile chapter, the keyboard and mouse (desktop and web) chapter, the agile coach chapter, and the test chapter. Additionally, there is a product owner and a UX designer, both of whom are part of the product organization (which is organized more traditionally).

The Chapter Leads are not responsible for deliverables directly. Instead, the Chapter Leads are responsible for staffing the teams appropriately; for working with the individuals in the team to help them grow; and for working with the Product Owner and the Agile Coach to make sure that the team is performing well together.

Since the Chapter Lead has visibility into multiple teams, they can often identify short or long-term skill set needs and are empowered to resolve them.

Sometimes, this means switching two developers in two teams temporarily for a skill set need. Sometimes this means moving a developer into a different team to address a short term staffing need. This also means that if there is a new mission to be addressed, the chapter leads can work together to staff a new team to address that mission out of the existing teams in the organization.

A benefit of this model for an individual is that there are many opportunities to work on new projects or develop new skill sets, since new projects spin up at a regular clip.

When and how we remix and dissolve teams

This remixing is not constant throughout the technology team. We do have several very long-lived teams that are focused on features in the product, but even those teams will shift people between each other based on short or long-term needs. In some parts of the organization, specifically infrastructure, teams tend to be focused on short-term projects, so new teams are created more often. Those teams dissolve when they have completed their project.

We will also dissolve teams if we believe that their mission is no longer necessary. Usually this is the result of the team invalidating their mission themselves. We celebrate these conclusions just as much as the successful completion of the project, since we value the lessons from a “failed” project. Celebrating your failures as valuable lessons encourages risk taking, experimentation and innovation.

By striving towards a model that gives the individual consistency (their manager, and their Chapter) while still giving the organization fluidity and adaptability, we’ve found a happy balance that lets us extend our agile-first values beyond the work that a team performs to the organization as a whole. This has allowed us to focus on innovation and leverage opportunities that slower-moving organizations would have difficulty addressing.

Several companies have attempted to adapt our model but there is something critical to understand. Our organization model itself is fluid and continues to change and evolve to support the needs of the organization. The specifics of our implementation are less important than the underlying values and ideals that created it.

If you want the benefits of a dynamic organization, you will need to build something that is suited to the values of your own organization. I would argue that a central requirement is endowing teams with autonomy and decision-making authority. If you cannot support this, then you should look instead to adapt your existing model to remove impediments and bottlenecks.

Thoughts on emulating Spotify’s matrix organization in other companies

I was in San Francisco in December for a conference. While I was there, I ended up connecting with a couple of different companies that have been inspired by Henrik Kniberg’s whitepaper on Scaling Agile at Spotify, and that have been trying to implement some of those ideas themselves.

I think Henrik’s paper does an excellent job of describing the what and how, but it seems that the “why” and some of the critical ideas can get lost when others read it.

If you haven’t read Henrik’s whitepaper, I’d suggest that you read it before reading the rest of this blog post. I will do a quick recap here though.

Spotify’s engineering and product organization (now over 600 people) is split into several large groups called Tribes. Each Tribe is responsible for a set of related features or engineering functions. For example, our largest tribe is the Infrastructure and Operations Tribe, whose name is pretty self-explanatory. I am the Tribe Lead of the Music Player Tribe. We handle importing audio from our label and distribution partners, storing and streaming the music, search, collection and playlists, artist pages, music metadata and the music knowledge graph that supports things like the above, but also ads, discover, radio and the like.

While the whole company works on the same product, Spotify, each tribe is set up so that it can work as independently as possible. As you will see below, a critical aspect of our organizational model is to give autonomy at every level. This helps remove decision-making bottlenecks and unnecessary dependencies, which improves velocity.

Each tribe is composed of squads. A squad is a team that is responsible for a single feature or component. For example, there is a squad that is responsible for search, a squad responsible for the AB test infrastructure, etc. As each tribe is set up to be as autonomous as possible, each squad is also set up to be autonomous. In the context of a feature development team, this means that each team is a full-stack team. A full-stack team is responsible for both backend implementation as well as the user interface implementation, on all platforms.

A typical feature squad would have web service engineers, iOS, Android, web and desktop engineers as well as testers, an agile coach, a product owner and UX designer. With this staffing, the squad has everything they need to implement anything related to their feature. They don’t have to wait on another team to implement the pieces they need. They also have autonomy and local decision-making ability, so there are few impediments on their speed of execution.

Up to this point, with only Tribes and Squads described, Spotify may seem like a traditional, hierarchical engineering organization, but this is where the similarity ends. Unlike a traditional organization, a squad does not have a single engineering leader whom everyone on the team reports to. In fact, there is no single leader for the squad. The Product Owner and UX Designer work with the engineers and testers collectively to make decisions about their features.

Spotify is not a “no manager” culture though. We feel strongly that managers have a role in supporting the people who work for them. Managers have an important role to play as technical and career mentors and organizational communication conduits. Rather than have management hierarchies follow organizational ones (creating a de facto command-and-control structure), we instead have first level managers responsible for technical functional areas across multiple squads.

We call these reporting and functional groupings “chapters.” Again, as an example, reporting to me, the tribe lead, are Chapter Leads. In my tribe, there are currently three backend (services) development chapters, two front-end development chapters (including all the UI developers), a core library chapter, and a test chapter. These seven Chapters span eight different squads. Almost every chapter lead has reports in two squads, and a few of them have reports in three squads. Almost all chapter leads work within a squad in some capacity as well, either as developer or technical lead, and not necessarily within a squad that has members of their chapter.

This chapters/squads matrix organization is critical to our organizational agility. It allows the squads and the tribe to be more fluid. We can spin up a new squad to take advantage of an opportunity or handle an issue without worrying about changing reporting structures. If a squad completes its goals and has no reason to exist anymore, we can dissolve it without punishing a manager. This is a very important difference from a traditional hierarchy, because it gives us a lot of flexibility and helps us avoid the old political issues around empire building and resource contention.

In addition to our Tribes, Squads and Chapters, we also have virtual organizations called Guilds. Guilds are cross-tribe organizations centered on different technical or interest areas and their membership is voluntary. The guilds serve as ways to promote cross-tribe collaboration and communication, especially around things like best practices. For example, we have guilds for Web Development, Agile Practices, Leadership, Test Automation, etc. The guilds foster developer-to-developer communication, which is one of the ways that we keep all these autonomous teams from all going off in completely different directions.

From Henrik’s paper, this diagram illustrates the organizational structure I discuss above:

[Image: organizational structure diagram from Henrik’s paper]

I’d like to give some more background on why we have implemented this organizational model at Spotify, elaborate on our goals for implementing it, and discuss the aspects of our culture that have been critical to its success. It is really great that other companies have been inspired by the way we work, but I think if you implement only parts of the model, or try to impose it on a very different corporate culture, you will have a difficult time achieving the same level of success with it that we have had.

If you are considering using the Spotify organizational model within your company, there are a few things that will be critical to your success:

Our organization model assumes that engineering is doing development with agile methodologies. Our goals for autonomy mean that we do not prescribe any particular development framework our squads must subscribe to. However, all of our squads use agile development methodologies. While we do our best to minimize dependencies between squads and tribes, there will always be some since we are all working on the same product. Any individual squad choosing to build using a traditional waterfall or other non-agile process would not be able to keep up with the rapidly changing teams around them. If they tried to impose some sort of process on other teams so that they could follow a longer-term development plan, they would start slowing down the rest of the organization.

A critical requirement in making our organization model work well is that the entire company works with and understands agile practices and processes. While our legal team isn’t doing scrum or kanban, they are used to working with engineering teams that use agile processes. Having the entire corporation understand and agree with agile means that no line of business area becomes an impediment to the speed of implementation. Think of this in terms of Amdahl’s law, applied to a development organization. If your development teams are working quickly in parallel, but marketing or legal is not supportive of an agile approach, they will become a bottleneck that will slow down the overall speed of the company.
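
To make the analogy concrete, here is Amdahl’s law in its usual form, where p is the fraction of the work that gets faster and s is how much faster it gets. The numbers are purely illustrative assumptions, not Spotify measurements: suppose development is 80% of the path to shipping and becomes four times faster, while the remaining 20% (legal, marketing and so on) does not change.

```latex
S = \frac{1}{(1 - p) + \frac{p}{s}}
\qquad\text{e.g. } p = 0.8,\ s = 4:\quad
S = \frac{1}{0.2 + 0.2} = 2.5,
\qquad
\lim_{s \to \infty} S = \frac{1}{1 - p} = 5
```

Under those assumptions the company as a whole gets only 2.5 times faster, and it can never get more than five times faster however quick the development teams become, because the unchanged 20% dominates.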

Similarly, implementing this with just one team as a test in a larger engineering organization will be prone to issues. A traditional engineering organization is not usually set up for autonomy. Adding a single autonomous team within that web of dependencies is likely to hamper and frustrate the team and skew the results of the experiment.

While I’ve mentioned autonomy in several places already, I cannot overstate its criticality. Each squad must be empowered to make their own decisions, not only on features, but also on development model, infrastructure, and implementation. Every decision that has to be approved outside the team means a delay that slows development. Each dictated implementation or infrastructure decision means either a technology that doesn’t fit the way the team works or something new that must be learned before the team can build. This autonomy is a challenge to coordination, but in practice it isn’t as bad as it might seem. Best practices and technologies do spread from team to team through avenues like guilds. Teams adopt these practices and technologies on their own schedule, or pioneer new ways of working if that makes it easier for them to deliver value to our customers, and then spread their learnings to the other teams.

Trying to layer the tribe and squads model over a traditional reporting hierarchy would be very problematic. While we have many long-lived squads at Spotify, we are constantly creating and disbanding squads as new needs arise or missions are fulfilled. Squad membership will also ebb and flow as required by the needs of a squad’s mission. Traditional hierarchical organizations are self-perpetuating and restructuring them is very disruptive both to the management chains as well as the individual team members. You would gain some of the benefits of the Spotify model by building full-stack teams in a traditional organizational hierarchy, but you would lose a lot of the overall speed benefits that we leverage with our matrix organization.

In conclusion, if you are looking to improve the speed of your development and are inspired by Spotify’s organizational model, there are a few things that you need to understand. Our model works because it is layered on top of our corporate culture. Our culture values autonomy, agile processes, democratic teams, and servant leadership, amongst other things. You can certainly take some of the ideas from the way we work and apply them in your organization, but without the cultural underpinnings you may not get the same returns.

Some other references worth checking out are Henrik Kniberg’s keynote at the Paris Scrum Gathering and my keynote at the { develop: BBC } conference.

BBC Academy Video on Engineering Culture

This video only works with UK IP Addresses. Sorry about that…

This includes some short clips from my talk at the BBC in November and has several engineering leaders from the BBC talking about their culture. A good, quick intro.

This page has some more information about the speakers: BBC Academy – Engineering Culture

Your Software Will Fail You

Dead formats

I was reading about Jason Kincaid’s issues with Evernote, a piece of software that I am also dependent on (but luckily haven’t had any issues with yet). It reminded me of other software that I depend on that has or is currently failing me: iTunes, and 1Password.

The software I use every day today is different from the software I used every day a few years ago. The software I use in a few years will be different than the software I use today. Through decades of computer usage, I’ve realized that I can’t depend on my software, and that relying on it to exist and to work is folly. As we move towards subscription models for software, this will be ever more the case.

Let me talk a bit about how I work with my software now, in the form of a suggestion list. Some of my friends in the industry think it is surprisingly Luddite, but it has reduced my pain over the years significantly. It means less work in some areas (like fixing busted software updates), and more work in other areas (maintaining plain-text backups), but as I moved to having everything digital, it at least lets me feel like my files are somewhat future proof.

  • When you buy digital media, only buy DRM-free. I learned this lesson many years ago. In the early days of digital music there were many competing, protected, file formats. All of them died. If you bought Liquid Audio files, or one of the variants of Microsoft protected audio files, you were screwed. I buy DRM-free eBooks whenever I can (I love that O’Reilly sells their books DRM-free always). When I buy music, I only buy MP3 or Ogg files (this doesn’t seem to be a problem to find any more). TV or Movies are more problematic. I basically only buy those when I absolutely can’t get them any other way, and I assume that at some point in the future, I will have a way to remove the DRM or I will just lose them. If I can’t find a reasonably priced DRM-free alternative, I will sometimes buy the physical copy and rip or scan it instead of paying for DRM-hobbled versions.
  • Only use software that lets you output archival file formats. Evernote lets you output all your notes as HTML or XML. I back up my Evernote to these files on a monthly basis. I back up all my Outlook e-mail as mbox (plaintext) files (you can do this by dragging individual folders to disk). 1Password also lets you output as an XML file (although you want to encrypt this somehow in your backups). For the web services I use, I use IFTTT to archive them either to Evernote, or Dropbox, or both.

    Creative application files are difficult, but I have also learned this the tough way. I recently had to track down a copy of Adobe Illustrator CS5 because it was the last version of the software that worked with Freehand files. I was a big Freehand user and I never got around to putting my Freehand files into some application-neutral format. Now I make sure that I save a copy of everything in an uncompressed, full resolution, non-proprietary format so that I can get to it again if I need it. This takes up a lot of space, but space gets cheaper. Keeping access to something that you spent hours, days or weeks on is worth the cost of a few extra GB.

    I learned in my experience moving from iPhoto to Lightroom how painful it is when you use an application with virtual edits. Periodically, in LR, I output the edited versions of my images at full resolution so I won’t lose those if I can’t use LR anymore. The metadata is another problem, but LR at least saves it in the sidecar or image files, so I can reconstruct it if I have to.

    For audio apps, I output the dry and wet stems from each channel when I finish a track so that I can remix later and maybe use them as a guide if I want to try to reconstruct a track from the original files.

  • Back up everything, multiple ways. What is the point of making sure your digital life is future proof if you can still lose it with a hard disk failure? In my case, I use CrashPlan for cloud backup. I also have a set of drives at work that are backups of my home drives. I bring each one home for one night just to back up its pair and the rest of the time it stays at work. In this way, I have three copies of every file. I was also contemplating another NAS backup within my house, but with over 6TB of data, that is a bit expensive for a fourth, redundant copy. I will probably do it anyway at some point.
  • Save copies of the software that you buy/download. Unfortunately, everything these days is still authorized over the internet. This means that if you need to install an old piece of software to read a file that is no longer readable some other way, you might not be able to get it fully enabled, but you still might be able to use it in trial mode for a short period of time, long enough to recover that old file. Having the DMG for CS5 saved my bacon.
  • Only update when you need/have to. This is the most controversial thing, especially because I have always made my living by selling (or renting) software (and upgrades) to people. This has also gotten a lot harder since the rise of the desktop app store and subscription revenue model. For my personal machine that I rely on and have no help or support on, I am very, very careful about when (or if) I update critical software. Before I apply a minor OS update, I always check the support boards to see if there are any issues. I almost never apply a major OS update to my personal computer. I actually can’t think of the last time I did this. The same goes for my critical (as opposed to fun) software. If everything is working on my machine and I’m able to get everything done, I prefer to leave it in that state rather than messing with my machine, possibly screwing myself up. There are a few exceptions. I will always apply security updates, for example. This doesn’t mean that I am always several years behind on software though. I update my hardware on a pretty regular basis, and usually when I do, I update all the software that I am currently using as well. I will still keep my old hardware around for a while, in case I need an old application for something.
  • Keep your files organized. Having everything means you need to be able to find anything. The good part of keeping things in standard file formats is that you can take advantage of your OS’s search capabilities, but you’ll still want a reasonable directory structure.
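
One way to keep yourself honest about the list above is to periodically audit an archive folder for files that are not in a format you consider archival. The following is only a minimal sketch: the function name find_non_archival and the extension allowlist are hypothetical choices made for illustration, not a definitive list of future-proof formats.

```python
from pathlib import Path

# Assumed allowlist of open, widely readable formats; adjust it to your own needs.
ARCHIVAL_EXTENSIONS = {
    ".txt", ".html", ".xml", ".mbox", ".pdf",
    ".mp3", ".ogg", ".wav", ".png", ".tiff",
}


def find_non_archival(root, allowed=ARCHIVAL_EXTENSIONS):
    """Return sorted paths (relative to root) of files whose extension is not allowed."""
    root = Path(root)
    flagged = []
    for path in root.rglob("*"):
        # Compare extensions case-insensitively so "SONG.MP3" counts as ".mp3".
        if path.is_file() and path.suffix.lower() not in allowed:
            flagged.append(path.relative_to(root).as_posix())
    return sorted(flagged)
```

Calling find_non_archival on your archive directory returns the files you may want to export to a neutral format while the original application still runs.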

For software developers

It’s easy to ignore old operating systems and backwards compatibility. You can look at your analytics and say “no one uses that feature anymore.” I’ve made that calculation myself many times as an engineering leader. Still, it is worth making sure that your users have an exit or even a staying-put strategy, especially if you are building a service or subscription instead of an application. I used to use Gowalla. I put a lot of data into that service. When it went out of business, they put up a page promising a tool to download your data. I thought that was a classy way to go. That tool never appeared, and all that data was lost.

If you want to treat your users right, make them never regret using your software. If you are lucky enough to have your software last for a while, remember all the people who paid you along the way. Treat them with respect, and they will keep paying you into the future.