Fail Safe, Fail Smart, Succeed! Part One: Why Focus on Failure?

This article is about failure and everything I’ve learned from 28 years of failing (and succeeding) in the technology industry. Its basis is my talk of the same name that I first gave in 2015.

I’ve broken it into five parts to make it easier to read and share.

The importance of failure in software development

How we approach failure is critical in any industry, but it is especially crucial in building software.

Why?

The answer is simple: invention requires failure.

We don’t acknowledge that fact enough as an industry. Not broadly. It is something we should recognize and understand more. As technologists, we are continually looking for ways to transform existing businesses or build new products. We are an industry that grows on innovation and invention.

Real innovation is creating something uniquely new. If you can create something genuinely novel without failing a few times along the way, it probably isn’t very innovative. Albert Einstein expressed this as “Anyone who has never made a mistake has never tried anything new.”

In his own words, Thomas Edison says that he created three thousand different theories before he found the right materials for his electric light. To invent his battery, the laboratory performed over ten thousand experiments.

Filmmaker Kevin Smith says, “failure is success training.” I like that sentiment. It frames failure as leading to success.

Failure teaches you the things you need to know to succeed. Stated more strongly: failure is a requirement for success.

Creating a fail-safe environment

To achieve success, what’s important isn’t how to avoid failure; it’s how to handle failure when it comes. The handling of failure makes the difference between eventual success and never succeeding. Creating conditions conducive to learning from failure means creating a fail-safe environment.

In the software industry, we tend to define a fail-safe environment as one with processes set up to avoid failure. Instead, we should ensure that when the inevitable failure happens, we handle it well and reduce its impact. We want to fail smart.

When I was at Spotify, a company that worked hard to create a fail-smart environment, we described this as “minimizing the blast radius.” This quote from Mikael Krantz, the head architect at Spotify during that time, sums up the idea nicely: “we want to be an internal combustion engine, not a fuel-air bomb. Many small, controlled explosions, propelling us in a generally ok direction, not a huge blast leveling half the city.”

So, let us plan for failure. Let’s embrace the mistakes that are going to come in the smartest way possible. We can use those failures to move us forward and make sure that they are small enough not to take out the company. I like the combustion engine analogy because it embraces the idea that failure, well-handled, pushes us in the right direction. If we anticipate failure, we can course-correct and continue to move forward.

One way you can create these small, controlled explosions is to fail fast. Find the fastest, most straightforward path to learning. Can you validate your idea quickly? Can you reduce the concept down so that you can get it in front of real people immediately and get feedback before investing in a bunch of work? Failing fast is one of the critical elements of the Lean Startup methodology.

A side benefit of small failures is that they are easier to understand. You can identify what happened and learn from it. With a big failure, you must unpack and dig in to know where things went wrong.

The Lesson of Clippy

Even if you’ve never used the Office Assistant feature of Microsoft Office, you are likely aware of it. It was a software product flop so massive that it became a part of pop culture.

I worked at Microsoft when the company created Office Assistant. Although I didn’t work on that team, I knew a few people who did.

It is easy to think that the Office Assistant was a horrible idea created by a group of poor-performing developers and product people, but that couldn’t be farther from the truth. Extremely talented developers, product leads, researchers with fantastic track records, and PhDs from top-tier universities built Clippy. These were people who thought they understood the market and their users. These world-class people were working on one of (if not THE) most successful software products of all time at the apex of its popularity. Microsoft spent millions of dollars and multiple person-years on the development of Clippy.

So, what happened?

What happened is that those brilliant people were wrong. Very wrong, as all of us are from time to time. How could they have found their mistake before releasing widely? It wasn’t easy at the time to test product assumptions. It was much harder to validate hypotheses about users and their needs.

How we used to release software

Way back before we could assume high-bandwidth internet connections, we wrote and shipped software in a very different way.

Software products were manufactured, transcribed onto plastic and foil discs. For a release like Microsoft Office, those discs were manufactured in countries worldwide, put into boxes, then put onto trucks and trains and shipped to warehouses, like TV sets. From there, trucks would take them to stores where people would purchase them in person, take them home and spend an afternoon swapping the discs in and out of their computers, installing the software.

With a release like Office, Microsoft would need massive disc pressing capability. It required dozens of CD/DVD plants across the world to work simultaneously. That capability had to be booked years in advance. Microsoft would pay massive sums of money to essentially take over the entire CD/DVD pressing industry. This monopolization of disc manufacturing was booked for a fixed window of time. Moving or growing that window was monstrously expensive.

It was challenging to validate a new feature in that atmosphere, particularly if that feature was a significant part of a release that you didn’t want to leak to the press.

That was then; this is now.

Today, the world is very different. There is no excuse for not validating your ideas.

You can now deploy your website every time you hit save in your editor. You can ship your mobile app multiple times per week. You can try ideas almost as fast as you can think of them. You can try and fail and learn from the failure and make your product better continuously.

Thomas J. Watson, the CEO of IBM from 1914 until 1956, said, “If you want to increase your success rate, double your failure rate.” If it takes you years and millions of dollars to fail and you want to double that, your company will not survive to see the eventual success. Failing fast minimizes the impact of your failure by reducing the cost and delay in learning.

I worked at an IBM research lab a long time ago. I was a developer on a project building early versions of synchronized streaming media. After over a year of effort, we arranged to publish our work. As we prepared, we learned there were two other labs at IBM working on the same problems. We were done; it was too late to collaborate. At the time, it seemed to me like big-company stupidity that nobody had realized three different teams were working on the same thing. Later I realized that this was a deliberate choice. It was how IBM failed fast. Since it took too long to fail serially, IBM had become good at failing in parallel.

Part Two: Building a Fail-Safe Culture

Manny Vellon


If you are lucky in your career, you will have a few good bosses. They are people who inspire you and teach you how to be a better developer, manager, or person.

Manny Vellon was my first boss at Microsoft. Since leaving college, I had had a string of good jobs, but not the best managers. I was a bit raw and somewhat guarded because of my experiences.

I was the third person to join the team. There was our Director, Manny as Development Manager and me, so for the first few years, I got to work very closely with him. He had already been at Microsoft for several years in the Developer Tools team, so he had survived and thrived in a callous and competitive culture.

At first, I just respected his programming skill and knowledge. We were building the initial code together. I was amazed at the effortless way he would jump down into the assembly when he needed to understand why some bug was happening.

Once we started making more progress and started meeting with other teams, I was blown away by how he handled the often-tense situations.

Microsoft in the mid-90s was still in its heyday of competitive culture. Disagreements were handled by being louder, making threats, or sneaky political moves to undercut other teams.

In these settings, Manny was the vision of calm confidence, transparency, and good humor. If this didn’t defuse the situation, he would calmly take apart whatever PM or DM was threatening our team or pounding their fist on the table. They would be left trying to maintain their dignity and backtrack as quickly as they could. He wasn’t cruel or mean. He was firm; he was interested in what was right and would accept no less.

As soon as one of these meetings ended, Manny would be right back to his jovial, wise self.

He was transparent, but not in an obvious way. It was just who he was. He didn’t feel the need to guard information. He knew that I could do my job better if I had the complete picture.

He pushed me to be better, to be more ambitious in my goals. He modeled those expectations himself. If we had a deadline, he was always there with the rest of the team, doing whatever he could to push us to hit our commitment. If I got something done but could have done it better, he would challenge me to take it to the next level. Always with humor. He made me feel like it was important to him that I grow. He took that responsibility as my manager seriously.

A lot of who I am as a leader today comes from the lessons he taught me and what I learned from watching him work. Anyone who has worked with me since then has heard me tell a Manny story or three.

Manny Vellon died on May 27th while hiking.

I had lunch with him a couple of years ago, and I told him how much he meant to me. I am very grateful that I did that. I wish I had kept in better touch with him over the years. I know that there was a lot more I could have learned from him as he moved from Microsoft into starting his own companies and being a CTO.

My deepest condolences go out to his family and friends. There has been a massive outpouring of stories and emotions from the people he touched over the years. My own experience is hardly unique.

In The 7 Habits of Highly Effective People, Stephen Covey asks you to imagine your funeral. What will people say about you? What would you hope that they would say? I hope that Manny would see how he positively touched the lives of so many and be content.

My intent in writing this is not just to tell you about a beautiful and inspiring person but also to charge you with having that kind of influence on others.

If you are a manager or leader, the behavior you model and the lessons you impart can change the direction of the people around you, positively and negatively. What are you modeling? What are you teaching?

If there was someone like this in your life, a teacher, manager, mentor, or friend, tell them. You will be glad you did, and it will mean a lot to them.

The challenge of top-down change and the Microsoft lay-offs

In my talks on engineering culture, I usually like to spend a bit of time talking about how to improve an existing culture or fix one that is truly broken. To create true culture change, I advocate for a bottom-up approach.

I have a few reasons for this:

  1. My audience is frequently made up of individual contributors or first-level managers. I want to give them tools they can use to effect change in their larger organizations.
  2. Bottom-up change takes longer, but it is more likely to be truly transformative. It has a better chance of long-term success because the whole organization is invested in it.
  3. When cultural change (or any kind of disruptive change) is pushed from senior leadership down, it tends to fail because the middle managers have usually attained their position by being successful in the old culture. This makes them less likely to embrace change and more likely to only go through the motions while actively managing-up to make it seem like they are more active.

Almost every time I advocate this bottom-up approach, I get a question asking if top-down change can also be effective. Sometimes this comes from a senior executive looking to lead change in the organization. Often this question comes because there are two high-profile large companies in the industry trying to change their cultures in very public ways: Yahoo and Microsoft.

When Steve Ballmer was promoting his “One Microsoft” plan, I would make the claim that the chance of that succeeding was nearly zero for reason #3 above. Having worked at Microsoft in the 90s and early 2000s, I know the culture that many of the current Microsoft executives and middle management rose up through. Microsoft spent decades building a highly competitive culture. A restructure and top-down initiatives to encourage collaboration were unlikely to reverse decades of competition.

I pointed to the approach that Marissa Mayer was taking at Yahoo as having a better chance of success. Yahoo was implementing new review policies that seemed harsh to many in the company. This was coupled with “silent layoffs” and a significant effort to eliminate people in the company who were not interested in the new culture that the company was trying to create. While this seemed unreasonably severe, it made a clear point: this was the new culture and the old way of working would no longer be tolerated.

In the memo that Satya Nadella sent to Microsoft outlining the layoffs that he was undertaking today, one section caught my eye:

In addition, we plan to have fewer layers of management, both top down and sideways, to accelerate the flow of information and decision making. This includes flattening organizations and increasing the span of control of people managers. In addition, our business processes and support models will be more lean and efficient with greater trust between teams.

This was a stark contrast to Steve Ballmer’s approach to culture change. Coupled with the largest layoff in the company’s history was a clear message that a central target was the company’s management. This serves to underline the seriousness of the change. Flattening hierarchies and removing managers will also eliminate or weaken those who would be most likely to fight the cultural shift.

I think this has a much better chance for success than the One Microsoft approach, but is still not guaranteed. Changing the way that 100,000 people approach their jobs is an insanely difficult task, after all.

The “house cleaning” approach may be a successful tactic for effecting cultural change in a large organization, but it is also very dangerous. The morale implications are significant. It would be most effective in a “do or die” situation where a drastic action is necessary to save the company itself.

There is an argument that Yahoo was in this position when Marissa Mayer joined the company. The aftermath of the shakeup was a feeling of confidence in the future and hope rather than fear or concern from the folks I know there.

Microsoft is not in a dire situation. While many in the industry and the press look at the company as sliding into irrelevancy, it is still amazingly profitable. This radical restructuring combined with layoffs may be greeted with significantly less enthusiasm from the employees. Satya Nadella may be taking advantage of his honeymoon period here, and that may be the thing that saves this.

I’m going to continue to follow the progress of both of these leaders and companies as they try to evolve. It will be fascinating and instructive.

I hope for the Microsoft and Yahoo employees’ sake that they are successful.

Truth in advertising and the new MS Surface Commercial


I just saw an MS Surface commercial where someone used it comfortably on an airplane tray table. They must be in mega first class because I’ve seen people try to use them on “real” tray tables. It’s hilarious. The keyboard sticks out over the too-small space between your body and the tray table, and the back end comically and continuously falls off the other edge.

The kickstand was literally the stupidest thing on the first version of that product. It was fine if you wanted to watch a movie, but it wasn’t even at a good angle for that most of the time. With the Surface vertical you can’t type on it, although with its weird aspect ratio, you can’t comfortably type on it anyway. Since the device wasn’t really useful without the keyboard, essentially you ended up having a laptop without a hinge. That laptop hinge has survived for decades for a reason. The reason is that it works, and it works well.

Try to use a Surface on your lap. You can’t type on the screen, and you need to be nearly horizontal (or amazingly long-limbed) to even fit it with the keyboard on your lap. Did Microsoft only test this on tables? Just bad, bad, bad design.

With the second version of the Surface, they kept the kickstand, but they are now marketing it as a device for doing work instead of entertainment. Now the design is even stupider. The kickstand on the Surface 2 seems to have two positions, which is a slight improvement, but the device is still worthless without a keyboard, and it still won’t fit on a tray table or your lap.

I can’t believe they are doubling-down on this.

For the record, at one time I had TWO Surface RTs. I had my company purchase one for me when they were first launched. I seriously tried to use it and gave up after a couple of weeks of frustration. The second was given to me by Microsoft when I attended the Microsoft MIX conference. That one I never took out of the box, and I eventually gave it away since I knew I would never use it.

Not cool, Microsoft

I got a message that my developer subscription for the Windows Phone App Store was about to auto-renew. Well, I had created the subscription on a whim (when it was $8 during last year’s Windows Phone 8 promotion), but I hadn’t ever done anything with it. So, I decided to cancel. Thanks to MS for reminding me.

I go to log in to my account. I see the subscription link, and I click it.
I see my subscription and its renewal date.
[Screenshot]

I click the link to Manage My Subscription.
[Screenshot]

What the hell? No option to cancel? Ok, I click on the “How can I cancel or renew my service?” link, and I get:

[Screenshot]

Are you serious? This is really lame.
I click the drop-down and these are my options:

[Screenshot]

Well, that doesn’t seem right. However, there is a Windows Phone choice… I click the Windows Phone option, and I get:

[Screenshot]

Which is end user support and is completely not what I need. After finding a link for Developer Support and going to their forums, I immediately find someone else who asked the same question. The answer? File a support ticket. Really. A support ticket. To cancel my subscription? That is beyond lame. That is a roach motel kind of tactic used by shady startups trying to lock in subscriptions, not by one of the most profitable companies in existence. Repulsed, I decide instead to just remove my credit card from my account. They can’t auto-renew if they don’t have my card, right? Wrong.

[Screenshot]

That is right. Microsoft won’t let me remove the credit card from my account because it is tied to the subscription that I don’t want to auto-renew. Instead it tries to force me to give it another payment option:

[Screenshot]

This is insane! I’m sure that from the perspective of the PMs and the developers on the project, this may have made some sense. There might have even been big debates about it. In the end though, someone made the decision that they would force you to go to Technical Support to cancel your subscription. I doubt that they made it difficult to actually reach Technical Support on purpose. I trust in the axiom “Never attribute to malice that which is adequately explained by stupidity.” I’d like to think of the fact that the link to Technical Support effectively sends you to the wrong place, and the fact that Microsoft will not allow you to have no payment method on file if you have an active, renewing subscription, as two independent bad decisions. If you combine those bad decisions with the deliberately EVIL (yes, I went there) decision to force you to go to technical support in order to cancel your subscription, you get a colossal FU to a developer who might otherwise return at a later date. I’ve heard on the internet that other subscriptions are even harder to unsub:

This should be a lesson to anyone building services, especially subscription services. Don’t be stupid and don’t be evil. Make it as easy to get out of your service as it is to get into it. Make it trivial to export your data. Make it easy to cancel your subscription. Otherwise, you turn folks who may have been indifferent into folks who actively dislike you (what will that do to your Net Promoter Score?). You turn customers who might have otherwise returned at a later date into people who actively tell their friends (or blog!) to avoid you.

The exit funnel should be nearly as critical as the entrance one. Also, you should actually test that workflow. Again, I’m going to give the benefit of the doubt to Microsoft here. Maybe they didn’t really try to test this workflow like a “normal” person, but given the number of people that work in this area, that really isn’t an excuse.

For those that came across this post by Googling (or Binging) a solution to the un-subbing problem, here is a link to the actual developer support forms so that you can unsub yourself: https://getsupport.microsoft.com/default.aspx?supporttopic_L1=32136142&locale=EN-US&supportregion=EN-US&ccfcode=US&mkt=EN-US&pesid=14879&oaspworkflow=start_1.0.0.0&tenant=store&ccsid=635166677403542430

Skype nightmare

Someone is faking my Skype number for robocalls, so I get a dozen people *69ing me every hour. Some leave angry voice mails.

I never used it anyway, so I just cancelled the Skype number subscription, thinking that it would actually CANCEL MY SUBSCRIPTION. Except Microsoft won’t cancel it until the subscription runs out. IN NOVEMBER. MS customer support never replied to my messages.

Will probably need to create a new Skype account, which is lame.

Running a phone service is hard; running an IP telephony service is harder. I expect the same level of support that I would get from a telephone service provider, but I also expect that I should have complete control and access, just like any web service. Unfortunately, Skype is providing neither in this case.

On Microsoft’s new structure

http://www.bonkersworld.net/organizational-charts/

Microsoft finally unveiled the new much-rumored organizational plan. Glad to see Microsoft moving audaciously. This is long overdue.

However, knowing that organization, I don’t know if there is much chance that it will be successful. The whole organization has been set up to compete with each other for decades. This kind of cultural change is probably beyond what is possible at this point. The battle lines are too well established, the rivalries too set in stone.

The culture of Microsoft has always been one of intense competition. Successful individuals and managers rise more on their ability to outshine their peers than on their ability to cooperate. A new high-level alignment or a single memo will not change that. If Microsoft really wants to be nimble and more collaborative, they need to clean house.

Furthermore, organizing engineering as massive silos that are parallel to the other massive silos representing other business functions is exactly the wrong way to do this. Every new effort will require coordination between massive groups with conflicting priorities, politics and agendas. Everything will be harder. The company itself is so massive that having responsibility for the success meet at the tops of these tall functional mountains will not be sufficient to make these efforts work. The people with responsibility will be too far away from the details to be effective. Layers upon layers of management (each with their own goals, agendas and success metrics) will need to be navigated to get any level of cooperation.

It’s going to be a tough few years for the employees at the company. For the front-line engineers, their day-to-day work will probably not change much, but at the higher levels, there is going to be tremendous pain as the new structure and corresponding power battles work themselves out. In the end, I expect very little will change on the inside, or the outside.

I’d be delighted to see Microsoft prove me wrong.

Good Review of Windows 8 Usability

This sums up a lot of how I felt when I first started using Win8 on a non-touchscreen laptop. It was even more painful when using it through a virtual machine. I’ve found it a bit better when using it on a Surface, but there are still so many weird UI choices and odd edges that are continually ruining the experience. I can actually use it to do things; I just don’t like to now. Microsoft really needs to fix this.

What I’m trying to do in Outlook

So, in my work e-mail, I get around 200 messages a day. I periodically get myself back down to inbox zero, but if I take a day or two off, I immediately get behind. I recently decided on a new mechanism for sorting my incoming mail. First, I would divert any mail not sent directly to me (where my name isn’t on the To or CC line) into a separate folder. This would be the stuff I would get to when I had time. Next, I would divert mail where I’m CC’d into a second folder (this is the mail I’d read after reading my inbox). All mail with me on the To line would be left to land in my inbox. This way, I think I could make sure that I’m not losing the important messages in the noise of the stuff that I don’t need to read (but will when I have time).

Unfortunately, Outlook’s rules don’t let me do this. I can create a rule for messages where my name isn’t on the “To” line, and I can create a rule for messages where my name is on the CC line, but then messages where I’m in the CC line get put into two different folders because they aren’t mutually exclusive. Since the rules in Outlook are more or less fixed, there doesn’t seem to be a way to do what I want here.
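To be clear about what I’m after, here is the sorting written as one ordered check where the first match wins. This is only a rough sketch in Python, not anything Outlook exposes: the Message class, the choose_folder function, and the folder names are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Message:
    # Hypothetical, minimal stand-in for a mail message: only the
    # recipient lines matter for this sorting.
    to: List[str] = field(default_factory=list)
    cc: List[str] = field(default_factory=list)


def choose_folder(message: Message, me: str) -> str:
    """Return exactly one folder name; the first matching check wins."""
    me = me.lower()
    # 1. Anything addressed directly to me stays in the inbox.
    if me in (address.lower() for address in message.to):
        return "Inbox"
    # 2. Otherwise, mail where I am only CC'd goes to its own folder.
    if me in (address.lower() for address in message.cc):
        return "CC"
    # 3. Everything else (lists, aliases, BCC) is read when there is time.
    return "Not Direct"


if __name__ == "__main__":
    me = "me@example.com"
    print(choose_folder(Message(to=[me]), me))
    print(choose_folder(Message(to=["boss@example.com"], cc=[me]), me))
    print(choose_folder(Message(to=["team@example.com"]), me))
```

Because the function returns as soon as one check matches, a message where I’m on the CC line can only ever land in one folder, which is exactly the mutual exclusivity I can’t get out of two independent rules.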

Any suggestions (other than get a real mail program)?

Section 3.3.1 is not new behaviour from Apple

[disclaimer: I am an Adobe employee and an Adobe and Apple shareholder, my opinions are my own and not those of my employer.]

Like the rest of the software industry, I’ve been pondering what effect section 3.3.1 of the iPhone 4.0 SDK will have. I had fully been planning to make an iPhone application at some point. I had planned to do the initial version with Flex to prototype, but then also spend time doing a Cocoa version to better learn that SDK for myself. This iPhone 4.0 SDK announcement honestly has me questioning whether I really do want to develop for the iPhone. Not just because of a higher-minded sense of indignation at Apple’s lack of openness of their platform, but rather because of that combined with their somewhat arbitrary and opaque App Store approval process. Could I spend months of my spare time learning Objective-C and working on an iPhone application only to have that time be a complete waste if the App Store reviewers decide that they don’t want that app in the store?

Thinking about it this morning, I realized that not only was Apple’s move to lock in developers nothing new, but that I’d already written about it before (in fact, I’ve been blogging about it since almost the day I started doing professional development for the Macintosh): “iPhone SDK: The carrot for Cocoa, the stick for Flash”, “The difference between being an Apple developer and a Microsoft developer”, and “Developers Developers Developers Developers”.

Gruber had the motivation right, I think, but I also think he got the ramifications wrong. Since Steve returned to Apple, they have been applying the screws tighter and tighter to their developers, trying to get them to lock in. It was somewhat indirect at first, but the long-term implication was clear: “We’ll tell you how to develop for our platform. If you do as we say, then you’ll be fine. If you don’t do it the way we tell you, your life will be a never-ending stream of headaches.” The move to Intel (forcing all developers onto Xcode and a big rewrite of any PPC assembly) was step one; the move to 64-bit (dropping support for Carbon after promising it) was step two. The iPhone 4.0 SDK is just the most obvious move in this process because it basically spells it out. You no longer have a choice: it is Apple’s way or the highway. The problem is the App Store. On the Mac, I control my own distribution. On the iPhone platform, Apple does. That means that they no longer have to negotiate with their developers; they can now finally dictate to them.

As a developer, this makes the iPhone platform a lot less attractive because I also can’t be sure that they won’t change the terms again. Once I’m locked in, I’m locked in. Apple can do whatever they want and I’m forced to rewrite my apps or get forced out. As someone who writes software for a living, this scares the crap outta me.

Here are some other blog posts that I thought were good reading around this:
The iPad isn’t a computer, it’s a distribution channel (O’Reilly Radar)
Five rational arguments against Apple’s 3.3.1 policy (37signals blog)