The portfolio review session is coming up, so you craft a list of questions about project value, cost, and risk that can be answered on a scale from 1 to 10. Your R&D project teams score their projects, add up the scores for each project, and voilà, you have a ranked set of projects, ready for your meeting. Yet you can’t shake that nagging feeling: Aren’t scoring models useless? Will the scoring process distract from the real goal of building a more valuable product portfolio?

I am here to testify that this bad reputation isn’t entirely deserved. In fact, there are many cases when a scoring model is an appropriate way to assess each project’s contribution to the portfolio. Scoring models have the potential to spark productive conversations among the project team, and they help differentiate projects across the portfolio.

So what is the source of their bad reputation? One recent engagement began with a client’s dismay that their scoring model failed to differentiate across projects. Their model consisted of 12 questions about value and 8 questions about risk, each scored on a scale from 1-10. The client hoped that a distribution of aggregate project scores would look something like this:

They were looking for an ideal set of project scores across two categories, nicely differentiating projects across both dimensions. Unfortunately, this happens less frequently than you might think. After their first attempt at project scoring, what they got was this:
There were a few projects that stood out with low scores, but most of the projects clustered around 6-9 in both categories. This lack of differentiation made the R&D team feel that completing the scoring model was a waste of time, and that the model was of little benefit to management in deciding on project funding.

In reviewing their scoring model, we identified several areas for improvement. I hope you can apply what we found to your own scoring model.

Are your questions collectively exhaustive?

Have you adequately characterized what you are assessing? If you are measuring “value,” there are many ways to define a project’s value: platform potential, market size, peak year sales, and customer goodwill might all come to mind. Consider which ones you want to include in your model by thinking about which facets of value have a direct link to division or corporate strategy.

Are your questions mutually exclusive?

In a well-meaning effort to reduce the number of questions, the client had bundled multiple attributes into single questions. For example, this question assesses competitive advantage:

Are there significant pricing-, quality-, or feature-based advantages to our product vs. competitors?

The above question always scored a 7 or higher because price, quality, and features are all important advantages, and most projects could claim at least one. Unfortunately, bundling all three attributes into one question confounds the rating on each of them, and differentiating projects across these attributes becomes impossible. Each question should assess only one concept; if the concept has multiple dimensions, write a separate question for each dimension. On the other hand, be sure that your questions are not redundant: if the results of two questions are highly correlated across your portfolio, do everyone a favor and remove one of them.
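One way to run that redundancy check is to compute the correlation between each pair of questions across the portfolio. The sketch below is only an illustration: the question names, the scores, and the 0.9 cutoff are hypothetical and do not come from the client engagement.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a question every project scores identically has no spread
    return cov / sqrt(var_x * var_y)


def redundant_pairs(scores_by_question, cutoff=0.9):
    """Return (question, question, r) for pairs correlated at or above cutoff."""
    pairs = []
    for q1, q2 in combinations(sorted(scores_by_question), 2):
        r = pearson(scores_by_question[q1], scores_by_question[q2])
        if r >= cutoff:
            pairs.append((q1, q2, round(r, 2)))
    return pairs


# Hypothetical data: one list per question, one entry per project.
scores = {
    "market_size": [8, 6, 9, 4, 7],
    "peak_year_sales": [8, 5, 9, 4, 7],
    "platform_potential": [3, 9, 5, 8, 2],
}
flagged = redundant_pairs(scores)
print(flagged)
```

A flagged pair is a prompt to look at the two questions side by side, not an automatic deletion; the cutoff is a judgment call.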

Do you have clear anchors for each question?

Does each question in your scoring model make clear what a response of 1, 3, 5, or 7 says about your project? Is a 10 always the most desirable, 1 the least desirable? Is the difference between 3 and 4 the same as the difference between 6 and 7? If 5 is the middle, is it the average, or is it the thing most likely to happen?

Like many teams, this group provided only general guidance in the explanation for each question. For example, here is how a typical question in their model was worded:

Resource risks: On a scale from 1-10, where 1 is the most risky and 10 is the least risky, express the risk of schedule slippage for this project.

Without defined anchor points, you and I might feel similarly about our respective projects, and yet I might give mine a score of 4 while you give yours an 8. The same can happen when we both score a third project: we feel similarly about it, but I give it a 4 and you give it an 8. We helped the team craft concrete anchors so that a 4 to you means the same as a 4 to me, and all projects scoring a 4 in that dimension are on par. Here is how we reworded the above question with concrete anchor points:

Capture the risk associated with missing project milestones due to inability to appropriately resource the project.
Factors outside of project resources should not be considered. Meeting project milestones means completing the project in such time that the delay only has a minor impact on the forecasted benefits and costs.

Factors to consider in assessing whether the project will meet milestones:

  • Do the appropriate resources (e.g. people, equipment) exist to complete the project, either internally or externally?
  • If the resources exist, will the project have appropriate access to them in a timely fashion?

Scoring:
1: Very likely; at least a 75% chance of slipping schedule due to lack of appropriate resources
3: Toss-up; less than 50% chance of slipping schedule due to lack of appropriate resources
5: Unlikely; less than 25% chance of slipping schedule due to lack of appropriate resources
8: Very unlikely; less than 10% chance of slipping schedule due to lack of appropriate resources
10: Extremely unlikely; less than 5% chance of slipping schedule due to lack of appropriate resources

With the concrete anchors in place, instead of arguing about what a 4 means in the first place, teams argued about whether a project deserved that 4. This was progress!

Are respondents accountable for their answers?

Will anyone ever own up to a score of 1 or 10? It takes a lot of time and consideration to construct a clearly worded question, annotated with all the information needed to elicit well-informed, comparable responses. We think it is well worth the effort. With this particular client, rewording the questions and adding anchors led to a more useful (broader) differentiation of scores. However, project teams needed encouragement to use the new anchors; teams had not been held accountable for their responses in the past. Following our involvement, if they responded with an 8 to the above question, for example, they would have to justify why there was less than a 10% chance of slipping schedule due to resource scarcity.

Are you playing 20 questions?

The client’s first scoring model actually did have 20 questions, which is fewer than we see at some of the R&D teams we talk to. Still, among the 20 were redundant questions that “stacked the deck,” giving teams opportunities to boost their total score, as well as questions that lacked a clear connection to strategy and did nothing to help management differentiate across projects in a value-driven, strategically relevant manner. We helped the client clean house and reduce the number of questions to 12.

Sometimes a 10 is not a 10

Scoring questions should be designed to rank goodness (or the lack of badness) for each project, and questions should always be phrased in such a way that a higher score is unequivocally better. However, if the criterion described in a question isn’t always a matter of good vs. bad, scoring may not be the best means of assessment.

This client had several questions about innovation potential. Innovation isn’t universally preferred; innovation can be risky and time-consuming, and a portfolio laden with innovative, risky projects may be unable to meet revenue goals to a reasonable level of certainty. Ideally, some projects in the portfolio should be less ambitious and provide a stable base for sales and product line longevity.

So rather than scoring an aspect like innovation, consider using it as a screening criterion. Assessing innovation will help you identify which projects are truly innovative, and which are safe but necessary extensions of existing platforms. Then you can compare spending on innovative projects vs. safer bets, and see how this split lines up with your strategic goals. Other project characteristics that are better screened than scored are technology readiness, market readiness, technology platform, division, and region.
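To make the spending comparison concrete, here is a minimal sketch that tallies budget share by screening category. The project names, category labels, and budget figures are all made up for illustration.

```python
from collections import defaultdict


def spend_share(projects):
    """Return each screening category's share of total portfolio budget."""
    totals = defaultdict(float)
    for project in projects:
        totals[project["category"]] += project["budget"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}  # nothing budgeted, so there is no split to report
    return {cat: amount / grand_total for cat, amount in totals.items()}


# Hypothetical portfolio: category is assigned by screening, not by scoring.
portfolio = [
    {"name": "P1", "category": "innovative", "budget": 300.0},
    {"name": "P2", "category": "extension", "budget": 500.0},
    {"name": "P3", "category": "innovative", "budget": 200.0},
]
shares = spend_share(portfolio)
for category, share in shares.items():
    print(f"{category}: {share:.0%}")
```

The resulting split can then be set against whatever balance of innovative projects and safer bets your strategy calls for.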

You’ve built your scoring model: Now what?

Take your scoring model out for a spin with your project teams. Now that you have tuned up your scoring model, don’t rush to score and rank your projects or run your project prioritization process. Ask every member of a project team to score their project individually. Then bring them together and compare their scores. Completing a well-constructed scoring questionnaire is a valuable opportunity for the project team to align on project direction and details; when two team members realize they’ve scored a project very differently, you’ve created an opening for a very productive conversation. It is so productive that we advocate this as the default approach for scoring your projects: Individual project team members should score the project first, and then meet to build a consensus set of scores for the project.
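If you collect individual scores in a spreadsheet or script, finding the questions worth discussing is a simple comparison. The sketch below assumes every team member answers the same questions; the names, scores, and the 3-point gap are hypothetical.

```python
def disagreements(scores_by_member, min_gap=3):
    """Return {question: (lowest, highest)} where scores differ by min_gap or more."""
    # Assumes every member scored the same set of questions.
    questions = next(iter(scores_by_member.values())).keys()
    flagged = {}
    for question in questions:
        values = [member[question] for member in scores_by_member.values()]
        low, high = min(values), max(values)
        if high - low >= min_gap:
            flagged[question] = (low, high)
    return flagged


# Hypothetical individual scores for one project.
team_scores = {
    "alice": {"resource_risk": 4, "market_size": 7, "technical_risk": 6},
    "bob": {"resource_risk": 8, "market_size": 7, "technical_risk": 5},
    "carol": {"resource_risk": 5, "market_size": 6, "technical_risk": 6},
}
to_discuss = disagreements(team_scores)
print(to_discuss)
```

Here only the resource risk question would be flagged, which gives the consensus meeting a natural starting point.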

Calibrate the scoring model using the stated project preferences of decision makers. With all of your projects scored, you can assess whether all aspects of value are included in your scoring model by showing the results to your decision makers and asking if they agree with the project rankings. Where and why there is disagreement will be more informative than how much you agree. If mid- or low-scoring projects are highly valued by management, it may be that your questions are not accounting for some important aspects of value.
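A quick way to surface those disagreements is to compare each project’s rank under the model with its rank in the decision makers’ stated order. This is a rough sketch under assumed inputs: the project names, totals, and management ranking are invented for the example.

```python
def rank_gaps(model_totals, management_ranking):
    """Return (project, gap) pairs, largest absolute gap first.

    A positive gap means management ranks the project higher than the model does.
    """
    model_order = sorted(model_totals, key=model_totals.get, reverse=True)
    model_rank = {name: position + 1 for position, name in enumerate(model_order)}
    gaps = {
        name: model_rank[name] - (position + 1)
        for position, name in enumerate(management_ranking)
    }
    return sorted(gaps.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical aggregate scores and management's preferred order (best first).
model_totals = {"A": 82, "B": 75, "C": 64, "D": 58}
management_ranking = ["A", "D", "B", "C"]
gaps = rank_gaps(model_totals, management_ranking)
print(gaps)
```

In this made-up case project D scores lowest in the model but sits second for management, so it is the first one to ask about: which aspect of value does management see that no question captures?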

Use the scorecard as a discussion springboard. Another particularly good use of scoring models involves the identification of key risks or weaknesses so that they can be discussed and mitigated before they threaten a project’s success. Across the range of questions being scored, we work with clients to identify a threshold for specific questions below which the project may need to be rethought or reworked. We call these showstopper scores. Different questions will have different showstopper thresholds, and some questions may not have a threshold at all. The showstopper thresholds ensure that a project with strong scores for most questions but a very low score for 1-2 questions won’t sneak through the funding screen without some consideration for whether the weakness should sink the project.
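The showstopper idea can be sketched as a per-question minimum that is checked separately from the total. The question names, threshold values, and scores below are hypothetical; your own thresholds would come out of the discussion described above.

```python
def showstoppers(project_scores, thresholds):
    """Return the questions scored below their showstopper threshold."""
    return sorted(
        question
        for question, minimum in thresholds.items()
        if question in project_scores and project_scores[question] < minimum
    )


# Hypothetical thresholds: market_size deliberately has none.
thresholds = {"resource_risk": 3, "regulatory_risk": 4}
projects = {
    "Project A": {"resource_risk": 8, "regulatory_risk": 2, "market_size": 9},
    "Project B": {"resource_risk": 6, "regulatory_risk": 7, "market_size": 5},
}
for name, scores in projects.items():
    total = sum(scores.values())
    print(name, total, showstoppers(scores, thresholds))
```

In this example Project A has the higher total, yet it is the one flagged for a conversation, because its regulatory risk score falls below the threshold.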

Compare this scoring model with another one, like “most popular,” “current projects,” “management darlings,” or “net present value.” Use these results to build a nice set of portfolio alternatives.

I hope you’ve read this far, and I hope you’ve found some tips you can use, right away, to clear the reputation of your team’s scoring model. Drop us a line and we will send you a 5-minute/5-question survey to get you started on your next portfolio review.