COCOMO : A Regression Model

The COnstructive COst MOdel (COCOMO) is perhaps the most extensively used estimating technique. It is a regression-based model developed by Dr. Barry W. Boehm when he was at TRW in the early 1970s. He began by analyzing 63 software projects of various types. The projects were observed for actual size (LOC), actual effort expended, and actual schedule duration. Regression analysis was then used to develop exponential equations that best explain the relationship between the scattered data points.

Regression Models in General

A regression model is derived from a statistical interpretation of historical data to describe a mean or "typical" relationship between variables.

Definitions:

Regression analysis is:

●  The use of regression to make quantitative predictions of one variable from the values of another.

●  A statistical technique used to find relationships between variables for the purpose of predicting future values.

●  Procedures for finding the mathematical function that best describes the relationship between a dependent variable and one or more independent variables. In linear regression, the relationship is constrained to be a straight line, and least-squares analysis is used to determine the best fit.

Linear models are: Statistical models in which the value of a parameter for a given value of a factor is assumed to be equal to a + bx, where a and b are constants. The models predict a linear regression.

In multiple regression, the dependent variable is considered to depend on more than a single independent variable.

Figure (a) shows a scattergram on the left and an attempt at fitting a line through the data points on the right. In this example, cavities are expected to increase with grams of sugar consumed.

Figure (a): Scatter Plot and Fitted Line
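
The line-fitting step can be sketched in a few lines of Python. This is only an illustration of ordinary least squares for y = a + bx; the function name `fit_line` and the sugar/cavity numbers are invented for the example and are not taken from the figure.

```python
def fit_line(xs, ys):
    """Return (a, b) that minimize the squared error of y = a + b*x."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    return a, b


# Made-up observations, for illustration only
sugar_grams = [10, 20, 30, 40, 50]
cavities = [1, 1, 2, 3, 3]

a, b = fit_line(sugar_grams, cavities)
print(f"cavities ~ {a:.2f} + {b:.3f} * grams")
```

A positive slope b corresponds to the expectation stated above: more sugar, more cavities.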

COCOMO Modes

Boehm plotted his observed 63 projects, but because of the amount of scatter in the data, he developed many equations in an attempt to fit a line for use in future predictions, as illustrated in Figure (b). The technique that he used, and his advice to those who want to repeat the experiment, is to keep the equation simple and then apply additional explanatory factors later. Complex statistics at this point will not gain much because the data is normally noisy anyway.

Figure (b): Boehm's Original 63 Projects

COCOMO refers to three modes, which categorize the complexity of the system and the development environment (see Table 1).

Organic

The organic mode is typified by systems such as payroll, inventory, and scientific calculation. Other characterizations are that the project team is small, little innovation is required, constraints and deadlines are few, and the development environment is stable.

Semidetached

The semidetached mode is typified by utility systems such as compilers, database systems, and editors. Other characterizations are that the project team is medium-size, some innovation is required, constraints and deadlines are moderate, and the development environment is somewhat fluid.

Embedded

The embedded mode is typified by real-time systems such as those for air traffic control, ATMs, or weapon systems. Other characterizations are that the project team is large, a great deal of innovation is required, constraints and deadlines are tight, and the development environment consists of many complex interfaces, including those with hardware and with customers.

COCOMO Levels

Three levels of detail allow the user to achieve greater accuracy with each consecutive level.

Basic

This level uses only size and mode to determine the effort and schedule. It is useful for fast, rough estimates of small to medium-size projects.

Table 1: COCOMO Mode Characteristics

Intermediate

This level uses size, mode, and 15 additional variables to determine effort. The additional variables are called "cost drivers" and relate to product, personnel, computer and project attributes that will result in more effort or less effort required for the software project. The product of the cost drivers is known as the environmental adjustment factor (EAF).
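
As a rough sketch of how the EAF is formed, the snippet below multiplies a handful of cost-driver ratings together. The driver names and multiplier values are hypothetical placeholders rather than Boehm's published table, and it assumes the usual convention that a nominal rating is 1.0 and that the EAF scales a nominal effort estimate.

```python
import math


def environmental_adjustment_factor(multipliers):
    """Multiply the cost-driver effort multipliers together to get the EAF."""
    return math.prod(multipliers.values())


# Hypothetical ratings (placeholders, not a published table).
# 1.0 = nominal (no effect), > 1.0 = more effort, < 1.0 = less effort.
drivers = {
    "required_reliability": 1.15,
    "analyst_capability": 0.86,
    "tool_use": 1.00,
}

eaf = environmental_adjustment_factor(drivers)

# Assumed usage: scale a nominal effort estimate (staff-months) by the EAF.
nominal_effort = 100.0  # placeholder value
adjusted_effort = nominal_effort * eaf
print(f"EAF = {eaf:.3f}, adjusted effort = {adjusted_effort:.1f} staff-months")
```

Drivers left at nominal contribute a factor of 1.0, so only the non-nominal ratings move the estimate.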

Detailed

This level builds upon intermediate COCOMO by introducing the additional capabilities of phase-sensitive effort multipliers and a three-level product hierarchy. The intermediate level may be tailored to phase and product level to achieve the detailed level. An example of phase-sensitive effort multipliers would be consideration of memory constraints when attempting to estimate the coding or testing phases of a project. At the same time, though, memory size may not affect the effort or cost of the analysis phase. This will become more obvious after the effort multipliers (or cost drivers) are described. Phase-sensitive multipliers are usually reserved for use in mature organizations and need the use of an automated tool.

A three-level product hierarchy consists of system, subsystem, and module, much like the arrangement of a WBS. Large projects may be decomposed into at least three levels so that each of the cost drivers introduced in intermediate COCOMO is assigned to the level most likely to be influenced by variations in that cost driver. For instance, an engineer's language experience may apply only at the module level, an analyst's capability may apply at the subsystem and module levels, and required reliability may apply at the system, subsystem, and module levels. As with the phase-sensitive multipliers, this will make more sense after the cost drivers are described, and mature organizations with automated tools are the heaviest users.

Basic COCOMO

Effort Estimation

KLOC is the only input variable. An exponential formula is used to calculate effort, as shown in Box 1.

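Box 1: COCOMO Basic Effort Formula

$$\text{Effort} = a \times (\text{Size})^{b}$$

where Size is measured in KLOC and a and b are constants.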

As pointed out by Dr. Frailey, effort is measured in staff-months (19 days per month, or 152 working hours per month). The constants a and b can be determined by a curve-fit procedure (regression analysis) that matches project data to the equation. Most organizations do not have sufficient data to perform such an analysis, so they begin by using Boehm's three levels of difficulty, which seem to characterize many software projects. Box 2 shows the basic formulas; Table 2 lists the effort and development time formulas for each mode.
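
One common way to carry out such a curve fit, shown here only as a sketch, is to take logarithms of both sides so that Effort = a x (Size)^b becomes a straight line, log(Effort) = log(a) + b log(Size), and then apply ordinary least squares. This is an assumed procedure, not necessarily the one Boehm used, and the "historical" data below is synthetic, generated from the organic-mode constants so the fit has a known answer.

```python
import math


def fit_power_law(sizes_kloc, efforts_sm):
    """Fit effort = a * size**b by least squares on log-transformed data."""
    xs = [math.log(s) for s in sizes_kloc]
    ys = [math.log(e) for e in efforts_sm]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                         # exponent = slope in log-log space
    a = math.exp(mean_y - b * mean_x)     # coefficient = exp(intercept)
    return a, b


# Synthetic project history (not real data): noise-free points on the organic curve
sizes = [5, 10, 20, 50, 100]
efforts = [2.4 * s ** 1.05 for s in sizes]

a, b = fit_power_law(sizes, efforts)
print(f"a = {a:.2f}, b = {b:.2f}")
```

With real project data the points scatter around the curve, and the fitted a and b describe only the "typical" relationship.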

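Box 2: Basic COCOMO Effort Formulas for Three Modes

Organic: $E = 2.4 \times (\text{KLOC})^{1.05}$

Semidetached: $E = 3.0 \times (\text{KLOC})^{1.12}$

Embedded: $E = 3.6 \times (\text{KLOC})^{1.20}$

Table 2: Basic COCOMO Effort and Development Time Formulas

Organic: $E = 2.4 \times (\text{KLOC})^{1.05}$, $TDEV = 2.5 \times (E)^{0.38}$

Semidetached: $E = 3.0 \times (\text{KLOC})^{1.12}$, $TDEV = 2.5 \times (E)^{0.35}$

Embedded: $E = 3.6 \times (\text{KLOC})^{1.20}$, $TDEV = 2.5 \times (E)^{0.32}$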

The same-size project yields different amounts of effort when it is considered to be of different modes:

Suppose that a project was estimated to be 200 KLOC. Putting that data into the formula

$$\text{Effort} = a \times (\text{Size})^{b}$$

Effort for the organic mode would be estimated at $2.4 \times (200)^{1.05} = 2.4 \times 260.66 \approx 626$ staff-months.

Effort for the semidetached mode would be estimated at $3.0 \times (200)^{1.12} = 3.0 \times 377.71 \approx 1{,}133$ staff-months.

Effort for the embedded mode would be estimated at $3.6 \times (200)^{1.20} = 3.6 \times 577 \approx 2{,}077$ staff-months.
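
The three calculations above can be reproduced with a short Python sketch. The coefficients are the ones used in the text; the names `EFFORT_COEFFICIENTS` and `basic_effort` are made up for this example.

```python
# (a, b) for each mode, as used in the calculations above
EFFORT_COEFFICIENTS = {
    "organic": (2.4, 1.05),
    "semidetached": (3.0, 1.12),
    "embedded": (3.6, 1.20),
}


def basic_effort(kloc, mode):
    """Basic COCOMO effort in staff-months: a * KLOC**b."""
    a, b = EFFORT_COEFFICIENTS[mode]
    return a * kloc ** b


# Same 200 KLOC project evaluated under each mode
for mode in EFFORT_COEFFICIENTS:
    print(f"{mode:>12}: {basic_effort(200, mode):,.0f} staff-months")
```

Because the exponent b is greater than 1 in every mode, effort grows faster than size, and the gap between modes widens as projects get larger.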

After effort is estimated, an exponential formula is also used to calculate a project duration, or completion time (time to develop, TDEV). Project duration is expressed in months.

Basic COCOMO Project Duration Estimation

Boehm devised three formulas (see Box 3) to be used for development time in the same fashion as he did with effort.

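Box 3: Basic COCOMO Project Duration Estimate

Organic: $TDEV = 2.5 \times (E)^{0.38}$

Semidetached: $TDEV = 2.5 \times (E)^{0.35}$

Embedded: $TDEV = 2.5 \times (E)^{0.32}$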

When effort (E) and development time (TDEV) are known, the average staff size (SS) to complete the project may be calculated, as shown in Box 4.

Box 4: Basic COCOMO Average Staff Estimate

$$SS = \frac{E}{TDEV}$$

Basic COCOMO Average Staff and Productivity Estimation

When size and effort (E) are known, the productivity level may also be calculated, as shown in Box 5.

Box 5: Basic COCOMO Productivity Estimate

$$\text{Productivity} = \frac{\text{Size}}{E}$$
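
The three derived quantities can be sketched as small Python functions. The function and dictionary names are illustrative. The organic and semidetached schedule exponents appear in the worked examples of this article; the embedded exponent of 0.32 is the standard published basic COCOMO value and is not otherwise shown here.

```python
# Exponent c in TDEV = 2.5 * E**c (embedded value is the standard published one)
SCHEDULE_EXPONENTS = {"organic": 0.38, "semidetached": 0.35, "embedded": 0.32}


def development_time(effort_sm, mode):
    """Project duration (TDEV) in months from effort in staff-months."""
    return 2.5 * effort_sm ** SCHEDULE_EXPONENTS[mode]


def average_staff(effort_sm, tdev_months):
    """Average staff size (SS): effort divided by duration."""
    return effort_sm / tdev_months


def productivity(loc, effort_sm):
    """Productivity in LOC per staff-month: size divided by effort."""
    return loc / effort_sm


# The 200 KLOC organic-mode project estimated earlier at about 626 staff-months
effort = 626
tdev = development_time(effort, "organic")
print(f"TDEV  = {tdev:.1f} months")
print(f"Staff = {average_staff(effort, tdev):.1f} people on average")
print(f"Prod. = {productivity(200_000, effort):.0f} LOC/staff-month")
```

Note that the schedule exponent shrinks from organic to embedded, so the larger effort of an embedded project is absorbed mostly by more staff rather than by a proportionally longer schedule.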

Basic COCOMO Examples

Two examples of basic COCOMO follow. One is simple, and the other is of medium complexity.

Basic COCOMO Example 1

A development project is sized at 7.5 KLOC and is evaluated as being simple, in the organic mode.

The basic COCOMO equation for effort (E) in staff-months (SM) is:

$$\text{Effort (SM)} = 2.4 \times (\text{KLOC})^{1.05} = 2.4 \times (7.5)^{1.05} = 2.4 \times 8.29496 \approx 20 \text{ staff-months}$$

Development time (TDEV) can also be found by using the basic COCOMO formulas:

$$TDEV = 2.5 \times (\text{SM})^{0.38} = 2.5 \times (20)^{0.38} = 2.5 \times 3.1217 \approx 8 \text{ months}$$

The average number of staff members (S):

$$\text{Staff} = \frac{\text{Effort}}{TDEV} = \frac{20 \text{ staff-months}}{8 \text{ months}} = 2.5 \text{ staff members on average}$$

The productivity rate (P):

$$\text{Productivity} = \frac{\text{Size}}{\text{Effort}} = \frac{7{,}500 \text{ LOC}}{20 \text{ staff-months}} = 375 \text{ LOC/staff-month}$$

Basic COCOMO Example 2

A development project is estimated to be about 55 KLOC when complete and is believed to be of medium complexity. It will be a Web-enabled system with a robust back-end database. It is assumed to be in the semidetached mode.

For a rough estimate of the effort that will be required to complete the project, use the formula:

$$E \text{ (effort in staff-months)} = 3.0 \times (\text{KLOC})^{1.12}$$

$$E = 3.0 \times (55)^{1.12}$$

$$E = 3.0 \times 88.96$$

$$E \approx 267 \text{ staff-months}$$

To determine how long it will take to complete the project, use the formula:

$$TDEV = 2.5 \times (E)^{0.35}$$

$$TDEV = 2.5 \times (267)^{0.35}$$

$$TDEV = 2.5 \times 7.07$$

$$TDEV \approx 17.67 \text{ months}$$

To obtain a rough estimate of how many developers will be needed, use the formula:

$$S \text{ (average staff)} = \frac{\text{effort}}{TDEV}$$

$$S = \frac{267}{17.67}$$

$$S \approx 15.11$$

To determine a rough estimate of the productivity rate, use the formula:

$$P \text{ (productivity)} = \frac{\text{size}}{\text{effort}}$$

$$P = \frac{55{,}000}{267}$$

$$P \approx 206 \text{ LOC/staff-month}$$
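
Both worked examples can be checked end to end with a small script. This is a sketch with invented names (`Estimate`, `MODES`, `basic_cocomo`) that chains the four formulas without rounding between steps, so its figures can differ slightly from the hand calculations above, which round effort and duration before reusing them.

```python
from typing import NamedTuple


class Estimate(NamedTuple):
    effort_sm: float     # effort in staff-months
    tdev_months: float   # development time in months
    avg_staff: float     # average staff size
    loc_per_sm: float    # productivity in LOC per staff-month


# mode -> (a, b, c), where effort = a * KLOC**b and TDEV = 2.5 * effort**c
MODES = {
    "organic": (2.4, 1.05, 0.38),
    "semidetached": (3.0, 1.12, 0.35),
}


def basic_cocomo(kloc, mode):
    """Chain the basic COCOMO formulas, carrying unrounded values forward."""
    a, b, c = MODES[mode]
    effort = a * kloc ** b
    tdev = 2.5 * effort ** c
    return Estimate(effort, tdev, effort / tdev, kloc * 1000 / effort)


for label, kloc, mode in [("Example 1", 7.5, "organic"),
                          ("Example 2", 55, "semidetached")]:
    est = basic_cocomo(kloc, mode)
    print(f"{label}: effort {est.effort_sm:.1f} SM, "
          f"TDEV {est.tdev_months:.1f} months, "
          f"staff {est.avg_staff:.1f}, "
          f"productivity {est.loc_per_sm:.0f} LOC/SM")
```

The small differences from the hand-worked figures come only from rounding and are well inside the uncertainty of a basic-level estimate.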

Basic COCOMO offers a way to calculate a set of quick estimates for effort, development time, staffing, and productivity rates, given knowledge only of size and mode. The estimates can be produced with no more than a calculator. However, you get what you pay for: deriving effort at the basic level takes little work, and the results are worth no more than a very rough estimate. To refine the estimation process, Boehm gives guidance on "tuning" via what he calls a complexity adjustment factor, described in intermediate COCOMO.


Copyright 2018 SPMInfoBlog.