Game C • Size-Effort statistical models

Evaluation of size-effort statistical models – linear regression

Learning objective

Teach students how to evaluate a linear regression model built from software project data by interpreting four indicators: sample size, R² (coefficient of determination), outliers, and MMRE (mean magnitude of relative error).

What the player sees

  • An Excel questionnaire file containing the full steps for building each regression model
  • For each data subset: Grubbs test results, ANOVA table, model summary, regression coefficients, and MMRE
  • 10 scenarios available as artifacts in SimSE, one per data subset
  • Free access to the Excel file throughout the entire game

All statistical results are pre-computed, so the player focuses on interpretation rather than calculation.
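For readers who want to see where such figures come from, here is a minimal Python sketch of a simple linear regression of effort on size, with R² and MMRE computed from the fitted line. The data values and function names are hypothetical and are not taken from the Excel file; the game itself requires no calculation.

```python
from statistics import mean

def fit_line(sizes, efforts):
    """Ordinary least squares fit of effort = a + b * size."""
    mx, my = mean(sizes), mean(efforts)
    sxx = sum((x - mx) ** 2 for x in sizes)
    sxy = sum((x - mx) * (y - my) for x, y in zip(sizes, efforts))
    b = sxy / sxx          # slope
    a = my - b * mx        # intercept
    return a, b

def r_squared(sizes, efforts, a, b):
    """Share of the effort variance explained by size (1 = perfect fit)."""
    my = mean(efforts)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(sizes, efforts))
    ss_tot = sum((y - my) ** 2 for y in efforts)
    return 1 - ss_res / ss_tot

def mmre(sizes, efforts, a, b):
    """Mean magnitude of relative error: mean of |actual - estimated| / actual."""
    return mean(abs(y - (a + b * x)) / y for x, y in zip(sizes, efforts))

# Hypothetical sample: functional size and effort (hours) for five projects.
sizes = [120, 250, 310, 480, 600]
efforts = [900, 2100, 2400, 4100, 4700]

a, b = fit_line(sizes, efforts)
print(f"effort = {a:.1f} + {b:.2f} * size")
print(f"R2   = {r_squared(sizes, efforts, a, b):.3f}")
print(f"MMRE = {mmre(sizes, efforts, a, b):.3f}")
```

With only five projects, a sample like this would itself be too small to support statistical conclusions, which is the point Q1 asks the player to judge.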

What the player does

  1. Open the Excel questionnaire and locate the sheet for the current scenario.
  2. Read the statistical results: number of projects (N), outliers, R², and MMRE.
  3. Answer Q1 (sample size), Q2 (R²), Q3 (outliers), and Q4 (MMRE).
  4. Enter the answers in SimSE and move to the next scenario.
  5. Repeat until all 10 scenarios are completed.

Each scenario is answered once and cannot be revisited.

The four questions (answered for each scenario)

For each of the 10 data subsets, the player answers the same four questions based on the Excel results.

Q1. Sample size – "Is it valid to draw statistical conclusions from this sample size?"
  • High – big enough to trust the statistical results.
  • Medium – barely enough; results are moderately supported.
  • None – much too small for any statistical conclusions.

Q2. R² – "How is the relationship between size and effort?"
  • High – strong size-to-effort relationship; other project characteristics have little impact.
  • Medium – moderate relationship; other characteristics have a moderate impact.
  • Low – weak relationship; other characteristics also have a strong impact on effort.

Q3. Outliers – "What is the impact when there is at least one outlier?"
  • None – no outlier detected; proceed with the regression analysis as is.
  • One or more – each outlier must be removed and the regression analysis must be redone.

Q4. MMRE – "How wide is the range of effort variation?"
  • High – very wide range of effort variation.
  • Medium – moderate range of effort variation.
  • Low – very small range of effort variation.
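As background for Q3, the Grubbs statistic reported in the Excel file can be sketched in Python as follows. This is an illustrative sketch only: the source does not say which variable the test is applied to, so the sample values, the function name, and the choice of critical value are assumptions. The critical value depends on N and on the significance level and is normally read from a Grubbs table; about 2.29 is the tabulated two-sided 5% value for N = 10.

```python
from statistics import mean, stdev

def grubbs_statistic(values):
    """Return (G, index) where G = max |x - mean| / s and index points
    at the most extreme value (s is the sample standard deviation)."""
    m = mean(values)
    s = stdev(values)
    idx = max(range(len(values)), key=lambda i: abs(values[i] - m))
    return abs(values[idx] - m) / s, idx

# Hypothetical sample of 10 values with one suspicious observation.
values = [10, 12, 11, 13, 12, 11, 10, 12, 11, 50]

# Assumed critical value for N = 10, two-sided test at the 5% level.
G_CRITICAL = 2.29

g, idx = grubbs_statistic(values)
if g > G_CRITICAL:
    print(f"value {values[idx]} at position {idx} is flagged as an outlier")
else:
    print("no outlier detected")
```

When a value is flagged, the rule in Q3 applies: remove it and rebuild the regression on the remaining projects, then test again, since the classic Grubbs test examines one candidate at a time.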

Data subsets

The Excel file contains results for the following 10 subsets, each filtered from a database of 90 ISBSG projects.

Scenario | Subset name | Filter criteria | Total projects
1 | Business-Web-New-Java | Application: Business • Platform: Web • Type: New development • Language: Java | 37
2 | Real time-Maintenance-Java | Application: Real time • Type: Maintenance • Language: Java | 12
3 | Real time-Maintenance | Application: Real time • Type: Maintenance | 14
4 | Real time-C | Application: Real time • Language: C | 18
5 | Maintenance | Type: Maintenance | 40
6 | Business-New | Application: Business • Type: New development | 44
7 | Real time | Application: Real time | 20
8 | New | Type: New development | 50
9 | C | Language: C | 20
10 | Java | Language: Java | 70
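The way a subset is derived from its filter criteria can be sketched as below. The records, field names, and platform values are hypothetical placeholders and do not reflect the actual ISBSG column names or data.

```python
# Hypothetical project records; field names and values are illustrative only.
projects = [
    {"application": "Business", "platform": "Web", "type": "New development", "language": "Java"},
    {"application": "Real time", "platform": "Embedded", "type": "Maintenance", "language": "C"},
    {"application": "Real time", "platform": "Embedded", "type": "Maintenance", "language": "Java"},
    {"application": "Business", "platform": "Desktop", "type": "Maintenance", "language": "Java"},
]

def select_subset(records, criteria):
    """Keep the records whose fields match every key/value pair in criteria."""
    return [r for r in records if all(r.get(k) == v for k, v in criteria.items())]

real_time_maintenance = select_subset(
    projects, {"application": "Real time", "type": "Maintenance"}
)
print(len(real_time_maintenance))
```

Each added criterion can only shrink a subset, which is the pattern visible in the table: Real time (20 projects), Real time-Maintenance (14), Real time-Maintenance-Java (12). Narrower filters give more homogeneous projects but smaller samples.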

Teaching notes

This game trains students to interpret regression statistics rather than compute them.

Suggested use in class

  • Review how to read a regression output before the game (Grubbs, ANOVA, model summary, coefficients)
  • Walk through the evaluation thresholds for N, R², and MMRE with students before they start
  • Compare results across subsets to discuss why different filters lead to different outcomes
  • Use the four questions as a debriefing framework after the session

What to evaluate

  • Correct interpretation of sample size validity (Q1)
  • Ability to qualify the strength of the size-effort relationship from R² (Q2)
  • Understanding of the impact of outliers on the regression analysis (Q3)
  • Correct reading of MMRE as a measure of effort variation width (Q4)

Downloads & documentation

Access game files, manuals, and supporting resources.

Questionnaire: game-c-questionnaire.html