# Understanding Quality Scores
Quality scores are Massu AI's way of quantifying the quality of AI-assisted development sessions. Each session gets a score from 0 to 100 based on weighted events that occur during the session.
## How Scoring Works

### Base Score
Every session starts with a base score of 50. This represents a neutral session where nothing particularly good or bad happened.
### Event Weights
Events during the session adjust the score up or down:
| Event | Default Weight | Description |
|---|---|---|
| `clean_commit` | +5 | A commit with no issues |
| `successful_verification` | +3 | A verification check passed |
| `vr_pass` | +2 | Any VR check passed |
| `cr_violation` | -3 | A canonical rule was violated |
| `bug_found` | -5 | A bug was discovered |
| `vr_failure` | -10 | A verification check failed |
| `incident` | -20 | A production-impacting incident |
### Score Range
The final score is clamped to 0-100:
- 80-100: Excellent session -- clean commits, all verifications passed
- 60-79: Good session -- minor issues but overall positive
- 40-59: Average session -- some failures or violations
- 20-39: Poor session -- significant issues
- 0-19: Critical session -- multiple failures or incidents
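Putting the pieces together, the scoring logic can be sketched as follows. This is a minimal illustration rather than Massu AI's actual implementation: the weight names come from the table above, while the function and variable names are hypothetical.

```python
# Default event weights, as listed in the table above.
DEFAULT_WEIGHTS = {
    "clean_commit": 5,
    "successful_verification": 3,
    "vr_pass": 2,
    "cr_violation": -3,
    "bug_found": -5,
    "vr_failure": -10,
    "incident": -20,
}

BASE_SCORE = 50  # every session starts here


def score_session(events, weights=None):
    """Apply each event's weight to the base score, then clamp to 0-100."""
    weights = weights or DEFAULT_WEIGHTS
    raw = BASE_SCORE + sum(weights.get(event, 0) for event in events)
    return max(0, min(100, raw))


# A session with two clean commits and one verification failure:
# 50 + 5 + 5 - 10 = 50, i.e. a neutral score despite the failure.
```

Note how the clamp means a disastrous session bottoms out at 0 and a flawless one tops out at 100, no matter how many events occurred.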
## Category Breakdown
Quality scores are broken down into categories:
| Category | What It Measures |
|---|---|
| `security` | Security findings and fixes |
| `architecture` | Architectural decisions and coupling |
| `coupling` | Code coupling and dependency issues |
| `tests` | Test coverage and verification results |
| `rule_compliance` | Adherence to configured rules |
An observation contributes to a category when its description mentions that category's keyword.
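The keyword match described above can be sketched like this. The function name and the plain substring check are assumptions for illustration; the real matching logic may differ.

```python
# The built-in categories from the table above.
CATEGORIES = ["security", "architecture", "coupling", "tests", "rule_compliance"]


def categories_for(description, categories=CATEGORIES):
    """Return every category whose keyword appears in the observation text."""
    text = description.lower()
    return [c for c in categories if c in text]
```

An observation like "Reduced coupling between the auth and billing modules" would be attributed to the `coupling` category, while one mentioning no keyword contributes to none.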
## Customizing Weights
Override default weights in your config:
```yaml
analytics:
  quality:
    weights:
      bug_found: -10    # Penalize bugs more heavily
      clean_commit: 10  # Reward clean commits more
      vr_failure: -15   # Make verification failures more costly
    categories:
      - security
      - architecture
      - coupling
      - tests
      - rule_compliance
      - performance     # Add custom categories
```

## Tracking Quality Over Time
Use `massu_quality_trend` to see how your scores change across sessions. An improving trend means your AI-assisted development is getting better over time.
Common patterns:
- Improving trend: Team is learning, rules are effective, fewer mistakes
- Flat trend: Steady state -- quality is consistent but not improving
- Declining trend: New complexity, rule gaps, or changing team members
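The three patterns above can be distinguished with a simple heuristic, assuming you have a list of recent session scores in chronological order. The helper name and the 2-point threshold below are illustrative, not part of Massu AI.

```python
def classify_trend(scores, threshold=2.0):
    """Compare the average of the newer half of scores against the older half."""
    if len(scores) < 2:
        return "flat"  # not enough data to call a direction
    mid = len(scores) // 2
    older = sum(scores[:mid]) / mid
    newer = sum(scores[mid:]) / (len(scores) - mid)
    delta = newer - older
    if delta > threshold:
        return "improving"
    if delta < -threshold:
        return "declining"
    return "flat"
```

A session history of `[50, 52, 60, 65]` would classify as improving, since the newer half averages well above the older half.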
## Tips
- Quality scores are most useful as trends, not absolute numbers
- A score of 50 is not bad -- it means a neutral session with no notable events
- Focus on reducing `vr_failure` and `cr_violation` events for the biggest score improvements
- Add custom categories to track quality dimensions specific to your project
- Share quality trends with your team to build awareness