Agile For Data Science
Agile philosophy applied to data science
What is Agile (Really)?
Most teams say they're Agile.
What they usually mean:
- Two-week sprints
- Using Jira or Trello
- Stand-ups
- Backlog grooming
That's ceremony. Teams can adopt these practices and still struggle to ship impactful products quickly.
The Agile Manifesto is four value statements long - no mention of Scrum, sprints, velocity, or any of the other terms teams adopt to claim they're "Agile". It prioritizes individuals over process, working solutions over documentation, collaboration over contracts, and adaptation over rigid plans.
At its core, Agile is a philosophy for building under uncertainty. And few disciplines operate in more uncertainty than data science.
Data Science Is Not Deterministic
Traditional software development is largely deterministic:
- Requirements are defined
- Behavior is predictable
- Scope can be estimated
Data science is probabilistic. You don't know:
- Whether the data is reliable
- What the performance ceiling is
- Whether the business will adopt the output
You are not just building - you are discovering.
If you apply the same practices that work in software development - fixed sprint scope, velocity tracking, rigid commitments - you create fake certainty around inherently uncertain work.
Let's explore an example: "Add weather data to improve forecast accuracy."
Immediately, uncertainty appears:
- How much accuracy improvement is necessary to change decisions?
- Is weather even predictive at the necessary level of granularity?
- Should you focus on extreme events, seasonality, or day-to-day variations?
You can deterministically scope:
- Data ingestion
- Feature engineering
- Model retraining
- Deployment
But in reality, you might discover:
- Weather is not statistically significant
- Operational factors wash out incremental gains
- The real issue is forecast bias
If you committed up front to a fixed sprint plan, you're now trapped defending sunk cost.
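Instead of committing a full sprint to ingestion, feature engineering, and retraining, you can reduce that uncertainty in an afternoon with a cheap holdout experiment. The sketch below is purely illustrative: the data is synthetic, and the weekly-cycle feature and weak temperature effect are assumptions chosen to mimic the scenario above, not a real forecasting pipeline.

```python
import numpy as np

# Synthetic daily demand: a strong weekly cycle, a deliberately weak
# temperature effect, and noise large enough to drown most of it out.
rng = np.random.default_rng(0)
n = 400
day = np.arange(n)
temp = 20 + 8 * np.sin(2 * np.pi * day / 365) + rng.normal(0, 2, n)
demand = (100
          + 10 * np.sin(2 * np.pi * day / 7)   # weekly operational cycle
          + 0.3 * temp                          # weak weather signal
          + rng.normal(0, 5, n))                # noise

def holdout_mae(X, y, split=300):
    """Fit OLS on the first `split` rows, report MAE on the rest."""
    coef, *_ = np.linalg.lstsq(X[:split], y[:split], rcond=None)
    return float(np.mean(np.abs(X[split:] @ coef - y[split:])))

ones = np.ones(n)
weekly = np.sin(2 * np.pi * day / 7)
baseline = np.column_stack([ones, weekly])          # no weather
with_weather = np.column_stack([ones, weekly, temp])  # + weather

mae_base = holdout_mae(baseline, demand)
mae_weather = holdout_mae(with_weather, demand)
print(f"baseline MAE:   {mae_base:.2f}")
print(f"with weather:   {mae_weather:.2f}")
print(f"improvement:    {mae_base - mae_weather:.2f}")
```

If the improvement is marginal relative to the noise, that is evidence to reframe the problem, not a mandate to finish building the pipeline you scoped.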
Agile only works in data science if you apply the philosophy, not just the rituals.
The Four Values — Reinterpreted for Data Science
1. Individuals and Interactions Over Processes and Tools
This principle wasn’t written to attack engineering rigor. It was written to challenge bureaucracy.
Process is useful.
But when process becomes a substitute for thinking, agility dies.
In data science, this shows up as ceremony:
- Rigid sprint scope in exploratory work
- Story point debates around uncertain tasks
- “That’s not in the ticket”
- Formal intake workflows that discourage reframing the problem
Strong data science requires judgment. It requires real conversations that change direction when new information emerges.
Agile doesn’t reject process.
It rejects process as a replacement for thinking.
2. Working Solutions Over Comprehensive Documentation
In data science, a working solution means a changed decision.
Not:
- A polished notebook
- A model comparison chart
- A feature importance slide
If nothing in the real system changes, you didn’t deliver value.
Value looks like:
- Reduced operational cost
- Increased revenue
- Automated manual work
Model metrics matter — but only if they translate into operational impact.
3. Customer Collaboration Over Contract Negotiation
Many data science projects begin with solution requests:
- “Improve forecast accuracy by 5%”
- "Use AI to automate the process"
- “Build an optimization”
These are guesses, not problems.
Agile collaboration means redefining the problem together.
Ask:
- What decision are we improving?
- What happens if we do nothing?
- What constraints can’t move?
- What does success look like in business terms?
Requirements in data science are hypotheses.
Treat them as such.
4. Responding to Change Over Following a Plan
Executives want certainty:
- Fixed timelines
- Defined scope
- Guaranteed savings
Data science offers probabilities.
As you explore:
- Assumptions break
- Data limitations surface
- Signal disappoints
- Operational constraints emerge
Rigid plans encourage defending bad paths too long.
The goal isn’t to follow the plan.
The goal is to reduce uncertainty quickly and intelligently.
Why Agile Fails in Data Science
Many data science teams “do Agile” and still miss impact.
Because they optimize for activity:
- Experiments run
- Models deployed
- Tickets closed
Instead of:
- Decision improvement
- Adoption
- Measurable business value
- Durable systems
Busy is not effective.
The Bottom Line
Agile isn't about sprint cadence.
It's about building when the future isn’t fully known.
That’s data science.
But uncertainty is not an excuse for endless research.
Agility requires both flexibility and constraint:
Flexibility to pivot when evidence changes.
Constraint to force real-world delivery.
If decisions don’t change, the work isn’t done.
In the next post, we’ll move from philosophy to practice — and examine what the 12 Agile principles demand of real data science teams.