How Our AI Learns: Training, Data & Confidence
A common question from customers evaluating our AI capabilities: "How long until it actually works?" The honest answer is that it depends—on your data quality, your process stability, and what you're trying to achieve. Here's what to expect.
What the AI Is Actually Doing
Our AI isn't a magic black box that somehow "understands" water treatment. It's a pattern recognition system that learns what normal looks like for your specific plant, so it can identify when something isn't normal—and correlate those abnormalities with causes and outcomes.
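As a rough illustration of "learning what normal looks like", the sketch below summarises past readings as a mean and standard deviation, then flags values that fall far outside that range. This is a generic statistical baseline, not a description of the product's actual model; the function names, the threshold of three standard deviations, and the sample readings are all assumptions made for the example.

```python
from statistics import mean, stdev

def learn_baseline(readings):
    """Summarise 'normal' as the mean and standard deviation of past readings."""
    return mean(readings), stdev(readings)

def is_anomalous(value, baseline, threshold=3.0):
    """Flag a reading more than `threshold` standard deviations from the mean.

    The threshold of 3.0 is an illustrative assumption, not a product setting.
    """
    mu, sigma = baseline
    if sigma == 0:
        # A perfectly flat history: any different value counts as abnormal.
        return value != mu
    return abs(value - mu) / sigma > threshold

# Hypothetical dissolved-oxygen readings (mg/L) recorded during normal operation.
history = [2.1, 2.0, 2.2, 1.9, 2.0, 2.1, 2.0, 2.2]
baseline = learn_baseline(history)

print(is_anomalous(2.1, baseline))  # a typical reading
print(is_anomalous(0.4, baseline))  # a reading far below the learned range
```

A production system would correlate many signals at once rather than test one sensor in isolation, but the principle is the same: the baseline comes from your plant's own data, not from a generic rulebook.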
The system learns from multiple data sources:
The Training Timeline
Here's a realistic timeline for what to expect:
If you have historical data—spreadsheets, SCADA exports, lab results—we can import it during commissioning. This can compress months of learning into days. The more context we start with, the faster the AI becomes useful.
Understanding Confidence Levels
Every recommendation the system makes comes with a confidence percentage. Here's how to read that number:
Low Confidence (30-50%)
Limited training data, unusual scenario, or missing sensor data. Treat as "possible cause—investigate."
Medium Confidence (50-75%)
Pattern matches known scenarios with some uncertainty. Good basis for investigation, but verify before acting.
High Confidence (75%+)
Strong pattern match with historical data, multiple corroborating signals, and consistent with past outcomes.
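If you consume confidence values in your own reports or alerting, the bands above can be expressed as a small lookup. This is a minimal sketch: the function name is hypothetical, and two details are assumptions because the bands as published share their boundary values, namely that exactly 50% counts as medium and exactly 75% counts as high. The label for values under 30% is also an assumption, since no band is defined below that point.

```python
def confidence_band(confidence_pct):
    """Map a confidence percentage to the band names used in this article.

    Assumptions: boundary values go to the higher band, and anything
    under 30% is reported as "unrated" because no band is defined there.
    """
    if not 0 <= confidence_pct <= 100:
        raise ValueError("confidence must be between 0 and 100")
    if confidence_pct >= 75:
        return "high"
    if confidence_pct >= 50:
        return "medium"
    if confidence_pct >= 30:
        return "low"
    return "unrated"

for pct in (35, 50, 62, 75, 90):
    print(pct, confidence_band(pct))
```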
What Affects Confidence
- Sensor coverage – More data points = more correlation opportunities = higher confidence
- Data quality – Faulty sensors or gaps in data reduce confidence
- Training time – Newer deployments have less to compare against
- Scenario frequency – Common issues are diagnosed with higher confidence than rare edge cases
- User feedback – Accepted/rejected recommendations improve future confidence calibration
Under-Instrumented Plants
A question we get often: "Our plant doesn't have many sensors. Will this even work?"
The short answer: yes, but with caveats.
With limited instrumentation, the AI can still:
- Detect anomalies in the data you do have
- Infer flow rates from tank level changes
- Correlate timing patterns (cycle lengths, duty cycles)
- Learn equipment behaviour from VSD feedback
What it can't do as well:
- Diagnose root causes with high confidence
- Provide specific optimisation recommendations
- Predict failures without leading indicators
The system will show lower confidence levels and provide more general recommendations. It will also identify where additional sensors would provide the most value—essentially doing an ROI analysis on instrumentation investment.
One output is a prioritised list of the additional sensors that would most improve diagnostic capability, each with an estimated benefit, for example: "Adding a DO sensor would increase anomaly detection confidence by an estimated 25%." This helps justify instrumentation investment.
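To make the idea of inferring flow from tank level changes concrete, here is a minimal sketch of the underlying arithmetic. It assumes a tank with a constant cross-sectional area (vertical walls) and gives the net flow, that is inflow minus outflow, over the interval. The function name and the example figures are hypothetical.

```python
def inferred_net_flow_m3_per_h(level_start_m, level_end_m, elapsed_s, tank_area_m2):
    """Estimate net flow into a tank from the change in its level.

    Assumes a constant cross-sectional area. A positive result means the
    tank is filling; a negative result means it is draining.
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    volume_change_m3 = (level_end_m - level_start_m) * tank_area_m2
    return volume_change_m3 / elapsed_s * 3600  # convert per-second to per-hour

# Hypothetical example: level rises 0.5 m over 30 minutes in a 12 m^2 tank,
# i.e. 6 m^3 gained in half an hour.
flow = inferred_net_flow_m3_per_h(1.0, 1.5, 1800, 12.0)
print(round(flow, 2))
```

With only a level sensor, inflow and outflow cannot be separated unless one of them is known to be zero (for example, during a fill phase with the outlet closed), which is one reason root-cause confidence is lower at under-instrumented plants.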
Variable vs. Stable Operations
The training timeline varies significantly based on how consistent your operation is:
Stable Operations (Municipal, Consistent Industrial)
The AI learns faster at plants with predictable, consistent loads. For a municipal STP that sees roughly the same flow every day, it can establish reliable baselines in 4-6 weeks. The AI quickly learns "Tuesday looks like Monday which looks like every other day."
Variable Operations (Wineries, Food Processing, Seasonal)
Plants with seasonal variation need more time. A winery SBR needs to see both vintage season and off-season to understand normal ranges for each. You won't get reliable vintage-season recommendations until the system has seen at least one vintage season.
This is why historical data import is so valuable for variable operations—we can give the AI years of seasonal patterns on day one.
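The sketch below illustrates why a seasonal plant needs data from each season: if "normal" is kept separately per season, a season that has never been observed simply has no baseline to compare against. The season labels, the function name, and the daily inflow figures are illustrative assumptions, not product internals.

```python
from collections import defaultdict
from statistics import mean

def seasonal_baselines(records):
    """Build one baseline (here, a simple mean) per season label.

    `records` is an iterable of (season_label, value) pairs.
    """
    by_season = defaultdict(list)
    for season, value in records:
        by_season[season].append(value)
    return {season: mean(values) for season, values in by_season.items()}

# Hypothetical daily inflow figures (m^3/day) for a winery.
records = [
    ("vintage", 180.0),
    ("vintage", 220.0),
    ("off_season", 40.0),
    ("off_season", 60.0),
]
baselines = seasonal_baselines(records)

print(baselines["vintage"])      # baseline learned from vintage data
print(baselines["off_season"])   # a very different "normal"
print(baselines.get("shoulder")) # None: no data seen, so no baseline yet
```

Importing historical records is equivalent to adding rows to `records` before go-live, so every season the history covers has a baseline from the start.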
The Feedback Loop
The AI improves fastest when users engage with its recommendations:
- Accept recommendations – Confirms that the diagnosis was correct
- Reject with reason – Tells the system why it was wrong
- Add context – "Actually, this was because of scheduled maintenance"
- Correct misdiagnosis – "It wasn't the pump—it was the VSD"
This feedback goes directly into the training loop. A site where operators actively engage with the AI will have significantly better performance than one where recommendations are ignored.
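One simple way to picture how this feedback could inform confidence calibration is to track, per diagnosis type, how often operators accept a recommendation. The class below is a hypothetical sketch of that bookkeeping only; the class name, method names, and diagnosis labels are assumptions, and it does not represent how the product's training loop is implemented.

```python
from collections import defaultdict

class FeedbackLog:
    """Record operator responses and report acceptance rates per diagnosis."""

    def __init__(self):
        self._counts = defaultdict(lambda: {"accepted": 0, "rejected": 0})
        self.notes = []  # free-text reasons and context supplied by operators

    def record(self, diagnosis, accepted, note=None):
        """Store one accept/reject decision, with an optional reason."""
        self._counts[diagnosis]["accepted" if accepted else "rejected"] += 1
        if note:
            self.notes.append((diagnosis, note))

    def acceptance_rate(self, diagnosis):
        """Return the fraction accepted, or None if there is no feedback yet."""
        counts = self._counts.get(diagnosis)
        if counts is None:
            return None
        return counts["accepted"] / (counts["accepted"] + counts["rejected"])

log = FeedbackLog()
log.record("pump_fault", accepted=True)
log.record("pump_fault", accepted=True)
log.record("pump_fault", accepted=True)
log.record("pump_fault", accepted=False, note="It wasn't the pump, it was the VSD")

print(log.acceptance_rate("pump_fault"))     # 3 of 4 accepted
print(log.acceptance_rate("blower_fault"))   # no feedback recorded yet
```

A diagnosis type with no recorded feedback returns `None` rather than a rate, which mirrors the point above: where recommendations are ignored, there is nothing to calibrate against.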
Ready to Start Training?
The sooner data starts flowing, the sooner the AI starts learning. Our pilot programs let you evaluate with minimal commitment while the system builds its understanding of your operation.