flowcollector-ml
flowcollector-ml is a lightweight, explainable, and scalable producer-neutral ML anomaly detection service for VictoriaLogs + Grafana, using a streaming backbone (Redpanda/Kafka) and layered detection (baseline + ML).
Repo contents:
- anomaly-scorer/: ML scoring service (Kafka/Redpanda → VictoriaLogs)
- compose/: reference Docker Compose stack
- dashboards/: Grafana dashboards (JSON)
- docs/: deployment & ops docs
The scorer consumes a canonical ML input contract over Kafka. Current topic names in this repo are compatibility examples, not a producer-specific requirement. See docs/ml/ML_INPUT_CONTRACT.md.
The currently verified shared ML input topic is normalized.ml.events. flowcollector-go and syslog-ecs-analyzer both publish into that topic, and anomaly-scorer consumes it without producer-specific branching.
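As a rough illustration of what "without producer-specific branching" can mean in practice, the sketch below decodes an event using only contract fields and ignores who produced it. The field names (`entity`, `metric`, `value`, `producer`) and the `MLEvent` type are placeholders invented for this example; the authoritative field list is in docs/ml/ML_INPUT_CONTRACT.md.

```python
import json
from dataclasses import dataclass

# Hypothetical contract fields, for illustration only.
# The real contract lives in docs/ml/ML_INPUT_CONTRACT.md.
REQUIRED_FIELDS = ("entity", "metric", "value")


@dataclass(frozen=True)
class MLEvent:
    entity: str
    metric: str
    value: float


def decode_event(raw: bytes) -> MLEvent:
    """Decode one Kafka message value into an MLEvent.

    Only contract fields are read. Any producer-identifying field in the
    payload is ignored, so every producer takes the same code path.
    """
    doc = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in doc]
    if missing:
        raise ValueError(f"event violates contract, missing fields: {missing}")
    return MLEvent(str(doc["entity"]), str(doc["metric"]), float(doc["value"]))
```

Two payloads that differ only in their producer field decode to equal `MLEvent` values, which is the property the shared topic relies on.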
Quickstart
cd compose
docker compose up -d
docker compose logs -f anomaly-scorer
Verify topics & lag:
docker compose exec kafka rpk topic list
docker compose exec kafka rpk group describe anomaly-scorer-v1
Verify VictoriaLogs data:
curl -sG 'http://localhost:9428/select/logsql/query' \
--data-urlencode 'query={job="integrations/anomaly-scorer",dataset="ml_metrics"} | sort by (_time desc) | limit 5' \
--data-urlencode 'limit=5'
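The same check can be scripted. The sketch below mirrors the curl call with the Python standard library and parses the response as one JSON object per line, which is how VictoriaLogs streams query results. The helper names (`build_query_url`, `parse_json_lines`, `fetch_latest`) are made up for this example, and the endpoint assumes the port mapping of the reference Compose stack.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Same endpoint as the curl example; adjust if your port mapping differs.
BASE_URL = "http://localhost:9428/select/logsql/query"


def build_query_url(query: str, limit: int = 5, base_url: str = BASE_URL) -> str:
    """Return a GET URL with the LogsQL query and limit URL-encoded."""
    return f"{base_url}?{urlencode({'query': query, 'limit': limit})}"


def parse_json_lines(body: str) -> list:
    """Parse a JSON-lines response body, skipping blank lines."""
    return [json.loads(line) for line in body.splitlines() if line.strip()]


def fetch_latest(query: str, limit: int = 5) -> list:
    """Run the query against VictoriaLogs and return the decoded rows."""
    with urlopen(build_query_url(query, limit), timeout=10) as resp:
        return parse_json_lines(resp.read().decode("utf-8"))


# Usage (requires the stack to be running):
# q = '{job="integrations/anomaly-scorer",dataset="ml_metrics"} | sort by (_time desc) | limit 5'
# for row in fetch_latest(q):
#     print(row.get("_time"), row.get("_msg"))
```

An empty list from `fetch_latest` means the query matched nothing in the selected time range, which usually points at the scorer not having written metrics yet.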
Grafana queries
Score time series (p95)
_time:$__range
{job="integrations/anomaly-scorer", dataset="score"}
| filter NOT "ml.state":"warmup"
| stats by (_time:$__interval) quantile(0.95, score) as score_p95
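To make the query's intent concrete, here is a small Python sketch of the same aggregation: drop warmup events, bucket the rest by interval, and take the 95th percentile of `score` per bucket. It is an approximation for intuition only. The `ts` field (epoch seconds) is an assumption of this example, the warmup check is simplified to exact equality, and the nearest-rank percentile used here may differ slightly from how VictoriaLogs computes `quantile`.

```python
import math
from collections import defaultdict


def p95_by_bucket(events, interval_s):
    """Return {bucket_start_seconds: p95 score}, ignoring warmup events."""
    buckets = defaultdict(list)
    for ev in events:
        # Mirrors the `filter NOT "ml.state":"warmup"` step (simplified).
        if ev.get("ml.state") == "warmup":
            continue
        # Mirrors `stats by (_time:$__interval)`: align to the bucket start.
        bucket_start = int(ev["ts"] // interval_s) * interval_s
        buckets[bucket_start].append(float(ev["score"]))

    result = {}
    for start, scores in sorted(buckets.items()):
        scores.sort()
        # Nearest-rank percentile: smallest value covering 95% of samples.
        rank = max(1, math.ceil(0.95 * len(scores)))
        result[start] = scores[rank - 1]
    return result
```

Excluding warmup events keeps the early, unstable scores of a freshly started model from dominating the p95 line on the dashboard.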
Latest anomalies
_time:$__range
{job="integrations/anomaly-scorer", dataset="anomaly"}
| sort by (_time desc)
| limit 200
Bounded Kafka disk usage (example: ~15GB total)
docker compose exec kafka rpk topic alter-config normalized.ml.events \
--set retention.bytes=8589934592 \
--set retention.ms=86400000 \
--set segment.ms=600000
docker compose exec kafka rpk topic alter-config netflow.flows \
--set retention.bytes=6442450944 \
--set retention.ms=21600000 \
--set segment.ms=600000
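The "~15GB total" figure follows from adding the two `retention.bytes` values. The sketch below reproduces that arithmetic and also shows how the budget scales, because `retention.bytes` applies per partition rather than per topic. The partition and replica counts are assumptions here (one each); check the actual values with `rpk topic describe` before relying on the total.

```python
from typing import Dict, Optional

GIB = 1024 ** 3

# retention.bytes values taken from the rpk commands above.
RETENTION_BYTES = {
    "normalized.ml.events": 8_589_934_592,  # 8 GiB
    "netflow.flows": 6_442_450_944,         # 6 GiB
}


def disk_budget_bytes(
    retention: Dict[str, int],
    partitions: Optional[Dict[str, int]] = None,
    replicas: int = 1,
) -> int:
    """Upper bound on retained bytes: per-partition limit x partitions x replicas."""
    partitions = partitions or {}
    return sum(
        limit * partitions.get(topic, 1) * replicas
        for topic, limit in retention.items()
    )


total = disk_budget_bytes(RETENTION_BYTES)
print(total / GIB)            # 14.0
print(round(total / 1e9, 2))  # 15.03
```

With three partitions on normalized.ml.events the same settings would allow up to 30 GiB, so the byte limits should be divided by the partition count if the 15 GB ceiling matters. Actual usage can also briefly exceed the limit, since the active segment is not eligible for deletion until it rolls.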
Release
- Releases are CI-owned and idempotent (see docs/release/RELEASE_WORKFLOW.md).
- The Forgejo/Codeberg release body is published from the matching CHANGELOG.md section for the tagged version.
- Stable release tags use vX.Y.Z; release candidates use vX.Y.Z-rcN only when explicitly justified; date-based tags are historical only and must not be created going forward.
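The tag rules above are easy to check mechanically, for example in a pre-push hook. The following is a minimal sketch, not the project's actual release tooling: `classify_tag` is a hypothetical helper, and the regexes assume numeric components without leading zeros and release-candidate numbers starting at 1.

```python
import re

# Assumption: components are plain integers without leading zeros.
_NUM = r"(?:0|[1-9]\d*)"
STABLE_TAG = re.compile(rf"v{_NUM}\.{_NUM}\.{_NUM}")
RC_TAG = re.compile(rf"v{_NUM}\.{_NUM}\.{_NUM}-rc[1-9]\d*")


def classify_tag(tag: str) -> str:
    """Return 'stable', 'release-candidate', or 'invalid' for a tag name."""
    if STABLE_TAG.fullmatch(tag):
        return "stable"
    if RC_TAG.fullmatch(tag):
        return "release-candidate"
    return "invalid"


print(classify_tag("v1.1.1"))      # stable
print(classify_tag("v1.2.0-rc1"))  # release-candidate
print(classify_tag("2026.04.23"))  # invalid
```

A pattern check cannot reject every date-shaped tag: something like v2026.4.23 is syntactically a valid vX.Y.Z, so the "no date-based tags" rule still needs review or a version-ordering check against CHANGELOG.md.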
Deferred Renovate updates
- PR #4 "Update https://data.forgejo.org/actions/setup-python action to v6", reviewed on 2026-04-23
- Status: intentionally deferred
- Reason: major CI dependency update; the stale branch must not be merged as-is because it would also revert current governed release hardening
- Next action: replay the exact setup-python v5 -> v6 bump manually onto current main and validate it through CI before any merge
Docs
- ARCHITECTURE.md
- SCHEMA.md
- docs/ml/ML_INPUT_CONTRACT.md
- docs/DETECTION.md
- docs/INTERPRETATION.md
- docs/DEPLOYMENT.md
- docs/OPERATIONS.md
- docs/TROUBLESHOOTING.md
License
Apache-2.0 (see LICENSE).
See also:
- docs/releases/v1.1.1.md (release notes for the current release)