Go 98.7%
Python 0.8%
Shell 0.3%
Dockerfile 0.2%

StefanSA f56b45232a	ci: migrate release publishing to Forgejo	2026-06-23 19:57:51 +02:00
.forgejo/workflows	ci: migrate release publishing to Forgejo	2026-06-23 19:57:51 +02:00
.githooks	Initial import for Codeberg	2026-04-08 15:14:52 +02:00
cmd	Prepare v1.2.4 release for neutral ML topic	2026-04-09 15:08:13 +02:00
configs	Prepare v1.2.4 release for neutral ML topic	2026-04-09 15:08:13 +02:00
docs	Prepare v1.2.4 release for neutral ML topic	2026-04-09 15:08:13 +02:00
internal	Prepare v1.2.4 release for neutral ML topic	2026-04-09 15:08:13 +02:00
scripts	chore: remove local governance from publish surface	2026-06-23 14:45:00 +02:00
testdata	Prepare v1.2.4 release for neutral ML topic	2026-04-09 15:08:13 +02:00
.dockerignore	Align publish surface ignore policy	2026-04-11 22:56:06 +02:00
.gitignore	chore: remove local governance from publish surface	2026-06-23 14:45:00 +02:00
.goreleaser.yaml	Add Forgejo CI and release automation	2026-04-08 15:24:01 +02:00
CHANGELOG.md	ci: make release creation idempotent (skip if release exists)	2026-04-23 13:56:29 +02:00
Dockerfile	Update docker base images	2026-04-17 15:59:22 +00:00
go.mod	Update module github.com/oschwald/geoip2-golang to v2	2026-04-17 15:59:49 +00:00
go.sum	Update module github.com/oschwald/geoip2-golang to v2	2026-04-17 15:59:49 +00:00
Makefile	Initial import for Codeberg	2026-04-08 15:14:52 +02:00
README.md	ci: standardize release notes and semver tag policy	2026-04-23 14:18:57 +02:00
renovate.json	chore(renovate): run gomod tidy after go updates	2026-04-17 17:40:06 +02:00

README.md

syslog-ecs-analyzer

Dynamic syslog-to-ECS analyzer with explicit stage boundaries, best-effort parsing/decoding, modular enrichment, and conservative normalization.

Scope

Implemented now:

explicit internal stage models
- RawEvent
- EnvelopeEvent
- ParsedPayloadEvent
- ResolvedEvent
- NormalizedEvent
modular package skeleton under internal/
pipeline interfaces and interface-driven orchestration
static inventory config loading
best-effort RFC3164 parsing
best-effort RFC5424 parsing
payload classification: JSON, key=value, unknown
generic JSON decoding
generic key=value decoding
generic resolver for observed identity, payload identity, and network fields
inventory enrichment for missing canonical identity fields
GeoIP/ASN enrichment for well-identified IPs
DNS reverse enrichment as additive metadata
IP reputation annotation as enrichment metadata
dynamic VictoriaLogs stream-field templates with nested field resolution
worker-based DNS enrichment with positive and negative cache control
cached IP reputation lookups with optional hot reload
conservative ECS-style normalization with provenance retention
table-driven Sophos XGS semantic ECS mapping with strict delete-on-success pruning of promoted vendor.payload fields
production-grade Kafka output mode for downstream anomaly-scorer / ML workflows
per-stage partial/failure status fields
focused unit tests for precedence, provenance, enrichment, normalization, and Kafka keying

Not implemented yet:

vendor-specific decoder packs or rule-engine logic

Architecture boundary

This project is the semantic/ECS intelligence layer, not the production detection engine.

In-project responsibilities:

parsing and decoding
generic semantic interpretation
ECS normalization and semantic lifting
explainability and confidence
family classification and generic alias handling
optional AI/ML/LLM-assisted semantic inference for field meaning
offline corpus analysis, feature export, baseline export, and scoring research
Kafka producer contract preparation for downstream consumers

Downstream responsibilities via Kafka:

anomaly scoring
temporal/contextual baselines in production
stateful behavior modeling
thresholding and detection logic
retraining
alerting

Decision rule:

if the logic answers "what does this field or event mean semantically?" it belongs here
if the logic answers "given a semantically understood stream, is this behavior anomalous over time/context?" it belongs in the external anomaly-scorer

The explicit architecture decision is documented in docs/architecture/adr_semantic_engine_vs_external_anomaly_scorer.md.

Repository layout

internal/input
internal/syslog
internal/classifier
internal/decoder
internal/registry
internal/resolver
internal/enrich
internal/ecs
internal/provenance
internal/output
internal/model
internal/pipeline
testdata/fixtures
docs/architecture

Third-Party References

This project uses publicly available log samples and reference data from the Elastic integrations repository for validation and testing purposes.

See docs/THIRD_PARTY.md for details.

Config

Example skeleton config: configs/syslog-ecs-analyzer.yaml

Full image config: configs/syslog-ecs-analyzer.full.yaml

Example static inventory: configs/inventory.example.yaml

Current config surface:

service.name
inventory.static_path
registry.inventory_path
input.udp.*
input.tcp.*
input.replay.*
enrichment.geoip.*
enrichment.dns.*
enrichment.reputation.*
outputs.include_syslog_analyzer
outputs.file.*
outputs.stdout.*
outputs.victorialogs.*
outputs.kafka.*
pipeline.*
observability.listen_addr

Output Filtering (syslog_analyzer)

syslog_analyzer.* is the analyzer-owned metadata layer. It carries internal semantic status, provenance, explainability, semantic-family hints, and non-ECS canonical extension data under syslog_analyzer.canonical_ext.*.

It exists because the analyzer keeps internal reasoning explicit instead of silently flattening it into ECS or vendor payload. That metadata is useful for:

deterministic normalization flow
provenance and explainability
canonical extension handling
audit and validation work
offline corpus, ML feature export, and suggestion workflows

Some environments still prefer a cleaner emitted JSON shape. For that case, final output emission can suppress the top-level syslog_analyzer object:

outputs:
  include_syslog_analyzer: false

Behavior:

true or omitted: emit syslog_analyzer.* exactly as before
false: omit only the top-level syslog_analyzer object from emitted JSON

Important boundary:

this is output filtering only
parsing is unchanged
normalization is unchanged
ECS mapping is unchanged
canonical extension decisions are unchanged
cleanup and full-equivalence pruning are unchanged
offline ML and suggestion logic are unchanged

Internal computation still happens even when the syslog_analyzer object is suppressed from output. This matters because other final-output features can still resolve from internal analyzer fields before suppression. For example, a VictoriaLogs stream field such as network_scope={syslog_analyzer.network.scope} can still be populated while the emitted event body omits syslog_analyzer.
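
The ordering described above can be sketched in Go. This is a minimal illustration, assuming the final event is held as a nested map[string]any; lookupPath and emit are hypothetical helper names, not the analyzer's actual API. The stream field is resolved from the full event first, and only then is the top-level syslog_analyzer key dropped.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// lookupPath walks a dotted path such as "syslog_analyzer.network.scope"
// through nested maps. Hypothetical helper for illustration only.
func lookupPath(event map[string]any, path string) (string, bool) {
	var cur any = event
	for _, part := range strings.Split(path, ".") {
		m, ok := cur.(map[string]any)
		if !ok {
			return "", false
		}
		cur, ok = m[part]
		if !ok {
			return "", false
		}
	}
	s, ok := cur.(string)
	return s, ok
}

// emit resolves a stream field from the full event and only afterwards
// omits the top-level syslog_analyzer object when include is false.
func emit(event map[string]any, include bool) (scope string, body []byte, err error) {
	scope, _ = lookupPath(event, "syslog_analyzer.network.scope")
	out := make(map[string]any, len(event))
	for k, v := range event {
		if k == "syslog_analyzer" && !include {
			continue // output filtering only; the input event is not modified
		}
		out[k] = v
	}
	body, err = json.Marshal(out)
	return scope, body, err
}

func main() {
	event := map[string]any{
		"source": map[string]any{"ip": "192.168.0.19"},
		"syslog_analyzer": map[string]any{
			"network": map[string]any{"scope": "private"},
		},
	}
	scope, body, err := emit(event, false)
	if err != nil {
		panic(err)
	}
	fmt.Println("stream field network_scope:", scope)
	fmt.Println("emitted body:", string(body))
}
```

The sketch copies top-level keys into a new map instead of deleting from the event, so anything that still reads the internal event after emission sees it unchanged.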

Testing

The host environment does not have Go installed, so tests run in Docker:

make test

For local guardrails, enable the tracked hooks with git config core.hooksPath .githooks.

Runtime

The repository now includes a runnable analyzer binary at cmd/syslog-ecs-analyzer plus a container image Dockerfile. The same image contains all supported runtime capabilities; features are enabled or disabled by configuration. The minimal runtime surface is:

UDP syslog listener
TCP line-oriented syslog listener
one-shot replay input for seeded test evidence
file output
stdout output
VictoriaLogs JSONLine output
optional Kafka output for downstream anomaly-scoring / ML consumers
/healthz and /status observability endpoints

Safe output defaults:

outputs.victorialogs.enabled=true
outputs.kafka.enabled=false
outputs.stdout.enabled=false
outputs.file.enabled=false

No event stream is written to stdout or disk unless that output is explicitly enabled. If stdout or file output is enabled, the process logs a startup warning.

outputs.include_syslog_analyzer controls only final emitted JSON. When set to false, the top-level syslog_analyzer object is omitted from emitted output, but internal normalization, canonical extension handling, cleanup decisions, and offline tooling inputs remain unchanged.

This behavior is validated in the real test environment:

the running service can emit events without top-level syslog_analyzer
ECS fields still appear as before
full-equivalent raw cleanup still behaves the same
internal analyzer fields can still contribute to final stream-field rendering before the output object is filtered
no runtime ML, suggestion execution, or promotion execution is introduced by this filter

The isolated victoriaflow phase1 integration is documented in docs/testing/victoriaflow-test-integration.md. The recommended active victoriaflow runtime is documented in docs/testing/victoriaflow-full-image.md. Operator query examples for compact interface and trailing-context fields are in docs/queries/trailing_context_queries.md.

For /daten/victoriaflow, the explicit runtime transition strategy is:

keep syslog-ecs-analyzer-phase1 and syslog-ecs-analyzer-full as separate services
stop and remove phase1 before starting full
verify the active container name and image tag after cutover

This avoids ambiguous in-place replacement and keeps rollback straightforward.

Release Conventions

Public git release tags follow the v* scheme, for example v1.2.3.

Releases are CI-owned and idempotent (see docs/release/RELEASE_WORKFLOW.md). The Forgejo/Codeberg release body is published from the matching CHANGELOG.md section for the tagged version. Stable release tags use vX.Y.Z; release candidates use vX.Y.Z-rcN only when explicitly justified; date-based tags are historical only and must not be created going forward.

When publishing or deploying immutable release images, use the same version string for the container tag, for example:

git tag: v1.2.3
image tag: syslog-ecs-analyzer:v1.2.3

Phase or ad-hoc tags such as phase-* remain local or environment-specific rollout markers. They are not a substitute for public release tags.
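
As an illustration of the tag policy, the following Go sketch accepts vX.Y.Z and vX.Y.Z-rcN and derives the matching image tag. The regular expression and the imageRef helper are assumptions written from the policy text above; they are not the check used by the project's CI.

```go
package main

import (
	"fmt"
	"regexp"
)

// releaseTag matches vX.Y.Z and vX.Y.Z-rcN without leading zeros.
// Illustrative pattern only, derived from the release policy text.
var releaseTag = regexp.MustCompile(`^v(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(-rc[1-9]\d*)?$`)

// imageRef returns the container reference that reuses the git tag
// as the image tag, or an error for non-release tags.
func imageRef(tag string) (string, error) {
	if !releaseTag.MatchString(tag) {
		return "", fmt.Errorf("not a public release tag: %q", tag)
	}
	return "syslog-ecs-analyzer:" + tag, nil
}

func main() {
	for _, tag := range []string{"v1.2.3", "v1.2.4-rc1", "phase-6", "2026-04-08"} {
		ref, err := imageRef(tag)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", tag, "->", ref)
	}
}
```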

Performance Improvements

The current production release keeps the same normalization behavior while reducing CPU and allocation pressure in the normalization hot path.

The main internal improvement is a shared payload leaf index with mutation invalidation:

repeated semantic leaf traversal is removed from the common lookup path
repeated per-lookup sorting of payload keys is removed
allocation and GC pressure are reduced substantially
throughput improved materially across the replay-audited datasets, with the largest gain on SonicWall-heavy compact-endpoint traffic

The performance work is behavior-preserving:

parsing semantics are unchanged
semantic mapping outcomes are unchanged
cleanup behavior is unchanged
structured trailing-context retention is unchanged
the "no duplication if an ECS equivalent exists" rule remains enforced
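
The idea of a leaf index with mutation invalidation can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's implementation: the Payload type, its methods, and the first-key-wins rule for duplicate leaf names are all hypothetical. Keys are sorted once per index build rather than on every lookup, and any mutation marks the index stale.

```go
package main

import (
	"fmt"
	"sort"
)

// Payload wraps a decoded payload and a lazily built leaf index.
// Hypothetical type for illustration.
type Payload struct {
	data   map[string]any
	index  map[string]any // leaf name -> value; nil means stale
	builds int            // number of index builds, for demonstration
}

func NewPayload(data map[string]any) *Payload { return &Payload{data: data} }

// Set mutates a top-level key and invalidates the index.
func (p *Payload) Set(key string, value any) {
	p.data[key] = value
	p.index = nil
}

// Leaf returns a leaf value by name, rebuilding the index only when stale.
func (p *Payload) Leaf(name string) (any, bool) {
	if p.index == nil {
		p.index = make(map[string]any)
		indexLeaves(p.data, p.index)
		p.builds++
	}
	v, ok := p.index[name]
	return v, ok
}

// indexLeaves walks the payload in sorted key order so that the result is
// deterministic; the first occurrence of a leaf name wins (an assumption).
func indexLeaves(node, index map[string]any) {
	keys := make([]string, 0, len(node))
	for k := range node {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		if child, ok := node[k].(map[string]any); ok {
			indexLeaves(child, index)
			continue
		}
		if _, seen := index[k]; !seen {
			index[k] = node[k]
		}
	}
}

func main() {
	p := NewPayload(map[string]any{
		"fw": map[string]any{"srcip": "10.0.0.1", "dstip": "198.51.100.7"},
	})
	p.Leaf("srcip")
	p.Leaf("dstip")
	fmt.Println("builds after two lookups:", p.builds)
	p.Set("action", "allow")
	p.Leaf("action")
	fmt.Println("builds after mutation:", p.builds)
}
```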

Compact Endpoint Handling

The generic semantic layer now supports compact endpoint values in these shapes when the structure is safe:

IP:PORT
IP:PORT:TOKEN
IP:PORT:TOKEN:CONTEXT

Current behavior:

IP promotes to source.ip or destination.ip
PORT promotes to source.port or destination.port
the first trailing token promotes conservatively to:
- source -> observer.ingress.interface.name
- destination -> observer.egress.interface.name
one additional unmatched trailing context token is preserved explicitly under:
- vendor.payload.src_compact.trailing_context
- vendor.payload.dst_compact.trailing_context

This trailing context retention is non-ECS and intentionally semantically neutral. It preserves the unmatched compact suffix without claiming that it is a zone, segment, or other higher-level semantic meaning.

For operator-facing query examples against these fields, see docs/queries/trailing_context_queries.md.

Example:

src=192.168.0.19:60951:X0:STG-IT-PBU-01

Normalizes to:

source.ip=192.168.0.19
source.port=60951
observer.ingress.interface.name=X0
vendor.payload.src_compact.trailing_context=STG-IT-PBU-01

If additional trailing detail still remains unmatched, the original raw compact field is retained conservatively.
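
A minimal Go sketch of this parsing follows. It assumes IPv4 values only and treats empty tokens as unsafe structure; the CompactEndpoint type and parseCompact function are hypothetical names, and the real analyzer may apply further safety checks.

```go
package main

import (
	"fmt"
	"net/netip"
	"strconv"
	"strings"
)

// CompactEndpoint holds the parts recovered from a compact value.
type CompactEndpoint struct {
	IP              netip.Addr
	Port            uint16
	Interface       string // first trailing token
	TrailingContext string // one additional unmatched token
	Full            bool   // true when the raw value is fully represented
}

// parseCompact splits IP:PORT[:TOKEN[:CONTEXT]]. IPv6 is deliberately not
// handled here because its colons would need bracket-aware splitting.
func parseCompact(raw string) (CompactEndpoint, bool) {
	parts := strings.Split(raw, ":")
	if len(parts) < 2 {
		return CompactEndpoint{}, false
	}
	ip, err := netip.ParseAddr(parts[0])
	if err != nil || !ip.Is4() {
		return CompactEndpoint{}, false
	}
	port, err := strconv.ParseUint(parts[1], 10, 16)
	if err != nil {
		return CompactEndpoint{}, false
	}
	for _, token := range parts[2:] {
		if token == "" {
			return CompactEndpoint{}, false // unsafe structure in this sketch
		}
	}
	ep := CompactEndpoint{IP: ip, Port: uint16(port), Full: len(parts) <= 4}
	if len(parts) > 2 {
		ep.Interface = parts[2]
	}
	if len(parts) > 3 {
		ep.TrailingContext = parts[3]
	}
	return ep, true
}

func main() {
	for _, raw := range []string{
		"192.168.0.19:60951:X0:STG-IT-PBU-01",
		"1.2.3.4:12345:X0:LABEL:EXTRA",
	} {
		ep, ok := parseCompact(raw)
		fmt.Printf("%s -> ok=%v %+v\n", raw, ok, ep)
	}
}
```

When Full is false, the caller would keep the original raw compact field, matching the conservative retention described above.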

No Duplication Rule

The analyzer enforces a strict cleanup rule:

full equivalence -> raw/vendor field removed
partial equivalence -> raw/vendor field retained

This applies to both ECS mappings and structured non-ECS compact-context retention.

Examples:

src=203.125.116.98:47758:X1
- removed once the value is fully represented by endpoint ECS fields plus interface name
src=192.168.0.19:60951:X0:STG-IT-PBU-01
- removed once the value is fully represented by endpoint ECS fields, interface name, and structured trailing context retention
src=1.2.3.4:12345:X0:LABEL:EXTRA
- retained because not all original trailing detail is represented yet

The same rule applies to structured proto values:

proto=udp/dns and proto=tcp/https
- raw removed once fully represented by network.transport plus network.application
proto=6 or proto=udp/389
- raw retained when canonical representation is still only partial
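
The proto examples above can be expressed as a small decision function. The sketch below is an assumption-labeled illustration: ProtoResult, mapProto, and the tiny protocol-number table are hypothetical, and the rule "remove raw only when both a named transport and a named application are present" is inferred from the four documented examples rather than taken from the code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// ProtoResult is the outcome of mapping a raw proto value.
type ProtoResult struct {
	Transport   string // candidate for network.transport
	Application string // candidate for network.application
	RemoveRaw   bool   // true only on full equivalence
}

// ianaTransport covers a few well-known IP protocol numbers.
var ianaTransport = map[string]string{"1": "icmp", "6": "tcp", "17": "udp"}

func isNumeric(s string) bool {
	_, err := strconv.Atoi(s)
	return err == nil
}

// mapProto treats name/name as full equivalence and everything else as
// partial, which keeps the raw field in place.
func mapProto(raw string) ProtoResult {
	transport, app, hasApp := strings.Cut(strings.ToLower(strings.TrimSpace(raw)), "/")
	var res ProtoResult
	namedTransport := transport == "tcp" || transport == "udp"
	switch {
	case namedTransport:
		res.Transport = transport
	case ianaTransport[transport] != "":
		res.Transport = ianaTransport[transport]
	}
	if hasApp && app != "" && !isNumeric(app) {
		res.Application = app
	}
	res.RemoveRaw = namedTransport && res.Application != ""
	return res
}

func main() {
	for _, raw := range []string{"udp/dns", "tcp/https", "6", "udp/389"} {
		fmt.Printf("proto=%s -> %+v\n", raw, mapProto(raw))
	}
}
```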

Sophos XGS Mapping

The normalizer now uses a generic semantic mapper as the primary path. That generic layer handles reusable mappings such as timestamp, severity, action aliases, IPs, ports, MACs, transport, URL and HTTP fields, ICMP fields, and device-context-based observer.* promotion.

Sophos XGS is now a thin extension layer on top of that generic path. It keeps only the product-specific deltas:

fw_rule_*
interface and zone keys
NAT alias keys
family categorization defaults
Invalid Traffic -> error.message

The generic strategy is documented in docs/architecture/generic-semantic-mapping.md. The Sophos-specific delta, mapping matrix, delete-on-success behavior, and live victoriaflow verification queries are in docs/architecture/sophos-xgs-ecs-mapping.md.

The currently validated narrow canonical extension scope includes Sophos XG srczonetype and dstzonetype, promoted only into:

syslog_analyzer.canonical_ext.observer.ingress.zone.type
syslog_analyzer.canonical_ext.observer.egress.zone.type

This remains vendor-scoped, deterministic, and cleanup-safe:

promotion only when raw evidence is present and non-empty
raw removed only on full equivalence
raw retained on partial, empty, or conflicting cases

Reference corpus

The repo now includes a project-owned reference fixture corpus in testdata/fixtures/reference/elastic_filebeat_corpus.yaml. It is informed by publicly available Elastic Filebeat module examples used as validation/reference material only, especially:

Sophos XG firewall samples and expected mappings
system/syslog sample lines and expected mappings

This corpus is used for semantic intent and golden coverage only. It does not turn this project into a Filebeat clone and it does not introduce static module-specific parser logic.

The offline corpus layer now also supports real Elastic integrations package fixtures as review-only evidence for:

semantic family learning
canonical extension candidate discovery
ECS candidate suggestion generation
delta analysis against local proxy-only corpus runs

Elastic expected JSON is treated as supporting evidence only. It is not copied mechanically into runtime behavior, does not imply source-code reuse, and is never sufficient on its own to justify destructive cleanup.

Enrichment precedence

Observed values remain the highest-trust evidence by default.
Static inventory may fill canonical identity fields when sender identity is missing or incomplete.
DNS, GeoIP/ASN, and reputation are additive enrichment layers.
Enrichment never silently replaces observed canonical identity.
Final ECS fields may expose a selected value, but internal provenance keeps whether it was observed or enriched.
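
A minimal sketch of this precedence for a single identity field follows. The Source and Value types and the selectIdentity function are hypothetical; they only illustrate that observed evidence wins, inventory fills gaps, and the chosen value keeps its provenance.

```go
package main

import "fmt"

// Source records where a selected value came from.
type Source string

const (
	Observed  Source = "observed"
	Inventory Source = "inventory"
)

// Value is a selected field value together with its provenance.
type Value struct {
	Val    string
	Source Source
}

// selectIdentity prefers the observed value and uses the inventory value
// only when nothing was observed. It never replaces observed evidence.
func selectIdentity(observed, inventory string) (Value, bool) {
	switch {
	case observed != "":
		return Value{Val: observed, Source: Observed}, true
	case inventory != "":
		return Value{Val: inventory, Source: Inventory}, true
	}
	return Value{}, false
}

func main() {
	v, _ := selectIdentity("fw-edge-01", "fw-edge-01.example.net")
	fmt.Printf("host.name=%s (provenance: %s)\n", v.Val, v.Source)

	v, _ = selectIdentity("", "fw-edge-01.example.net")
	fmt.Printf("host.name=%s (provenance: %s)\n", v.Val, v.Source)
}
```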

VictoriaLogs stream fields

VictoriaLogs stream labels are now config-driven and template-based under outputs.victorialogs.stream_fields. The field list accepts either the legacy CSV string or a YAML list.

Supported syntax:

direct: stream_host
alias: device={observer.serial_number}
static: job=integrations/syslog-ecs-analyzer
templated static: stream=sophos-{vendor.payload.log_type}

Missing field behavior is controlled by:

outputs.victorialogs.missing_field_action: skip or fallback
outputs.victorialogs.missing_field_fallback: fallback value used when missing_field_action=fallback

Sophos XGS-oriented example:

outputs:
  victorialogs:
    enabled: true
    url: http://127.0.0.1:9428/insert/jsonline
    stream_fields:
      - job=integrations/syslog-ecs-analyzer
      - stream_host
      - device={observer.serial_number}
      - product={observer.product}
      - log_type={vendor.payload.log_type}
      - network_scope={syslog_analyzer.network.scope}
    time_field: _time
    missing_field_action: skip

Notes:

Nested source paths are resolved safely at runtime.
Missing fields never panic the sink.
host={host.name} is handled safely by emitting host.name as the effective stream field.
Other aliases or static names that would collide with structured ECS object roots are rejected at sink startup. Prefer non-conflicting aliases such as stream_host.
Stream template expansion is isolated to the VictoriaLogs sink and does not alter ECS normalization or enrichment decisions.
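
The four syntax forms and the missing-field handling can be sketched in Go as below. This is an illustration under assumptions: renderField and lookup are hypothetical names, the fallback value is substituted per missing placeholder (the source does not say whether it replaces the placeholder or the whole value), and the host={host.name} special case and the startup collision check are not modeled.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var placeholder = regexp.MustCompile(`\{([^{}]+)\}`)

// lookup resolves a dotted path to a scalar leaf in a nested event.
func lookup(event map[string]any, path string) (string, bool) {
	var cur any = event
	for _, part := range strings.Split(path, ".") {
		m, ok := cur.(map[string]any)
		if !ok {
			return "", false
		}
		if cur, ok = m[part]; !ok {
			return "", false
		}
	}
	if _, isMap := cur.(map[string]any); isMap || cur == nil {
		return "", false
	}
	return fmt.Sprint(cur), true
}

// renderField expands one stream_fields entry. ok=false means the field
// is skipped because a referenced path is missing and action is "skip".
func renderField(spec string, event map[string]any, action, fallback string) (name, value string, ok bool) {
	name, tmpl, hasTmpl := strings.Cut(spec, "=")
	if !hasTmpl {
		tmpl = "{" + name + "}" // direct form: field name doubles as source path
	}
	missing := false
	value = placeholder.ReplaceAllStringFunc(tmpl, func(m string) string {
		if v, found := lookup(event, m[1:len(m)-1]); found {
			return v
		}
		missing = true
		return fallback
	})
	if missing && action != "fallback" {
		return name, "", false
	}
	return name, value, true
}

func main() {
	event := map[string]any{
		"stream_host": "fw01",
		"observer":    map[string]any{"serial_number": "X123"},
		"vendor":      map[string]any{"payload": map[string]any{"log_type": "Firewall"}},
	}
	specs := []string{
		"stream_host",
		"device={observer.serial_number}",
		"job=integrations/syslog-ecs-analyzer",
		"stream=sophos-{vendor.payload.log_type}",
		"product={observer.product}",
	}
	for _, spec := range specs {
		name, value, ok := renderField(spec, event, "skip", "")
		fmt.Printf("%-45s -> name=%s value=%q emitted=%v\n", spec, name, value, ok)
	}
}
```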

DNS runtime

DNS enrichment now uses a bounded worker pool with background reverse lookups. Operational behavior:

cache hits enrich immediately
cache misses enqueue a lookup and do not block the pipeline
positive and negative results are cached separately
observed or inventory-backed host identity is never overwritten by PTR results
source/destination scope classification is used before deciding lookup eligibility
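
The cache-hit and cache-miss behavior can be sketched as follows. This is a simplified illustration, not the analyzer's implementation: PTRCache and its methods are hypothetical, one map with separate positive and negative TTLs stands in for separately cached results, and duplicate enqueues and cache-size eviction are not modeled. A real worker could resolve queued addresses with net.DefaultResolver.LookupAddr and then call Store.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type ptrEntry struct {
	name    string // empty for a negative result
	expires time.Time
}

// PTRCache caches reverse-lookup results and hands misses to workers.
type PTRCache struct {
	mu          sync.Mutex
	entries     map[string]ptrEntry
	positiveTTL time.Duration
	negativeTTL time.Duration
	Queue       chan string // bounded lookup queue read by workers
}

// NewPTRCache takes TTLs in milliseconds, mirroring the *_ms config keys.
func NewPTRCache(cacheTTLMs, negativeTTLMs, queueSize int) *PTRCache {
	return &PTRCache{
		entries:     make(map[string]ptrEntry),
		positiveTTL: time.Duration(cacheTTLMs) * time.Millisecond,
		negativeTTL: time.Duration(negativeTTLMs) * time.Millisecond,
		Queue:       make(chan string, queueSize),
	}
}

// Lookup returns a cached result immediately. On a miss it enqueues the
// address without blocking; if the queue is full the request is dropped.
func (c *PTRCache) Lookup(ip string) (name string, hit bool) {
	c.mu.Lock()
	e, ok := c.entries[ip]
	c.mu.Unlock()
	if ok && time.Now().Before(e.expires) {
		return e.name, true // a negative hit returns "" with hit=true
	}
	select {
	case c.Queue <- ip:
	default:
	}
	return "", false
}

// Store records a worker result; an empty name is a negative result.
func (c *PTRCache) Store(ip, name string) {
	ttl := c.positiveTTL
	if name == "" {
		ttl = c.negativeTTL
	}
	c.mu.Lock()
	c.entries[ip] = ptrEntry{name: name, expires: time.Now().Add(ttl)}
	c.mu.Unlock()
}

func main() {
	cache := NewPTRCache(60_000, 5_000, 128)
	if _, hit := cache.Lookup("192.0.2.10"); !hit {
		fmt.Println("miss: lookup enqueued, pipeline continues")
	}
	ip := <-cache.Queue                  // a worker would pick this up
	cache.Store(ip, "host.example.net.") // simulated PTR result
	name, hit := cache.Lookup(ip)
	fmt.Println("second lookup:", name, hit)
}
```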

Config surface:

enrichment.dns.hosts_path
enrichment.dns.timeout_ms
enrichment.dns.workers
enrichment.dns.queue_size
enrichment.dns.cache_ttl_ms
enrichment.dns.negative_ttl_ms
enrichment.dns.cache_size
enrichment.dns.resolve_private
enrichment.dns.resolve_public
enrichment.dns.servers

Emitted fields:

source.dns.ptr_name
destination.dns.ptr_name
compatibility retention in source.domain / destination.domain
internal host PTR evidence in syslog_analyzer.enrich.host.ptr_name

Tuning guidance:

keep resolve_private=false unless private PTR data is operationally useful
increase workers only when DNS lookup latency is high enough that lookups queue up at the current event rate
increase negative_ttl_ms to suppress repeated NXDOMAIN/timeouts under noisy traffic
use hosts_path for deterministic local overrides before enabling live reverse lookups

Reputation runtime

Reputation enrichment remains additive only. It never rewrites event.action, event.outcome, severity, or core ECS identity fields.

Config surface:

enrichment.reputation.ip_path
enrichment.reputation.cache_ttl_ms
enrichment.reputation.negative_ttl_ms
enrichment.reputation.cache_size
enrichment.reputation.reload_interval_ms
enrichment.reputation.provider_name

Behavior:

exact IP matches are cached for repeated lookups
misses are negative-cached to avoid repeated table scans
provider/feed context is preserved in threat.enrichments
optional reload checks can refresh the table without process restart
public/private scope classification is used before deciding lookup eligibility

Sophos XGS example:

a src_ip or dst_ip value that matches the reputation table produces additive threat.enrichments[*].indicator.* metadata while leaving the original firewall action unchanged

IP scope classification

The analyzer now derives a shared IP scope classification for:

source.ip
destination.ip
source.nat.ip
destination.nat.ip

Single-IP scope values:

private
public
unknown

Overall event scope values:

private
public
mixed
unknown

Classification rules:

invalid or missing IP: unknown
RFC1918 private IPv4, IPv6 ULA, loopback, and link-local addresses: private
valid non-private, non-unspecified, non-multicast addresses: public
unspecified and multicast addresses: unknown
event scope uses source.ip and destination.ip
- both private: private
- both public: public
- one private and one public: mixed
- if either side is missing or unknown: unknown
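
These rules map closely onto Go's net/netip predicates. The sketch below is a minimal illustration with hypothetical function names (classifyIP, eventScope). It checks unspecified and multicast addresses first, so link-local multicast is classified as unknown, and it treats only the ranges covered by netip's IsPrivate (RFC1918 and IPv6 ULA) as private; both choices are assumptions about edge cases the rules above do not spell out.

```go
package main

import (
	"fmt"
	"net/netip"
)

// classifyIP returns "private", "public", or "unknown" for one address.
func classifyIP(s string) string {
	addr, err := netip.ParseAddr(s)
	if err != nil {
		return "unknown" // invalid or missing IP
	}
	addr = addr.Unmap() // treat IPv4-mapped IPv6 as IPv4
	switch {
	case addr.IsUnspecified(), addr.IsMulticast():
		return "unknown"
	case addr.IsPrivate(), addr.IsLoopback(), addr.IsLinkLocalUnicast():
		return "private"
	default:
		return "public"
	}
}

// eventScope combines the source and destination classifications.
func eventScope(src, dst string) string {
	s, d := classifyIP(src), classifyIP(dst)
	switch {
	case s == "unknown" || d == "unknown":
		return "unknown"
	case s == d:
		return s
	default:
		return "mixed"
	}
}

func main() {
	for _, ip := range []string{"192.168.0.19", "203.125.116.98", "fd00::1", "224.0.0.1", ""} {
		fmt.Printf("%-16q -> %s\n", ip, classifyIP(ip))
	}
	fmt.Println("event scope:", eventScope("192.168.0.19", "203.125.116.98"))
}
```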

Internal fields exposed in normalized output:

syslog_analyzer.source.scope
syslog_analyzer.destination.scope
syslog_analyzer.source.nat.scope
syslog_analyzer.destination.nat.scope
syslog_analyzer.network.scope

Eligibility use:

GeoIP/ASN: public only
DNS: controlled by enrichment.dns.resolve_private and enrichment.dns.resolve_public
Reputation: public only

The classification is additive metadata only. It does not overwrite ECS network.* or any observed IP field. Further scope details are in docs/architecture/phase6-ip-scope.md.

Kafka downstream scoring output

The Kafka sink follows a queue-backed, production-oriented pattern aligned with flowcollector-go, but adapted to syslog events. Kafka is the intended production handoff boundary to the external anomaly-scorer. This repository prepares a stable normalized contract; it does not perform production anomaly detection itself.

Key strategy

Kafka keys use a deterministic syslog grouping shape:

identity|family|discriminator
identity: observer.serial_number, then host.name, then observer.name, then log.syslog.hostname, then source.ip
family: vendor.payload.log_type + vendor.payload.log_component when present, then event.category, then process.name, then log.syslog.appname, then observer.product, then observer.vendor
discriminator: event.code, then vendor.payload.log_subtype or event.action, then log.syslog.msgid, then the literal value generic

This keeps partitioning stable per device and log family while avoiding the high-cardinality fallback of hashing every message body. Same key means same Kafka partition because the sink uses Kafka hash partitioning.
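
The fallback chains can be sketched in Go as below. This is an illustration only: kafkaKey and firstNonEmpty are hypothetical names, the event is assumed to be available as a flat map of dotted field names, and joining log_type and log_component with "/" is an assumption because the source does not specify the separator. Empty identity or family segments are left empty here rather than replaced.

```go
package main

import (
	"fmt"
	"strings"
)

// firstNonEmpty returns the first non-empty value among the given keys.
func firstNonEmpty(fields map[string]string, keys ...string) string {
	for _, k := range keys {
		if v := fields[k]; v != "" {
			return v
		}
	}
	return ""
}

// kafkaKey builds identity|family|discriminator from flat event fields.
func kafkaKey(f map[string]string) string {
	identity := firstNonEmpty(f,
		"observer.serial_number", "host.name", "observer.name",
		"log.syslog.hostname", "source.ip")

	family := f["vendor.payload.log_type"]
	if family != "" {
		if component := f["vendor.payload.log_component"]; component != "" {
			family += "/" + component // separator is an assumption
		}
	} else {
		family = firstNonEmpty(f,
			"event.category", "process.name", "log.syslog.appname",
			"observer.product", "observer.vendor")
	}

	discriminator := firstNonEmpty(f,
		"event.code", "vendor.payload.log_subtype", "event.action",
		"log.syslog.msgid")
	if discriminator == "" {
		discriminator = "generic"
	}
	return strings.Join([]string{identity, family, discriminator}, "|")
}

func main() {
	fmt.Println(kafkaKey(map[string]string{
		"observer.serial_number":       "X123",
		"vendor.payload.log_type":      "Firewall",
		"vendor.payload.log_component": "Firewall Rule",
		"event.code":                   "010101600001",
	}))
	fmt.Println(kafkaKey(map[string]string{
		"host.name":      "fw01",
		"event.category": "network",
	}))
}
```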

Downstream contract intent

The stable downstream contract is centered on normalized semantic fields and compact quality signals, especially:

syslog_analyzer.semantic.family
event.category
event.type
event.action
event.outcome
event.code
event.severity
network.transport
source/destination presence and scope
observer.name and host.name when available
explainability summaries such as explain counts or top skip reason
confidence summaries such as family_confidence and mapping_confidence

outputs.kafka.ml_mode remains a compatibility switch for downstream consumers that want explicit producer-side metadata such as ml.mode and ml.key. It does not turn this process into the primary anomaly engine. When the normalized event already contains enough evidence, ml_mode also emits the scorer-facing canonical ML fields expected by flowcollector-ml, including:

flow.client.ip.addr
flow.server.ip.addr
flow.client.l4.port.id
flow.server.l4.port.id
l4.proto.name

flow.bytes and flow.packets remain intentionally absent for syslog-derived events unless the analyzer owns a safe total-counter semantic.

Producer and routing config

Kafka config surface:

outputs.kafka.brokers
outputs.kafka.topic
outputs.kafka.client_id
outputs.kafka.acks
outputs.kafka.compression
outputs.kafka.linger_ms
outputs.kafka.batch_size
outputs.kafka.retries
outputs.kafka.retry_backoff_ms
outputs.kafka.queue_max_messages
outputs.kafka.queue_block_ms
outputs.kafka.key_mode
outputs.kafka.ml_mode

Minimal routing controls:

outputs.kafka.route.enabled_path
outputs.kafka.route.log_types
outputs.kafka.route.event_categories
outputs.kafka.route.event_codes

Example:

outputs:
  kafka:
    enabled: true
    brokers:
      - kafka:9092
    topic: normalized.ml.events
    client_id: syslog-ecs-analyzer
    acks: all
    compression: zstd
    linger_ms: 250
    batch_size: 262144
    retries: 5
    queue_max_messages: 8192
    queue_block_ms: 0
    key_mode: syslog_identity
    ml_mode: true
    route:
      log_types: [Firewall, Content Filtering]
      event_codes: ["010101600001", "050901616001"]

Runtime behavior

Kafka delivery is isolated behind an internal queue and background worker.
Write enqueues events quickly so Kafka backpressure does not stall the rest of the output path.
Queue-full conditions are counted as Kafka drops instead of breaking VictoriaLogs delivery.
Delivery retries use bounded exponential backoff.
/status now includes Kafka queue depth and delivery/error counters when Kafka output is enabled.
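
The queue isolation and bounded backoff can be sketched in Go as below. This is a minimal illustration under assumptions: the Sink type is hypothetical, the non-blocking Write corresponds to reading queue_block_ms: 0 as "never block" (an interpretation, not confirmed by the source), and the actual Kafka producer call is left out.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// Sink buffers encoded events for a background delivery worker.
type Sink struct {
	queue   chan []byte
	dropped atomic.Uint64
}

func NewSink(queueMaxMessages int) *Sink {
	return &Sink{queue: make(chan []byte, queueMaxMessages)}
}

// Write enqueues without blocking. A full queue counts as a drop so the
// caller, and any other output, is never stalled by Kafka backpressure.
func (s *Sink) Write(event []byte) bool {
	select {
	case s.queue <- event:
		return true
	default:
		s.dropped.Add(1)
		return false
	}
}

func (s *Sink) Depth() int      { return len(s.queue) }
func (s *Sink) Capacity() int   { return cap(s.queue) }
func (s *Sink) Dropped() uint64 { return s.dropped.Load() }

// backoff doubles the delay per attempt (0-based) and caps it at max.
func backoff(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt && d < max; i++ {
		d *= 2
	}
	if d > max {
		d = max
	}
	return d
}

func main() {
	sink := NewSink(2)
	for i := 0; i < 3; i++ {
		sink.Write([]byte(fmt.Sprintf(`{"seq":%d}`, i)))
	}
	fmt.Printf("depth=%d capacity=%d dropped=%d\n",
		sink.Depth(), sink.Capacity(), sink.Dropped())
	for attempt := 0; attempt < 5; attempt++ {
		fmt.Println("retry", attempt, "after", backoff(attempt, 100*time.Millisecond, time.Second))
	}
}
```

Depth, Capacity, and Dropped correspond to the kind of queue and drop counters that a status endpoint could expose.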

Observability

Kafka status counters include:

sent events
failed events
dropped events
skipped events
retry attempts
encode errors
queue depth
queue capacity

Limitations

Topic selection is currently one configured topic per sink instance.
Per-family or per-device topics are intentionally not implemented yet.
Background delivery failures are surfaced through Kafka sink metrics rather than by failing the primary VictoriaLogs path.
Stateful anomaly scoring, temporal baselines, and alerting remain downstream concerns.

Offline corpus evaluation

The offline corpus runner and its feature/baseline/scoring modes are evaluation tooling. They are kept in-repo to:

validate normalization quality
test whether exported semantic features are useful downstream
provide regression evidence across corpora
prototype score-feature usefulness before handing ideas to the external scorer

These offline tools do not redefine the production architecture. Production anomaly scoring remains downstream via Kafka.

Victoriaflow

The analyzer is integrated as an isolated test component via:

/daten/victoriaflow/docker/docker-compose.syslog-ecs-analyzer-phase1.yml
/daten/victoriaflow/config/syslog-ecs-analyzer/config.yaml
/daten/victoriaflow/config/syslog-ecs-analyzer/inventory.yaml
/daten/victoriaflow/config/syslog-ecs-analyzer/replay.log

Primary proof boundary:

VictoriaLogs through the existing obswrapper query path on http://127.0.0.1:3100

Secondary structured proof boundary:

/daten/victoriaflow/volumes/syslog-ecs-analyzer/events.ndjson

The exact build, start, verify, stop, and rollback commands are in docs/testing/victoriaflow-test-integration.md. The full-image variant is documented in docs/testing/victoriaflow-full-image.md.

Full image defaults

One full-featured image is built from Dockerfile. The features compiled into that image are:

syslog input
VictoriaLogs output
Kafka output
static inventory enrichment
IP scope classification
GeoIP/ASN enrichment
DNS enrichment
IP reputation enrichment
dynamic VictoriaLogs stream fields
health and status endpoints

Design intent

keep stage boundaries explicit
keep future parser/decoder/resolver/normalizer implementations replaceable behind interfaces
keep enrichment modular and provenance-aware
treat flowcollector-go as a reference for enrichment and output patterns only where it fits syslog semantics

Further design details for the runtime extensions are in docs/architecture/phase6.md. The full-image packaging notes are in docs/architecture/phase6-full-image.md.