The Vector team is pleased to announce version 0.21.0!
Be sure to check out the upgrade guide for breaking changes in this release.
In addition to the new features, enhancements, and fixes listed below, this release adds:
A redis source to complement the redis sink.
The kubernetes_logs source has been rewritten to use the community-supported kube-rs library. We expect that this will resolve some long-outstanding bugs with Vector ceasing to process container logs. It also adds support for Kubernetes authentication token rotation.
We made additional performance improvements this release, increasing the average throughput by up to 50% for common topologies (see our soak test framework).
Also, check out our new guide on using vector tap for observing events running through Vector instances.
The kubernetes_logs source can panic while processing Kubernetes watcher events when there is an error. #12245. Fixed in 0.21.1.
The elasticsearch sink fails to include the security token when signing requests for AWS authentication to OpenSearch. #12249. Fixed in 0.21.1.
The nats source and sink authentication options were not configurable. #12262. Fixed in 0.21.1.
The internal_logs source includes excess trace logs whenever vector top is used. #12251. Fixed in 0.21.1.
The aws_cloudwatch_logs sink does not handle throttle responses from AWS. #12253. Fixed in 0.21.1.
encoding.only_fields. #12256. Fixed in 0.21.1.
0.21.1.
assume_role on AWS components did not function correctly. #12314. Fixed in 0.21.1.
0.21.2.
/var/lib/vector to start correctly when the default data_dir of is used. #12413. Fixed in 0.21.2.
0.21.2 adds a new option, load_timeout_secs that can be configured to a higher value.
vector generate works again with the datadog_agent source. #12469. Fixed in 0.21.2.
The assume_role configuration for AWS components doesn’t cache the credentials, resulting in a high number of calls to AssumeRole. This was fixed in 0.22.0 via awslabs/smithy-rs#1296.
The nats sink and nats source now support TLS and authentication via username/password, JWT, token, NKey, and client certificate.
A redis source was added to complement the existing redis sink. It supports fetching data by subscribing to a pub/sub channel or popping from a list.
We are in the process of updating all Vector components with consistent instrumentation as described in Vector’s component specification. With this release we have instrumented the following sources with these new metrics:
mongodb_metrics
postgresql_metrics
socket
statsd
As well as all transforms.
tls options can now be configured on AWS sinks. This is useful when using AWS-compatible endpoints where the certificates may not be trusted by the local store.
The end-to-end acknowledgements configuration, acknowledgements, was moved from sources to sinks. When set on a sink, all connected sources that support acknowledgements are configured to wait for the sink to acknowledge before acknowledging the client. Setting acknowledgements on sources is now deprecated. See the upgrade guide for more details.
vector tap and vector top have had a few enhancements.
vector tap:
--inputs-of
--quiet to suppress these messages.
The --meta flag was added to include metadata about which component the output events came from.
Both vector top and vector tap now automatically reconnect if the remote Vector instance goes away. This behavior can be disabled by passing --no-reconnect.
Initial support was added for ingesting traces from the Datadog Agent into Vector (via the
datadog_agent source) and forwarding them to the Datadog API (via the new datadog_traces sink). Note
that currently APM metrics are dropped and so you will be missing these statistics in Datadog if you
forward traces to it through Vector. We will be following up to add support for APM metrics to Vector.
Datadog docs are forthcoming but the Agent configuration option, apm_config.apm_dd_url, can be used to
forward traces from the Datadog Agent to Vector.
The scrape_interval_secs configuration option of the internal_metrics source can now be fractional seconds.
to_timestamp now accepts an optional unit argument to control how numeric Unix timestamp arguments are interpreted. For example, unit can be set to milliseconds if the incoming timestamps are Unix millisecond timestamps. It defaults to seconds to maintain current behavior.
The vector user is now added to the systemd-journal-remote group, if it exists, to facilitate Vector being used to collect remote journald logs.
The loki sink now supports setting out_of_order_action to accept to instruct Vector to not modify event timestamps. Vector would previously modify timestamps to attempt to satisfy Loki’s ordering constraints, but these constraints were relaxed in Loki 2.4. If you are running Loki >= 2.4, it is recommended to set out_of_order_action to accept so that Vector can send data concurrently.
All AWS components were migrated to the new AWS SDK from the end-of-life rusoto SDK. This new SDK supports IMDSv2 for authentication. See the upgrade guide for more information.
The journald source now supports a since_now option to instruct Vector to only fetch journal entries that occur after Vector starts.
The blackhole sink can now be disabled by setting print_interval_secs to 0.
The route transform now has an _unmatched route that can be consumed to receive events that did not match any of the other defined routes.
New ip_ntop and ip_pton VRL functions, which can convert IPv6 addresses to and from their byte and string representations.
New is_empty VRL function, which returns whether the given object, array, or string is empty.
The splunk_hec source now accepts events on /services/collector. This route is an alias for /services/collector/event.
The datadog_metrics sink now allows configuration of TLS via the standard tls options.
The aws_ec2_metadata transform now allows fetching the account-id field (this field must be opted into).
Additional options have been added to the aws_sqs source:
delete_message to control whether messages are deleted after processing. This is useful for testing out the source.
visibility_timeout_secs to control how long messages are locked for before being rereleased to be processed again. Tuning this is useful for controlling how long a message will be “sent” if a Vector instance crashes before deleting the message.
These options mirror those that existed for the aws_s3 source for its SQS configuration.
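As a rough illustration of what the two aws_sqs options above control, the following Python sketch simulates a queue with a visibility timeout. It is a toy model, not Vector's or AWS's implementation: the ToyQueue class, its methods, and the explicit now argument are all hypothetical names introduced for this example.

```python
class ToyQueue:
    """Toy in-memory queue that mimics an SQS-style visibility timeout (hypothetical)."""

    def __init__(self, visibility_timeout_secs):
        self.visibility_timeout_secs = visibility_timeout_secs
        # message id -> (body, time at which the message becomes visible again)
        self._messages = {}

    def send(self, msg_id, body):
        # New messages are visible immediately.
        self._messages[msg_id] = (body, 0.0)

    def receive(self, now):
        # Return the first visible message and hide it for the timeout period.
        for msg_id, (body, visible_at) in self._messages.items():
            if visible_at <= now:
                self._messages[msg_id] = (body, now + self.visibility_timeout_secs)
                return msg_id, body
        return None

    def delete(self, msg_id):
        # Corresponds to deleting a message after it has been processed.
        self._messages.pop(msg_id, None)


queue = ToyQueue(visibility_timeout_secs=30)
queue.send("m1", "hello")

first = queue.receive(now=0)     # a consumer picks the message up
hidden = queue.receive(now=10)   # still locked: nothing is returned
again = queue.receive(now=30)    # never deleted, so it is released again
queue.delete("m1")
gone = queue.receive(now=100)    # deleted messages never come back
```

In this model, a consumer that crashes before calling delete leaves the message locked until the timeout expires, which is the window that visibility_timeout_secs tunes. Never calling delete resembles turning delete_message off: the same message keeps coming back, which is convenient when testing.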
The kubernetes_logs source has been rewritten to use the community-supported
kube-rs library. We expect that this will resolve some long-outstanding
bugs with Vector ceasing to process container logs. It also adds support for Kubernetes
authentication token rotation.
See the highlight for more details.
proxy configuration can now include a username and password encoded into the URL of the proxy, like http://john:password@my.proxy.com.
The strlen function was added to VRL to complement the length function. The length function, when given a string, returns the number of bytes in that string. The strlen function returns the number of characters.
Users can now provide dynamic label names to the loki sink via a trailing wildcard. Example:
labels:
  pod_labels_*: "{{ kubernetes.pod_labels }}"
This is similar to the promtail configuration of:
- action: labelmap
  regex: __meta_kubernetes_pod_label_(.+)
  replacement: pod_labels_$1
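The effect of a trailing wildcard can be sketched in a few lines of Python. This is only an illustration of the idea, not the loki sink's code: the function names are hypothetical, the label configuration is simplified to map a label name to a dotted field path instead of a template, and any sanitizing of label names that the sink may perform is ignored.

```python
def lookup(event, path):
    """Follow a dotted path such as 'kubernetes.pod_labels' into a nested dict."""
    value = event
    for part in path.split("."):
        value = value[part]
    return value


def expand_labels(label_config, event):
    """Build a flat label set, expanding names that end in '*' (hypothetical helper)."""
    labels = {}
    for name, path in label_config.items():
        value = lookup(event, path)
        if name.endswith("*"):
            # The wildcard is replaced by each key of the referenced map.
            prefix = name[:-1]
            for key, item in value.items():
                labels[prefix + key] = str(item)
        else:
            labels[name] = str(value)
    return labels


event = {
    "host": "node-1",
    "kubernetes": {"pod_labels": {"app": "web", "tier": "frontend"}},
}
label_config = {"pod_labels_*": "kubernetes.pod_labels", "host": "host"}
labels = expand_labels(label_config, event)
```

Under these assumptions, each key of the pod label map becomes its own label (pod_labels_app, pod_labels_tier), much like the promtail labelmap action shown above.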
Query parameters can now be provided to the prometheus_scrape source that are sent to all configured endpoints via the new query option. This is useful when using Vector with a federated Prometheus endpoint.
The vector sink can now enable gzip compression by setting compression to true.
The splunk_hec source’s healthcheck that is exposed at /services/collector/health no longer requires a HEC token. This matches the behavior of the Splunk forwarder and makes it easier to use with load balancers that cannot set this header.
The batch.timeout configuration on sinks can now include fractional seconds.
The /health endpoint (mounted when api.enabled is true) now returns a 503 when Vector is shutting down. This is useful when using a load balancer so that traffic is routed to other running Vector instances.
The lua transform now returns an error at configuration load time if an unknown field is present on hooks. This helps make typos more visible.
The docker_logs source now exits, shutting down Vector, if it hits an unrecoverable deserialization error. Previously it would just stall.
region and endpoint to be configured simultaneously. This is useful when using an AWS compatible API.
The gcp_stackdriver_logs sink now recognizes a severity of ER as ERROR.
0. This matches the behavior of division.
The journald source now flushes internal batches every 10 milliseconds, regardless of whether the batch is full. This avoids an issue where the source would wait a very long time to send data downstream when the volume was low but the batch size was configured high to handle spikes.
The geoip transform now avoids re-reading the database from disk randomly. This was unintended behavior. We have an open issue for reloading the database from disk during Vector’s reload process.
-q or -qq). Previously running with a log level below INFO would cause some instrumentation labels to be lost (like component_id).
prometheus_exporter sink. Previously these were not escaped and so resulted in invalid Prometheus export output that could not be scraped.
vector-${VECTOR_VERSION}-${PLATFORM}.deb to vector_${VECTOR_VERSION}_${PLATFORM}-${REV}.deb. REV
is typically 1.
The kafka source now reads the incoming message as raw bytes rather than trying to deserialize it as a UTF-8 string.
aws_sqs source which would previously only acknowledge the last message in each batch from SQS when acknowledgements were enabled.
The aws_sqs source would previously acknowledge events in SQS even if it failed to push them to downstream components. This has been corrected.
The parse_xml function now correctly parses the node attributes for solo nodes which have no siblings. Previously these node attributes were dropped.
vector top now reports error metrics correctly again.
buffer_received_events_total and buffer_received_bytes_total) now include the counts from discarded events. This was done to match the component metrics and to support future buffer on_full modes which may not discard events right away.
The socket source when in udp mode would previously include the port of the remote address in the enriched host field. This differed from the tcp mode, where only the host part of the remote address is enriched. This source no longer includes the port in the enriched host field. However, the socket source now has a port_key that can be set to opt into enrichment of the remote peer port as part of the address.
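The new behavior of the socket source can be pictured with a small Python sketch. It is an illustration only, not Vector's code: the enrich function is hypothetical, the default field name host is an assumption, and the peer address is modeled as a (host, port) tuple, which is how Python's own socket API reports UDP peers.

```python
def enrich(event, remote_addr, host_key="host", port_key=None):
    """Return a copy of the event enriched with the peer address (hypothetical helper)."""
    host, port = remote_addr
    enriched = dict(event)
    # Only the host part goes into the host field, in both tcp and udp mode.
    enriched[host_key] = host
    # The port is recorded only when a port_key has been configured.
    if port_key is not None:
        enriched[port_key] = port
    return enriched


peer = ("10.0.0.5", 51234)
default_event = enrich({"message": "hi"}, peer)
with_port = enrich({"message": "hi"}, peer, port_key="port")
```

In this sketch, default_event carries only the peer host, while with_port also records the port under the configured key.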
Vector’s published docker images no longer include VOLUME declarations. Instead, users should provide
a volume at runtime if they require one. This avoids the behavior of Vector creating a volume for its
data directory even if it is unused.
See the upgrade guide for more information.
We are in the process of adding a source for ingesting data from the OpenTelemetry collector and OpenTelemetry compatible tools. We are starting with traces, since this has stabilized, but will move on to metrics and logs.
We’ll also be adding an OpenTelemetry sink for forwarding data from Vector to OpenTelemetry-compatible APIs.
At long last, support for iteration in VRL is almost ready. We expect it to be included in the next release.
See the RFC for a preview of how this will work.
VRL now has lexical scoping for blocks. This means that variables defined inside a block in VRL (e.g.
an if condition block) are no longer accessible from outside that block. This breaking change
was made to support VRL’s forthcoming iteration feature, which requires it.
See the upgrade guide for how to migrate your VRL programs.