-
Notifications
You must be signed in to change notification settings - Fork 28
feat: remove agents on AgentDisconnect message #554
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
PTAL @elastic/ingest-otel-data |
switch attr.GetKey() { | ||
case semconv.AttributeServiceName: | ||
serviceParams.Name = attr.GetValue().GetStringValue() | ||
case semconv.AttributeDeploymentEnvironment: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it (or should it be):
deployment.environment
(as used here),service.environment
(as listed in the integration test below), ordeployment.environment.name
(the latest name in semconv, v1.27.0 deprecateddeployment.environment
: https://github.com/open-telemetry/semantic-conventions/blob/02681893953519164eefd086df9e20c7b15a159d/CHANGELOG.md?plain=1#L307)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great question! The best option is to use the most up-to-date semantic convention (semconv) attribute that represents the deployment environment name. So, if all the OpenTelemetry SDKs we plan to support with this extension already use deployment.environment.name
, then that’s the attribute we should go with.
I am going to change the expected value to the new deployment.environment.name
, as well the integration tests and docs for clarity. Note that Kibana, either if the Otel SDK is providing deployment.environment
or deployment.environment.name
, it will store it in the apm config index with the service.name
keyword (due classic SDKs). The following is an example of a Otel Java SDK config:
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 1,
"hits": [
{
"_index": ".apm-agent-configuration",
"_id": "abc",
"_score": 1,
"_source": {
"agent_name": "opentelemetry/java/elastic",
"service": {
"name": "dice-java-5",
"environment": "prod-otel"
},
"settings": {
"send_logs": "true",
"send_traces": "true",
"deactivate_instrumentations": "that-module-please"
},
"@timestamp": 1746175386213,
"applied_by_agent": false,
"etag": "abc"
}
}
]
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
switched to the new deployment.environment.name
attribute 5c39db9
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that Kibana, either if the Otel SDK is providing
deployment.environment
ordeployment.environment.name
, it will store it in the apm config index with theservice.name
keyword (due classic SDKs).
You mean service.environment
in that sentence, I believe.
switched to the new
deployment.environment.name
attribute
Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use deployment.environment
for iOS and Android agents. The iOS one will switch from using Central Config to using OpAMP afaik. Should we add a case for deployment.environment
to keep backwards compatibility? cc @bryce-b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, thanks. I think for Android we can just start sending the new attribute when we enable OpAMP, since its current version doesn't support central config anyway. For iOS, I think we could send both attributes so that people can use the same build with both central config and OpAMP? Today we have the agent's weekly meeting, let me ask for some advice there in case I might me missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good @LikeTheSalad. Because exposing an OpAMP endpoint, this extension should only be used for OTel SDKs (legacy SDKs can use the elasticapmreceiver which exposes an intake v2 endpoint), does the iOS OTel SDK use central config already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does the iOS OTel SDK use central config already?
It does.
legacy SDKs can use the elasticapmreceiver which exposes an intake v2 endpoint
Does this mean that elasticapmreceiver can act as an APM Server replacement, in that it provides central config too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that elasticapmreceiver can act as an APM Server replacement, in that it provides central config too?
Yes, that's the idea. These are the http endpoints it exposes: "/config/v1/agents" and "/intake/v2/events".
As mentioned, it looks like the iOS SDK is using the deployment.environment
resource attribute, thus it would make sense to have back-wards compatibility if we want OpAMP to be a drop-in replacement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the weekly meeting, the new SDKs should use the new name when adding support for OpAMP config. And I think that if the elasticapmreceiver
will always be available, then I don't see why we should make the OpAMP route backwards compatible. In terms of an agent that should support both, the APM Server and the OTel Collector routes, I'm only aware of the iOS one, so I'll loop @bryce-b in just in case, though it seems like relying on the elasticapmreceiver
should do.
The goal of this PR was to remove entries from the central configuration fetcher map
uidToService map[string]agentcfg.Service
, but the agent's service catching logic was moved to the already existing sync.Map in the OpAMP callbacks → the fetcher does not need to deal with AgentDisconnect messagesdeployment.environment.name
instead of the deprecateddeployment.environment
attribute.Fixes #398