
feat: add configHash field to Cluster (#25311) #27657

Open
matmil-dev wants to merge 5 commits into argoproj:master from matmil-dev:feature/25311-cluster-version

Conversation

@matmil-dev

@matmil-dev matmil-dev commented May 3, 2026

Closes #25311.

As discussed in issue #25311, this PR adds a configHash field to the Cluster structure which would aid external observers in detecting changes to Cluster configurations, since watching the respective Cluster API response is not a reliable approach due to certain sensitive fields being redacted.
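As a rough illustration of the idea (the type and function names below are hypothetical, not the PR's actual code), a content hash over the identity/config fields lets an external observer compare two API responses without ever seeing the redacted sensitive values:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// Hypothetical, simplified view of the fields the PR hashes; the real
// Cluster type lives in pkg/apis/application/v1alpha1/types.go.
type clusterIdentity struct {
	ID     string `json:"id"`
	Name   string `json:"name"`
	Server string `json:"server"`
	Config string `json:"config"` // serialized cluster config
}

// computeConfigHash sketches the approach: serialize the config-bearing
// fields deterministically (struct field order is fixed) and hash the
// result, so any config change produces a new observable value.
func computeConfigHash(c clusterIdentity) string {
	b, _ := json.Marshal(c)
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:])
}

func main() {
	a := clusterIdentity{ID: "1", Name: "prod", Server: "https://1.2.3.4", Config: "{}"}
	b := a
	b.Config = `{"tlsClientConfig":{}}`
	fmt.Println(computeConfigHash(a) != computeConfigHash(b)) // differing config → differing hash
}
```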

Pending further discussion on the issue.

As I am a first-time contributor, I am not sure how this feature fits into the "feature status" framework. I suppose it should be marked as alpha, but I am not sure the scope of the changes is large enough to even warrant this, so I would appreciate guidance.

Checklist:

  • Either (a) I've created an enhancement proposal and discussed it with the community, (b) this is a bug fix, or (c) this does not need to be in the release notes.
  • The title of the PR states what changed and the related issue number (used for the release note).
  • The title of the PR conforms to the "Title of the PR" guidelines.
  • I've included "Closes [ISSUE #]" or "Fixes [ISSUE #]" in the description to automatically close the associated issue.
  • I've updated both the CLI and UI to expose my feature, or I plan to submit a second PR with them.
  • Does this PR require documentation updates?
  • I've updated documentation as required by this PR.
  • I have signed off all my commits as required by DCO
  • I have written unit and/or e2e tests for my change. PRs without these are unlikely to be merged.
  • My build is green (troubleshooting builds).
  • My new feature complies with the feature status guidelines.
  • I have added a brief description of why this PR is necessary and/or what this PR solves.
  • Optional. My organization is added to USERS.md.
  • Optional. For bug fixes, I've indicated what older releases this fix should be cherry-picked into (this may or may not happen depending on risk/complexity).

@matmil-dev matmil-dev requested review from a team as code owners May 3, 2026 05:36

@codecov

codecov Bot commented May 3, 2026

Codecov Report

❌ Patch coverage is 75.00000% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.83%. Comparing base (d678193) to head (47c01f9).

Files with missing lines Patch % Lines
pkg/apis/application/v1alpha1/types.go 76.92% 2 Missing and 1 partial ⚠️
util/hash/hash.go 71.42% 1 Missing and 1 partial ⚠️
server/cluster/cluster.go 66.66% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master   #27657   +/-   ##
=======================================
  Coverage   63.83%   63.83%           
=======================================
  Files         418      418           
  Lines       57237    57257   +20     
=======================================
+ Hits        36537    36552   +15     
+ Misses      17292    17291    -1     
- Partials     3408     3414    +6     


@matmil-dev matmil-dev force-pushed the feature/25311-cluster-version branch from d6b2370 to 47c01f9 on May 3, 2026 18:11
Contributor

@ppapapetrou76 ppapapetrou76 left a comment


I have a few concerns/questions regarding the overall implementation:

  1. The term generation in k8s is used as an increasing integer tied to the persisted spec. Here the code uses a field with the same name as a content hash living in an ephemeral cache. This is semantically misleading, and I would suggest changing the field name to something like identityHash or configHash.
  2. This PR changes the public API, so this should be documented in the upgrade notes IMHO.
  3. I'm also worried about placing this field in ClusterInfo and not in the Cluster spec itself. ClusterInfo is a cache layer that resets, so I'm wondering if we really want to put it here.

Comment thread pkg/apis/application/v1alpha1/types.go Outdated
Comment thread server/cluster/cluster.go Outdated
Comment thread util/hash/hash.go
Contributor

@ppapapetrou76 ppapapetrou76 left a comment


One more thing that I missed in my previous review:

cluster.Info.Generation is never populated from the DB because Info is not persisted in the Kubernetes secret. The cluster.Info.Generation + 1 fallback in getUpdatedClusterInfo is therefore always equal to 0 + 1 = 1.

With that said, the test named "falls back to existing generation plus one when cluster has existing generation" tests something that cannot be reproduced in production.
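A minimal illustration of the problem described in this review (names are hypothetical, simplified from getUpdatedClusterInfo): since Info is never loaded from the secret, its Generation is always the zero value, and the fallback always yields 1.

```go
package main

import "fmt"

// Simplified stand-ins for the real types; Info is only ever held in
// the ephemeral cache, never reconstructed from the cluster secret.
type clusterInfo struct{ Generation int64 }
type cluster struct{ Info clusterInfo }

// getUpdatedGeneration mirrors the fallback under discussion:
// Info.Generation is never populated from the DB, so it is always 0,
// and the "existing generation plus one" branch always returns 1.
func getUpdatedGeneration(c cluster) int64 {
	return c.Info.Generation + 1
}

func main() {
	var c cluster // as reconstructed from the secret: Info stays at its zero value
	fmt.Println(getUpdatedGeneration(c)) // always 1
}
```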

@matmil-dev
Author

Thanks for the quick review!

I have a few concerns/questions regarding the overall implementation:

  1. The term generation in k8s is used as an increasing integer tied to the persisted spec. Here the code uses a field with the same name as a content hash living in an ephemeral cache. This is semantically misleading, and I would suggest changing the field name to something like identityHash or configHash.

Yep, I mentioned this in a comment on the issue, I'll paste it here for posterity:

The proposal was to add a resourceVersion field to the Cluster API response. However, if we go by the Kubernetes semantics of the equivalent resourceVersion field, it is updated every time the resource changes, including metadata-only changes. This means actions such as ArgoCD performing a refresh of the cluster would cause an update of the field which does not seem to align with the intention behind this change.
Hence, I named the field generation after another already existing metadata field in Kubernetes whose behavior more closely aligns with tracking changes to only the desired state/config of the resource. There's the semantic issue of the values of generation in Kubernetes being sequential (and the name of the field somewhat implying the sequential nature), while here that would not be the case, so I would like to discuss this. I suppose there's also the possibility that there's no need to fully synchronize with what the fields mean in k8s, in which case this point is reduced to preference.

I'm fine with whichever name seems the most suitable. I'm partial to your *Hash proposals.

  2. This PR changes the public API, so this should be documented in the upgrade notes IMHO.

Will make the change. I presume this would end up in 3.5?

  3. I'm also worried about placing this field in ClusterInfo and not in the Cluster spec itself. ClusterInfo is a cache layer that resets, so I'm wondering if we really want to put it here.

My motivation for that was that it seemed to fit in with the other information in ClusterInfo such as the serverVersion and dynamic values such as connectionState, however if it makes more sense for it to end up in the Cluster itself and have the value persisted in the underlying cluster secret, I am not opposed to that. This would also resolve your comment about the fallback not making much sense.

Some other open questions I left on the issue which I would also like to surface here:

  • The field's value is currently a hash of the cluster ID, name, server address and config fields. However, I would like to discuss which fields exactly should be included in the hash calculation, as there are some other candidates for inclusion such as namespaces or project, but I'm not sure where to draw the line.
  • Should this value be exposed on the cluster settings UI?
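The field-selection question can be made concrete: whichever fields are included, the hash changes exactly when one of them changes, so including a field like namespaces widens what counts as a "config change". A hypothetical sketch (field names assumed, not the PR's actual code):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// clusterHash hashes a chosen set of fields; namespaces is one of the
// candidate fields whose inclusion is still an open question. Fields
// are joined with a NUL separator so adjacent values cannot collide.
func clusterHash(id, name, server, config string, namespaces []string, includeNamespaces bool) string {
	parts := []string{id, name, server, config}
	if includeNamespaces {
		parts = append(parts, strings.Join(namespaces, ","))
	}
	sum := sha256.Sum256([]byte(strings.Join(parts, "\x00")))
	return hex.EncodeToString(sum[:])
}

func main() {
	ns := []string{"default"}
	without := clusterHash("1", "prod", "https://1.2.3.4", "{}", ns, false)
	with := clusterHash("1", "prod", "https://1.2.3.4", "{}", ns, true)
	fmt.Println(without != with) // including an extra field changes the hash value
}
```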

matmil-dev added 5 commits May 6, 2026 03:14
Signed-off-by: Mateja Milošević <mateja@matmil.dev>
Signed-off-by: Mateja Milošević <mateja@matmil.dev>
Signed-off-by: Mateja Milošević <mateja@matmil.dev>
Signed-off-by: Mateja Milošević <mateja@matmil.dev>
Signed-off-by: Mateja Milošević <mateja@matmil.dev>
@matmil-dev matmil-dev force-pushed the feature/25311-cluster-version branch from 47c01f9 to a4abd64 on May 6, 2026 01:52
@matmil-dev matmil-dev changed the title from "feat: add generation field to ClusterInfo (#25311)" to "feat: add configHash field to Cluster (#25311)" on May 6, 2026
@matmil-dev
Author

I've pushed an update to change over to a configHash field on the Cluster object itself. The hash is persisted in the underlying secret, however I have some concerns about this.

If the backing secret is edited directly, either using kubectl or some other mechanism, the system, as currently architected, has no opportunity to persist a new value for the hash immediately after the edit. Edits made through other means are fine and a new hash is immediately calculated and persisted. This effectively means that the secret itself will contain a stale value for the hash (if the edits made to the secret involved the config), even though a correct updated value will be disseminated to consumers using the API or CLI (as per the current implementation which recalculates the hash on every secretToCluster). The stale value would be replaced with the correct one if any programmatic update calls are made at any point, though, as the updates call clusterToSecret.

I personally dislike this behavior, but I am not sure it's avoidable without making major changes to the cluster secret persistence flow to allow rewriting the secret directly after an edit is observed via the ClusterInformer. I suppose one other option would be to not persist the hash at all and always calculate it on the fly (but still have it inside Cluster), which sounds fine to me, but I don't know whether it is important to Argo users to be able to read state such as this hash directly from the cluster secret. Maybe this is something to discuss in the maintainer meeting?
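The "recalculate on read" behavior described here can be sketched as follows (hypothetical, simplified names; the real function is secretToCluster in util/db): the hash value stored in the secret is ignored, and API/CLI consumers always receive a hash recomputed from the config actually present.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// secretData is a simplified stand-in for the cluster secret's fields.
type secretData struct {
	Server     string
	Config     string
	ConfigHash string // possibly stale if the secret was edited by hand
}

func hashOf(server, config string) string {
	sum := sha256.Sum256([]byte(server + "\x00" + config))
	return hex.EncodeToString(sum[:])
}

// secretToCluster mirrors the idea in the PR: disregard the persisted
// hash and recompute it from the current contents of the secret, so
// consumers never see the stale value.
func secretToCluster(s secretData) (server, configHash string) {
	return s.Server, hashOf(s.Server, s.Config)
}

func main() {
	s := secretData{Server: "https://1.2.3.4", Config: `{"a":1}`, ConfigHash: "stale"}
	_, h := secretToCluster(s)
	fmt.Println(h == hashOf(s.Server, s.Config)) // fresh hash; the stale persisted value is ignored
}
```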


Development

Successfully merging this pull request may close these issues.

[API] Cluster GET does not return complete configuration