
feat: implemented caching and etag support for Go#6985

Open
akhilmhdh wants to merge 9 commits into
mainfrom
feat/go-cache

Conversation

@akhilmhdh

@akhilmhdh akhilmhdh commented Jun 23, 2026

Member

Context

This PR implements the following in the Go sidecar:

  1. ETag implementation
  2. Caching implementation
  3. The new handler.go pattern, in which only the spec-defined API functions live in the handler
  4. Integration test improvements: tests are now grouped by domain.

The cache keys computed for a request will not match between Node.js and Go, because the two serialize the request differently. This is fine, since it means the two caches will not conflict.

Closes PLATFOR-386 and PLATFOR-473

Screenshots

Steps to verify the change

  1. Add the following to .env:
GOLANG_SIDECAR_URL=http://host.docker.internal:4040
GO_SIDECAR_SHADOW_ENABLED=true
GO_SIDECAR_SHADOW_SAMPLE_RATE=75
  2. Make a secrets request; the shadow request should succeed.
  3. Run make lint and make test; both should pass. Go etag and caching should work fine - but it should not be

Type

  • Fix
  • Feature
  • Improvement
  • Breaking
  • Docs
  • Chore

Checklist

  • Title follows the conventional commit format: type(scope): short description (scope is optional, e.g., fix: prevent crash on sync or fix(api): handle null response).
  • Tested locally
  • Updated docs (if needed)
  • Updated CLAUDE.md files (if needed)
  • Read the contributing guide

@akhilmhdh akhilmhdh requested a review from adilsitos June 23, 2026 09:19
@infisical-review-police


💬 Discussion in Slack: #pr-review-infisical-6985-feat-implemented-caching-and-etag-support-for-go

Posted by Review Police — reviews, comments, new commits, and CI failures will stream into this channel.

@linear

linear Bot commented Jun 23, 2026


PLATFOR-386

PLATFOR-473

@greptile-apps

greptile-apps Bot commented Jun 23, 2026

Contributor

Greptile Summary

This PR adds ETag support and encrypted Redis caching for the Go sidecar's secrets endpoints, refactors the secrets handler into a dedicated handler.go pattern, and improves the integration test structure by splitting a large test file into domain-focused modules.

  • ETag middleware (middlewares/etag.go) buffers responses and computes SHA-1 ETags for 2xx GET/HEAD replies, returning 304 when If-None-Match matches; it is mostly RFC 7232-compliant but does not handle the * wildcard or comma-separated ETag lists.
  • Secret cache service (secretcache/secretcache.go) stores permission-scoped, KMS-encrypted response payloads in Redis; however, the ETag lookup field omits PermissionRulesHash, so a role-permission change without a membership change can leave a stale ETag, causing clients to receive 304 when visible data has actually changed.
  • Node.js side calls invalidateSecretCacheByProjectId on environment create/update/delete to keep the Go cache consistent.
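The If-None-Match gap noted in the first bullet can be sketched as follows. This is a minimal illustration, not the middleware from this PR: `etagMatches` is a hypothetical helper, and the comma split assumes ETag values contain no commas inside their quotes.

```go
package main

import (
	"fmt"
	"strings"
)

// etagMatches reports whether an If-None-Match header value matches the
// current ETag. It handles the "*" wildcard and comma-separated lists, and
// uses weak comparison (the "W/" prefix is ignored), which is the comparison
// RFC 7232 specifies for If-None-Match.
func etagMatches(ifNoneMatch, current string) bool {
	ifNoneMatch = strings.TrimSpace(ifNoneMatch)
	if ifNoneMatch == "" {
		return false
	}
	if ifNoneMatch == "*" {
		// "*" matches any current representation.
		return true
	}
	want := strings.TrimPrefix(current, "W/")
	for _, candidate := range strings.Split(ifNoneMatch, ",") {
		candidate = strings.TrimPrefix(strings.TrimSpace(candidate), "W/")
		if candidate == want {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(etagMatches(`"abc", W/"def"`, `"def"`)) // true
	fmt.Println(etagMatches(`*`, `"def"`))              // true
	fmt.Println(etagMatches(`"abc"`, `"def"`))          // false
}
```

A middleware would call such a helper only for GET/HEAD requests with a 2xx result, and reply 304 when it returns true.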

Confidence Score: 3/5

The core secret retrieval and permission logic is sound, but the ETag invalidation gap means permission rule changes may not immediately surface to clients using conditional GET caching.

The ETag hash field in secretcache omits PermissionRulesHash. When a role's permissions are modified without touching membership rows, the stored ETag survives and clients receive 304 Not Modified under the old, now-incorrect permission scope. The window lasts up to 15 minutes. The rest of the PR — KMS-encrypted cache payloads, jitter TTLs, clean handler decomposition, and Node.js cache invalidation hooks — is well-structured.

backend-go/internal/services/secrets/secretcache/secretcache.go — the listSecretsEtagField function needs PermissionRulesHash included in the field key.
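One way to read this suggestion is sketched below. The struct, the function name `etagField`, and the field layout are assumptions for illustration only; the real `listSecretsEtagField` signature is not shown in this conversation.

```go
package main

import (
	"fmt"
	"strings"
)

// etagFieldInput is a hypothetical input struct; the field names follow the
// terms used in the review, not the actual Go types in the PR.
type etagFieldInput struct {
	ActorID               string
	PermissionFingerprint string
	PermissionRulesHash   string
	RequestParamsHash     string
}

// etagField builds the Redis hash field for a stored ETag. Including
// PermissionRulesHash means a change to a role's rules yields a new field,
// so the previously stored ETag is no longer found.
func etagField(in etagFieldInput) string {
	return strings.Join([]string{
		in.ActorID,
		in.PermissionFingerprint,
		in.PermissionRulesHash,
		in.RequestParamsHash,
	}, ":")
}

func main() {
	before := etagFieldInput{"actor-1", "fp-1", "rules-v1", "params-1"}
	after := before
	after.PermissionRulesHash = "rules-v2" // role permissions edited, membership unchanged

	fmt.Println(etagField(before))
	fmt.Println(etagField(after))
	fmt.Println(etagField(before) != etagField(after)) // true
}
```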

Important Files Changed

| Filename | Overview |
| --- | --- |
| backend-go/internal/services/secrets/secretcache/secretcache.go | New cache service for secrets with ETag support; ETag field key is missing PermissionRulesHash, which can cause stale 304 responses after role permission changes. |
| backend-go/internal/server/middlewares/etag.go | New ETag middleware; missing If-None-Match: * wildcard handling and comma-separated ETag list support per RFC 7232. |
| backend-go/internal/server/api/secrets/secret/handler.go | New secrets handler with clean dependency injection and ETag/cache integration. |
| backend-go/internal/server/api/secrets/secret/list_secrets.go | Core list-secrets logic with permission filtering and cache integration; PersonalOverridesPriority deduplication key can collide on hyphenated secret names. |
| backend-go/internal/libs/cache/hash.go | New SHA-256 URL-safe base64 hash utility; clean port of the TypeScript equivalent. |
| backend-go/internal/libs/jitter/jitter.go | Simple jitter utility for cache TTL spreading using math/rand/v2. |
| backend-go/internal/services/permission/permission.go | New GetPermissionFingerprint and PermissionRulesHash helpers added for cache keying. |
| backend-go/internal/keystore/keystore.go | HashGet, HashSet, and PgGetIntItem added to KeyStore interface and implementation. |
| backend/src/services/project-env/project-env-service.ts | Adds invalidateSecretCacheByProjectId calls on environment mutations to keep the Go cache consistent. |

Comments Outside Diff (1)

  1. backend-go/internal/server/api/secrets/secret/list_secrets.go, line 209-228 (link)

    P2 PersonalOverridesPriority deduplication key can collide

    The map key is built as sec.Secret.Key + "-" + sec.Secret.FolderID.String(). Because secret keys can contain hyphens, this could collide for certain key/folderID combinations. Using sec.Secret.Key + "\x00" + sec.Secret.FolderID.String() (a separator never valid in a key), or a composite struct as the map key, eliminates the ambiguity.
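The composite-struct alternative could look like the sketch below. The types `overrideKey` and `secret` are hypothetical stand-ins, and the short folder IDs are only there to make the ambiguity of a joined string visible; real folder IDs are UUIDs.

```go
package main

import "fmt"

// overrideKey is a hypothetical composite map key. Go compares struct keys
// field by field, so no separator character is needed.
type overrideKey struct {
	SecretKey string
	FolderID  string
}

// secret is a simplified stand-in for the secret type used in the PR.
type secret struct {
	Key      string
	FolderID string
	Value    string
}

// lastPerKey keeps the last secret seen for each (Key, FolderID) pair.
func lastPerKey(secrets []secret) map[overrideKey]secret {
	out := make(map[overrideKey]secret, len(secrets))
	for _, s := range secrets {
		out[overrideKey{SecretKey: s.Key, FolderID: s.FolderID}] = s
	}
	return out
}

func main() {
	secrets := []secret{
		{Key: "a-b", FolderID: "c", Value: "1"},
		{Key: "a", FolderID: "b-c", Value: "2"},
	}
	// Joined with "-", both entries would produce the string "a-b-c".
	fmt.Println(len(lastPerKey(secrets))) // 2
}
```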

Reviews (1): Last reviewed commit: "feat: reverted standalone change made an..."

Comment thread backend-go/internal/services/secrets/secretcache/secretcache.go
Comment thread backend-go/internal/server/middlewares/etag.go
Comment thread backend-go/internal/server/middlewares/etag.go
Comment thread backend-go/internal/services/secrets/secretcache/secretcache.go

@adilsitos adilsitos left a comment

Contributor

Some comments. I think the most important one is how the cache is being calculated.

Comment on lines +69 to +71
cleanup := func() {
	s.KMS.Close()
}

Contributor

I think this can become too big if we start adding new cleanup steps. Do you see any problem with making this a method on the Service struct that performs the cleanup? That way, main.go can call it on the service instance.

Member Author

Great suggestion!!
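A possible shape for the suggested cleanup method is sketched below. `Services`, `Register`, and `fakeKMS` are hypothetical names, and the sketch assumes each resource exposes `Close() error`; the actual `s.KMS.Close()` signature is not visible in this thread.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// Services is a hypothetical container for long-lived dependencies.
type Services struct {
	closers []io.Closer
}

// Register records a resource that must be released on shutdown.
func (s *Services) Register(c io.Closer) {
	s.closers = append(s.closers, c)
}

// Close releases resources in reverse registration order and joins any
// errors, so one failing Close does not hide the others.
func (s *Services) Close() error {
	var errs []error
	for i := len(s.closers) - 1; i >= 0; i-- {
		if err := s.closers[i].Close(); err != nil {
			errs = append(errs, err)
		}
	}
	return errors.Join(errs...)
}

// fakeKMS stands in for the KMS client from the diff.
type fakeKMS struct{ closed bool }

func (k *fakeKMS) Close() error {
	k.closed = true
	return nil
}

func main() {
	kms := &fakeKMS{}
	svc := &Services{}
	svc.Register(kms)

	// In main.go this would typically be `defer svc.Close()`.
	err := svc.Close()
	fmt.Println(kms.closed, err) // true <nil>
}
```

New resources then only need a `Register` call, and main.go keeps a single cleanup call.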

client redis.UniversalClient
type keyStore struct {
redis redis.UniversalClient
db pg.DB

Contributor

nit: Here we define db, but this is actually Postgres, right? Maybe we could name it postgres instead of db. Since the interface supports any db, I believe that would be more straightforward if we need another db in the future.

Member Author

Not sure about another database. I used pg.DB for two reasons:

  1. This is something used everywhere in the app, so most callers will get it as pg itself.
  2. I feel it would be strange to call it pg.Postgres.

Comment on lines +129 to +133
err := k.db.Replica().QueryRow(ctx, `
SELECT "integerValue"
FROM key_value_store
WHERE key = @key
AND ("expiresAt" IS NULL OR "expiresAt" > NOW())

Contributor

question: one pattern that I see a lot in our knex queries is that if we receive a transaction, we hit the primary database instead of the replica.

I looked at the usage of this and it seems that we are always hitting the replica (pgGetIntItem in keystore.ts), but the increment (pgIncrementBy.ts can receive a transaction, which will bump the wrong one). In my opinion, this should always query the main database, so we know that a cache was invalidated.

Contributor

I believe we are going to use this to keep items in memory, right? Not sure if it is necessary, but since we are using Go we could have a goroutine that checks for expired items and removes them. This would be lightweight, and it could run at a low frequency.

@@ -49,13 +58,21 @@ func (m *MemoryKeyStore) GetItem(_ context.Context, key string) (string, error)
func (m *MemoryKeyStore) SetExpiry(_ context.Context, key string, expiry time.Duration) (bool, error) {

Contributor

It feels confusing to me that SetExpiry is the only method that applies changes to both hashes and items. Maybe we could give it a different name? Or have an expiry for each field, one for hashes and the other for items.

Comment on lines +214 to +216
resp := NewListSecretsRawV3ResponseData(&ListSecretsRawV3Response{
	Secrets: []SecretRaw{},
})

Contributor

This is also returning the empty body, shouldn't we return only the headers with the 304 status code?

Contributor

Regarding the cache problems: the cache created by Node is different from the one created by Go, which is why I faced those problems. Not sure if this is intended, but maybe they should be the same? Just something we can keep in mind.

Contributor

from claude: reason for the difference between the cachekey on nodejs and go:

  The two never agree, so a client moving between the Go and Node backends
  (or Go reading an ETag Node wrote) will never get a 304, and neither
  will hit the other's cached payload.

  1. requestParamsHash diverges — this breaks the ETag field and the cache
  key

  This is the most likely thing you're actually seeing. The ETag is stored
  in Redis under field actorId:permissionFingerprint:requestParamsHash.
  That hash is computed completely differently:

  Go (list_secrets.go:102) passes a map[string]any to GenerateHash →
  json.Marshal. Go's json.Marshal sorts map keys alphabetically. Node's
  JSON.stringify uses insertion order. So even with identical keys/values
  the JSON strings differ → different SHA → different field.

  But the keys aren't even identical:

  Go map: environment, path, recursive, includeImports,
    expandSecretReferences, viewSecretValue, personalOverridesBehavior,
    tagSlugs, metadataFilter
  Node object: environment, path, recursive, includeImports,
    expandSecretReferences, expandPersonalOverrides,
    personalOverridesBehavior, secretImportReferencesBehavior,
    viewSecretValue, throwOnMissingReadValuePermission, ...params
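The key-ordering point can be reproduced in Go alone, as a small sketch: a map is marshaled with sorted keys, while a struct is marshaled in field declaration order, which here stands in for the insertion order of a JavaScript object. The type `nodeLikeParams` and the two-key payload are illustrative assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// nodeLikeParams stands in for a JavaScript object literal whose keys were
// inserted as path, then environment. Struct fields are marshaled in
// declaration order.
type nodeLikeParams struct {
	Path        string `json:"path"`
	Environment string `json:"environment"`
}

// encodings returns the JSON produced from a map and from the struct above
// for the same logical payload.
func encodings() (fromMap, fromStruct string, err error) {
	m, err := json.Marshal(map[string]any{"path": "/", "environment": "dev"})
	if err != nil {
		return "", "", err
	}
	s, err := json.Marshal(nodeLikeParams{Path: "/", Environment: "dev"})
	if err != nil {
		return "", "", err
	}
	return string(m), string(s), nil
}

func main() {
	fromMap, fromStruct, err := encodings()
	if err != nil {
		panic(err)
	}
	fmt.Println(fromMap)    // {"environment":"dev","path":"/"}
	fmt.Println(fromStruct) // {"path":"/","environment":"dev"}
}
```

Because the byte strings differ, any hash computed over them differs as well, even though the payloads are logically equal.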

Comment on lines +12 to +18
func GenerateHash(data any) string {
	jsonBytes, err := json.Marshal(data)
	if err != nil {
		return ""
	}
	return HashBytes(jsonBytes)
}

Contributor

Maybe we should return the error here? Or maybe even panic.
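An error-returning variant might look like the sketch below. The body of `HashBytes` is an assumption based on the summary's description (SHA-256, URL-safe base64); whether the real helper uses padding is not shown here.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// HashBytes is an assumed implementation: SHA-256, unpadded URL-safe base64.
func HashBytes(b []byte) string {
	sum := sha256.Sum256(b)
	return base64.RawURLEncoding.EncodeToString(sum[:])
}

// GenerateHash returns the marshal error instead of an empty string, so a
// caller cannot silently end up with "" as a cache key component.
func GenerateHash(data any) (string, error) {
	jsonBytes, err := json.Marshal(data)
	if err != nil {
		return "", fmt.Errorf("generate hash: %w", err)
	}
	return HashBytes(jsonBytes), nil
}

func main() {
	h, err := GenerateHash(map[string]any{"environment": "dev"})
	fmt.Println(h != "", err) // true <nil>

	// Channels cannot be marshaled to JSON, so this returns an error.
	_, err = GenerateHash(make(chan int))
	fmt.Println(err != nil) // true
}
```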

Comment on lines +843 to +844
if args.ActorType == auth.ActorTypeIdentity {
	apActorCondition = `membership."actorIdentityId" = addlPriv."actorIdentityId"`

Contributor

Got this with the help of Claude, but it seems to make sense. We had a similar problem with secret access, where we were not checking users inside projects through a group.

     "summary": "GetPermissionFingerprint joins additional_privileges on
  a column-to-column predicate (membership.actorUserId =
  addlPriv.actorUserId) instead of the literal @actorID used by the fixed
  getPermission query — so it misses additional-privilege changes for
  actors whose only project access is via a group, leaving the ETag
  fingerprint unchanged.",


      "failure_scenario": "A user whose project access is solely through a
  group is granted (or has revoked) an additional privilege. No secret
  write occurs, so invalidateSecretCacheByProjectId is never called and
  the ETag hash is not deleted. Group-derived membership rows have
  actorUserId = NULL, so this join drops the privilege from the
  fingerprint and it stays identical. On the next ListSecrets with
  If-None-Match, the 304 fast-path (secretcache.go:86-96) — which keys
  purely on actorID:fingerprint:paramsHash — matches the stored ETag and
  returns 304 Not Modified. The client keeps a stale secret list
  reflecting the old privilege for up to the 15m ETag TTL / UTC-day
  boundary. The sibling getPermission query (line 598-602) was
  deliberately changed to the literal form for exactly this case, and the
  TS reference (permission-dal.ts:976) also uses the literal actorId — the
  fingerprint port did not get the fix."

