fix: CRD update mutations return 400 on JSON-encoded gRPC path #1279
Open
craigmarker wants to merge 10 commits into
gogo jsonpb (used by YARPC's JSON codec) handles google.protobuf.Timestamp as a well-known type but does not recognize metav1.Time, which also uses RFC3339 string encoding. When the JSON body contains creationTimestamp or managedFields[*].time, gogo falls through to its generic struct handler, which expects a JSON object and fails with "cannot unmarshal string into Go value of type map[string]json.RawMessage".
TriggerRunSpec and TriggerRunStatus already had a fallback pattern for this: try jsonpb first, and on that specific error fall back to encoding/json, which correctly dispatches to metav1.Time.UnmarshalJSON. TriggerRun itself was missing these methods, so the error propagated to YARPC, causing UpdateTriggerRun to return 400 on any request that included ObjectMeta.
Add MarshalJSON/UnmarshalJSON to TriggerRun following the same pattern, and fix the crd.tmpl generator template so all future CRD types get them.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The previous commit added UnmarshalJSON, but gogo's jsonpb v1.3.2 does not check for json.Unmarshaler on nested message fields; it only dispatches to its own JSONPBUnmarshaler interface (UnmarshalJSONPB). Since YARPC uses gogo's jsonpb to decode JSON-encoded gRPC requests, the UnmarshalJSON method was never called for the TriggerRun field inside UpdateTriggerRunRequest.
Add UnmarshalJSONPB to the top-level CRD type, which goes straight to encoding/json (bypassing jsonpb to avoid infinite recursion). This is the method gogo actually calls when processing nested messages.
The test sends an UpdateTriggerRunRequest through jsonpb.Unmarshal with an RFC3339 creationTimestamp, the same codepath YARPC takes when the Connect/gRPC-web bridge forwards a JSON-encoded request from the UI.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Add UnmarshalJSONPB to TriggerRun (and crd.tmpl for all generated CRD types) so gogo's jsonpb dispatches to our handler instead of trying to decode metav1.ObjectMeta fields itself. gogo v1.3.2 does not know about metav1.Time and fails when it receives an RFC3339 string for creationTimestamp, because it expects a JSON object with seconds/nanos. This error surfaces as a 400 on all CRD update mutations called through the browser UI, which goes through Envoy's connect_grpc_bridge filter and sends JSON-encoded gRPC.
The handler pre-processes int64/uint64 fields (proto3 canonical JSON encodes these as quoted strings that encoding/json cannot decode) and then delegates to encoding/json via a type alias to break the dispatch cycle. metav1.Time is handled natively by metav1.Time.UnmarshalJSON when encoding/json is the decoder, so no conversion step is needed.
Add CollectInt64Fields and UnquoteInt64Fields to kubeproto/util and wire them into both the generated TriggerRun type and the crd.tmpl template.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Without this fix, UpdateTriggerRun succeeded JSON decoding but then
failed when controller-runtime serialized the object to PUT to the
Kubernetes API server. The Kubernetes runtime codec calls json.Marshal
which was delegated to our MarshalJSON using jsonpb. jsonpb does not
handle json:",inline" on TypeMeta, so it produced {"typeMeta":{"kind":
"TriggerRun",...}} — Kubernetes requires "kind" at the root level.
Switch MarshalJSON to use encoding/json with a type alias (same
pattern as UnmarshalJSONPB). encoding/json respects json:",inline",
flattening kind/apiVersion to the root. Nested Spec/Status still
delegate to their own MarshalJSON implementations (jsonpb), preserving
proto3 field names and enum strings for those fields.
Update crd.tmpl so all generated CRD types get the same fix.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
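To make the shape difference concrete, the following stdlib-only sketch marshals the same data once with an embedded TypeMeta and once with a named typeMeta field. TypeMeta here is a local stand-in for metav1.TypeMeta, and the apiVersion value and the Inlined/Nested types are made up for illustration. As far as encoding/json is concerned, the flattening happens because the field is embedded and its tag carries no name; the inline option itself is not interpreted by the standard library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TypeMeta is a local stand-in for metav1.TypeMeta.
type TypeMeta struct {
	Kind       string `json:"kind,omitempty"`
	APIVersion string `json:"apiVersion,omitempty"`
}

// Inlined embeds TypeMeta, so encoding/json promotes kind/apiVersion
// to the root of the object.
type Inlined struct {
	TypeMeta `json:",inline"`
	Name     string `json:"name"`
}

// Nested models the shape described for the jsonpb output: a regular
// field named typeMeta.
type Nested struct {
	TypeMeta TypeMeta `json:"typeMeta"`
	Name     string   `json:"name"`
}

// mustJSON marshals v and panics on error; good enough for a demo.
func mustJSON(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	tm := TypeMeta{Kind: "TriggerRun", APIVersion: "example.test/v1"} // hypothetical apiVersion
	fmt.Println(mustJSON(Inlined{TypeMeta: tm, Name: "demo"}))
	fmt.Println(mustJSON(Nested{TypeMeta: tm, Name: "demo"}))
}
```

The first line printed has kind and apiVersion at the root, which is the form the Kubernetes API server accepts; the second nests them under typeMeta.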
Commit 51f759c was intended to swap the jsonpb-based MarshalJSON for an encoding/json version, but it accidentally reverted the entire block (UnmarshalJSONPB + MarshalJSON + UnmarshalJSON) in trigger_run.pb.go and reset crd.tmpl to main, leaving the test without a fix and the util.go helpers as dead code.
Restore both methods:
- UnmarshalJSONPB: gogo's jsonpb dispatches to this on nested messages. Pre-processes int64/uint64 quoted strings, then delegates to encoding/json (which handles metav1.Time.UnmarshalJSON correctly for RFC3339).
- MarshalJSON: uses encoding/json so json:",inline" on TypeMeta produces "kind"/"apiVersion" at the root level, which Kubernetes requires.
Also correct the test doc comment, which described UnmarshalJSON (an earlier approach) rather than UnmarshalJSONPB (what actually landed).
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
🔍 Go Lint & TODO Tracking Results
Go Coverage Report (Bazel)
Total Coverage: 63.9%
Use strconv.ParseInt/ParseUint instead of a manual digit loop. Rename CollectInt64Fields param to structType (it's the root struct to walk, not an int64 type). Drop the redundant type annotation on the sync.Map var. Trim comments that referenced the client transport or narrated no-op behavior. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
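A small sketch of what the strconv-based check might look like. The helper names isInt64Literal and isUint64Literal are hypothetical, not the names in kubeproto/util. Compared with a hand-written digit loop, ParseInt and ParseUint also reject values that overflow 64 bits.

```go
package main

import (
	"fmt"
	"strconv"
)

// isInt64Literal reports whether s is a base-10 integer that fits in int64.
func isInt64Literal(s string) bool {
	_, err := strconv.ParseInt(s, 10, 64)
	return err == nil
}

// isUint64Literal reports whether s is a base-10 integer that fits in uint64.
// ParseUint does not permit a sign prefix.
func isUint64Literal(s string) bool {
	_, err := strconv.ParseUint(s, 10, 64)
	return err == nil
}

func main() {
	samples := []string{"42", "-7", "9223372036854775808", "12a", ""}
	for _, s := range samples {
		fmt.Printf("%q int64=%t uint64=%t\n", s, isInt64Literal(s), isUint64Literal(s))
	}
}
```

One caveat with this approach: ParseInt accepts a leading "+", which is not a valid JSON number, so code that splices the string back into JSON should re-format the parsed value rather than reuse the input.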
Without a custom MarshalJSON, encoding/json handles TriggerRun directly and correctly inlines TypeMeta (kind/apiVersion at root) for Kubernetes writes via json:",inline". gogo's jsonpb marshals GET/LIST responses natively, producing the nested typeMeta form the proto3 client expects.
The encoding/json-based MarshalJSON was added to fix a Kubernetes write failure, but that failure was only introduced by the preceding jsonpb-based MarshalJSON. The correct fix is no custom MarshalJSON at all.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
The generic walker with a match predicate was designed to be shared between CollectMetav1TimeFields and CollectInt64Fields, but CollectMetav1TimeFields was removed. With a single caller the abstraction adds more reading cost than it saves — replace it with a specific private function. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
prefix and visited are recursion-internal details that the caller had no business setting. Replace the separate helper with a closure that captures both, leaving only the recursive arguments (typ, prefix) visible inside the walk. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
What type of PR is this? (check all applicable)
What changed?
Add UnmarshalJSONPB and MarshalJSON to the top-level CRD struct in crd.tmpl (affecting all generated CRD types) and to TriggerRun specifically. Add CollectInt64Fields and UnquoteInt64Fields to kubeproto/util for pre-processing proto3 canonical JSON before delegating to encoding/json.
Why?
All CRD update mutations (UpdateTriggerRun, UpdatePipeline, etc.) returned 400 Bad Request when called through the browser UI. The OSS UI uses connect-es with createConnectTransport, which routes requests through Envoy's connect_grpc_bridge filter as JSON-encoded gRPC. YARPC uses gogo's jsonpb to decode these requests.
Two distinct failures:
- Unmarshal 400: gogo's jsonpb v1.3.2 dispatches to JSONPBUnmarshaler (not json.Unmarshaler) on nested message fields. Without UnmarshalJSONPB on TriggerRun, gogo tried to decode metav1.Time (embedded in ObjectMeta) itself and failed: proto3 canonical JSON encodes timestamps as RFC3339 strings, but gogo expects {"seconds": N, "nanos": N}. Additionally, proto3 encodes int64/uint64 as quoted strings that encoding/json can't decode without pre-processing. UnmarshalJSONPB pre-processes both cases and then delegates to encoding/json via a type alias to avoid infinite recursion.
- Marshal / Kubernetes write failure: The pre-existing jsonpb-based MarshalJSON on CRD types did not respect json:",inline" on TypeMeta, producing {"typeMeta": {"kind": "TriggerRun", ...}}. The Kubernetes API server requires kind and apiVersion at the root level, so every update mutation that made it past deserialization would fail when controller-runtime serialized the object for the PUT request. Replacing MarshalJSON with an encoding/json + type-alias approach fixes this, while nested Spec/Status fields continue to use their own jsonpb-based MarshalJSON implementations.
How did you test it?
I ran a unit test (trigger_run_json_test.go) that calls jsonpb.Unmarshal directly on UpdateTriggerRunRequest with an RFC3339 creationTimestamp, the exact codepath YARPC takes. The test fails on main and passes with the fix.
I confirmed end-to-end against a local sandbox:
- UpdateTriggerRun via Connect+JSON through Envoy using a Python script; apiserver logs confirmed "encoding":"json" and the request reaching the handler without error. The action was persisted in Kubernetes.
- createConnectTransport → Connect+JSON encoding), created a running cron trigger via ma sandbox demo pipeline, and clicked Kill in the UI. Devtools showed UpdateTriggerRun returning 200.
Potential risks
MarshalJSON on CRD types now produces kind/apiVersion inline at the root rather than nested under typeMeta. This is required by Kubernetes but changes the JSON shape in GET/LIST responses served over the JSON encoding path. The connect-es TypeScript client does not consume typeMeta from responses so UI behavior is unaffected. Any external consumer reading typeMeta from the JSON API would need to adapt.
Release notes
Fix: UpdateTriggerRun and all other CRD update mutations returned 400 when called through the browser UI. The OSS UI uses JSON-encoded gRPC via Envoy, which was incompatible with how gogo's jsonpb decoded metav1.Time timestamps and how it serialized TypeMeta for Kubernetes writes.
Documentation Changes
None.