Localization: String Catalog pipeline (plurals + catalog generation) #25688

Open
jkmassel wants to merge 4 commits into trunk from jkmassel/xcstrings-glotpress-pipeline

@jkmassel

@jkmassel jkmassel commented Jun 24, 2026

Contributor

Moves WordPress-iOS localization toward String Catalogs, in two stages — plurals through GlotPress, and a build-free genstrings replacement. The plural round-trip is wired into the release flow (failsafe), running alongside the app-strings pipeline; catalog generation runs as a build-free CI coverage gate. Nothing is read at runtime yet; the cutover is a separate migration.

Summary

  • Plurals through GlotPress: addresses #6327, "Add support for plurals in translations" (open since 2016). A legacy .strings file can't represent CLDR plural categories; a String Catalog can. Plurals ride the main app GlotPress project (no separate project), and the round-trip needs no app build.
  • Build-free catalog generation: replaces genstrings with Apple's xcstringstool, keeping AppLocalizedString and its ~379 call sites unchanged, with a CI gate proving the catalog drops no string that genstrings finds.

Stage 1 — Plurals

fastlane/lanes/localization_plurals.rb, plural_strings_helper.rb, WordPress/Classes/Plurals.xcstrings

Plurals are authored in Plurals.xcstrings (English one/other) and carried through the main app GlotPress project as flat strings keyed <key>|==|plural.<cldr-category>, the same id Apple's xcodebuild -exportLocalizations uses. No plural-rules map is needed, and every locale is covered, including Welsh (6 forms), which the gettext/.stringsdict path can't cover.

  • Forward (no build): flat originals are merged into Localizable.strings (like MANUALLY_MAINTAINED_STRINGS_FILES) so they upload with the app strings. One original per CLDR category — the full set (zero/one/two/few/many/other), unconditionally; English fills the categories it doesn't itself distinguish. No locale list is read here: the union of categories over any real locale set is always all six (Arabic/Welsh use them all), and over-emitting is harmless — the reverse folds only the categories each locale actually needs. Fails loudly if a plural is missing its English other.
  • Reverse (no app build): flat keys are read back out of the downloaded Localizable.strings and folded straight into the catalog JSON. Each cell is human ?? AI ?? English source; AI / English-fallback cells are flagged needs_review. The flat keys stay in Localizable.strings as harmless, unused-at-runtime entries — like the merged infoplist.* keys.
  • Per-locale CLDR map (derived, not committed). Folding back needs to know which categories each locale takes (ja → other; pl → one/few/many/other). The reverse derives that map fresh at fold time from Apple's exporter, run against a throwaway one-plural Swift package rather than the app, so it's a few-second xcodebuild -exportLocalizations over an SPM stub with no app build or bootstrap. The categories are a property of the locale's CLDR data, not of our strings, so the stub yields the same per-locale sets as exporting the whole app. Nothing is committed and nothing is cached, so the map can't lag the ship-locale list (33 today) or Apple's CLDR. (Apple's shipped CLDR gives es/fr/it/pt = one, other, with no many, which is exactly why this isn't hand-authored.)
  • .strings reading is delegated to the release toolkit's read_strings_file_as_hash (plutil) — no hand-rolled parsing. PluralStrings is pure Ruby and unit-tested. The AI tier is a stub today (nil → English fallback); the hook and needs_review flagging are in place for when it's wired.
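
The reverse fold's `human ?? AI ?? English` rule can be pictured with a minimal sketch. The method name `fold_plural_cells`, its parameters, and the choice to fill undistinguished English categories from `other` are assumptions made for illustration; they are not taken from the PR's PluralStrings code.

```ruby
# Sketch only: fold_plural_cells and its parameters are hypothetical names,
# not the PR's PluralStrings API.
PLURAL_SEPARATOR = '|==|plural.'

# key:        catalog key, e.g. 'editor.textCounter.wordCount'
# categories: CLDR categories this locale takes, e.g. %w[one few many other] for pl
# human:      flat key => translation read from the downloaded Localizable.strings
# ai:         flat key => machine translation (empty while the AI tier is a stub)
# english:    category => English source value from Plurals.xcstrings
def fold_plural_cells(key:, categories:, human:, ai:, english:)
  categories.to_h do |category|
    flat_key = "#{key}#{PLURAL_SEPARATOR}#{category}"
    # Assumption: English fills a category it does not distinguish with `other`.
    source = english[category] || english.fetch('other')
    translated = human[flat_key]
    value = translated || ai[flat_key] || source
    # Only human translations count as translated; AI and English fallbacks are flagged.
    state = translated ? 'translated' : 'needs_review'
    [category, { 'stringUnit' => { 'state' => state, 'value' => value } }]
  end
end
```

Because only the locale's own categories are iterated, any extra flat keys uploaded by the forward step are simply never read for that locale.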

The two entries in Plurals.xcstrings (blogging.reminders.weeklyCount, editor.textCounter.wordCount) are seed fixtures to exercise the round-trip end-to-end — they aren't referenced by any call site yet (the legacy formatters still do their own count-based logic), so real plurals get wired (or these seeds replaced) at the cutover.

Stage 2 — Catalog generation

fastlane/lanes/localization_catalog.rb, catalog_helper.rb

  • generate_strings_catalog: xcstringstool extract --legacy-localizable-strings --modern-localizable-strings -s AppLocalizedString → sync into Localizable.xcstrings. No app build. -s AppLocalizedString is the same flag genstrings uses today, so the custom routine and its ~379 call sites are untouched. Each extract chunk gets its own output dir, so same-basename .stringsdata files don't overwrite each other.
  • xcstringstool sync doesn't reconcile a key whose English source value changed; the lane re-derives current English from a fresh extraction and flips changed keys' translations to needs_review (walking device/width variations, not just flat units). This is staged for cutover — with the catalog gitignored and regenerated fresh each run, there are no persisted translations to re-flag yet.
  • Coverage gate (verify_strings_catalog, CI): runs genstrings over the same source and fails on any key the catalog is missing — proving the build-free extraction loses nothing. Both sides read via read_strings_file_as_hash.
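
The needs_review reconciliation described above could look roughly like the sketch below. It assumes the usual String Catalog JSON shape (`strings` → key → `localizations` → locale → `stringUnit` / `variations`); the method names and the two English hashes passed in are hypothetical, and the real lane may obtain the old and new source values differently.

```ruby
# Sketch only: method names and inputs are hypothetical, not the PR's CatalogHelper API.

# Yields every stringUnit under a localization node, descending through
# variations (plural / device / width) as well as flat units.
def each_string_unit(node, &block)
  return unless node.is_a?(Hash)

  block.call(node['stringUnit']) if node['stringUnit'].is_a?(Hash)
  (node['variations'] || {}).each_value do |by_variant|
    next unless by_variant.is_a?(Hash)

    by_variant.each_value { |child| each_string_unit(child, &block) }
  end
end

# Flags translations of every key whose English source value changed.
# previous_english / current_english: key => English value (assumed inputs).
def flag_changed_sources!(catalog, previous_english, current_english)
  source = catalog.fetch('sourceLanguage', 'en')
  changed = current_english.select do |key, value|
    previous_english.key?(key) && previous_english[key] != value
  end.keys

  changed.each do |key|
    localizations = catalog.dig('strings', key, 'localizations') || {}
    localizations.each do |locale, node|
      next if locale == source # the source language itself is never re-flagged

      each_string_unit(node) do |unit|
        unit['state'] = 'needs_review' if unit['state'] == 'translated'
      end
    end
  end
  changed
end
```

The recursion is what lets a device-specific or width-specific translation be re-flagged along with flat ones.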

Testing instructions

All checks below are build-free — no full rake dependencies, just the gems (bundle install) and Xcode (for xcstringstool / genstrings). No app build, SPM resolve, Gutenberg clone, or secrets. Step 1 writes only a gitignored artifact and step 3 writes nothing, so git status stays clean.

1 — Build-free catalog extraction (Stage 2). Generate the catalog from source:

bundle exec fastlane ios generate_strings_catalog

Expected output: "Generated Localizable.xcstrings with 3923 keys (…)", in about 1 minute. This is the genstrings replacement with no app build; it writes the (gitignored) WordPress/Resources/Localizable.xcstrings.

2 — Coverage gate (Stage 2). Prove the build-free extraction loses nothing vs. genstrings:

bundle exec fastlane ios verify_strings_catalog

Expected output: "Localizable.xcstrings covers all 3923 genstrings keys." Otherwise the lane exits non-zero and lists the missing keys (this is the gate CI runs).

To prove the gate isn't vacuous, delete any entry from the just-generated Localizable.xcstrings and re-run — it exits 1 naming the gap (MISSING from catalog: "…"). Re-running step 1 regenerates the key and the gate passes again (the catalog is a throwaway artifact).
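
At its core the gate is a set difference between the genstrings keys and the catalog keys. A minimal sketch, assuming the genstrings side has already been read into a hash (as read_strings_file_as_hash would provide) and comparing raw keys with no format canonicalization; the method names here are hypothetical:

```ruby
require 'json'

# Sketch only: hypothetical helper names, raw key comparison.

# genstrings_hash: key => value, as read from the genstrings-produced .strings file
# catalog_json:    contents of Localizable.xcstrings
def missing_from_catalog(genstrings_hash, catalog_json)
  catalog_keys = JSON.parse(catalog_json).fetch('strings', {}).keys
  genstrings_hash.keys - catalog_keys
end

# Returns true when the catalog covers every genstrings key; otherwise
# prints each gap and returns false so the caller can exit non-zero.
def catalog_covers_genstrings?(genstrings_hash, catalog_json)
  missing = missing_from_catalog(genstrings_hash, catalog_json)
  if missing.empty?
    puts "Localizable.xcstrings covers all #{genstrings_hash.size} genstrings keys."
    return true
  end
  missing.each { |key| warn "MISSING from catalog: #{key.inspect}" }
  false
end
```

A caller would typically end with something like `exit 1 unless catalog_covers_genstrings?(...)`, which is what makes the deleted-entry experiment above fail the step.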

3 — Plural forward (Stage 1). Expand the seed plurals into the GlotPress flat originals:

bundle exec fastlane ios generate_plural_strings_for_glotpress

Expected output: "Generated 12 flat plural originals from 2 catalog keys.", followed by a printed Sample:. The lane writes nothing; it returns the originals as a string, and at code freeze generate_strings_file_for_glotpress writes them to a temp file and merges them into Localizable.strings. Each seed plural expands to one "<key>|==|plural.<category>" entry per CLDR category: the full set (zero/one/two/few/many/other), with English filling the categories it doesn't itself distinguish.
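
The expansion in step 3 can be sketched as follows. This is an illustration under stated assumptions, not the lane's code: `flat_plural_originals` is a hypothetical name, it returns a hash rather than serialized `.strings` text, and it assumes English fills an undistinguished category with its `other` value.

```ruby
require 'json'

# Sketch only: hypothetical helper, returns a Hash instead of .strings text.
CLDR_CATEGORIES = %w[zero one two few many other].freeze

def flat_plural_originals(catalog_json)
  catalog = JSON.parse(catalog_json)
  source = catalog.fetch('sourceLanguage', 'en')
  originals = {}

  catalog.fetch('strings', {}).each do |key, body|
    plural = body.dig('localizations', source, 'variations', 'plural')
    next unless plural # skip non-plural catalog entries

    other = plural.dig('other', 'stringUnit', 'value')
    if other.nil? || other.empty?
      raise ArgumentError, "Plural #{key.inspect} is missing its English `other`"
    end

    # One original per CLDR category, unconditionally.
    CLDR_CATEGORIES.each do |category|
      value = plural.dig(category, 'stringUnit', 'value') || other # assumed fill rule
      originals["#{key}|==|plural.#{category}"] = value
    end
  end
  originals
end
```

With two plural keys in the catalog this yields 2 × 6 = 12 entries, matching the count the lane reports.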

Not covered above (and why): the plural reverse fold (download_localized_plurals) needs a GlotPress round-trip, so it's exercised by the unit suite and runs at code freeze. Its per-locale CLDR map is derived inline from a throwaway one-plural Swift package — a few-second xcodebuild -exportLocalizations over an SPM stub, no app workspace or bootstrap — so there's nothing committed to verify here.

Status

  • Catalog generation + coverage gate at scale — 3,923 keys, 0 missing vs genstrings (steps 1–2)
  • Plural forward — 2 keys → 12 flat originals across the full CLDR set (step 3)
  • Reverse fold → per-locale variations (build-free); per-locale CLDR map derived inline from a throwaway fixture over all 33 ship locales
  • PluralStrings / CatalogHelper unit suites pass (local harness, pure Ruby); lanes load; rubocop clean
  • Plural lanes wired into generate_strings_file_for_glotpress / download_localized_strings, failsafe (logged, never raised)
  • Catalog coverage gate runs as a standalone CI step (verify-strings-catalog.sh) — hard-fails on any gap
  • First real GlotPress round-trip through the main project (post-merge, at code freeze)
  • Runtime cutover — switch call sites to the catalog, retire legacy Localizable.strings

Notes

  • Plurals.xcstrings lives in a synchronized folder (WordPress/Classes/) so it auto-joins the target (compiles to .stringsdict) and is visible to -exportLocalizations; WordPress/Resources/ is explicitly-referenced (non-synced).
  • The per-locale CLDR map is derived from a throwaway one-plural Swift package, not the app — so its -exportLocalizations is a few-second SPM export with no Mac Catalyst build, sidestepping the binary-dependency slice problems a full-app export hits (e.g. the Zendesk xcframework's missing maccatalyst slice).

Not in this PR (deliberate)

  • The runtime cutover — retiring Localizable.strings, the catalog as the committed backing store, unifying Plurals.xcstrings into the one catalog.
  • The AI translation backend — the floor hook is wired (human ?? AI ?? English), but the service is a stub.
  • Generated outputs (Localizable.xcstrings, Plurals.strings) — CI-generated, gitignored, not committed.
  • Gutenberg inclusion (lane takes a gutenberg_path: param) and other-table catalogs.

@dangermattic

dangermattic commented Jun 24, 2026

Collaborator
1 Warning
⚠️ This PR is larger than 500 lines of changes. Please consider splitting it into smaller PRs for easier and faster reviews.

Generated by 🚫 Danger

@wpmobilebot

wpmobilebot commented Jun 24, 2026

Contributor
📲 You can test the changes from this Pull Request in WordPress by scanning the QR code below to install the corresponding build.
App Name: WordPress
Configuration: Release-Alpha
Build Number: 32833
Version: PR #25688
Bundle ID: org.wordpress.alpha
Commit: 61f521b
Installation URL: 7f9cjfiqfuo2o
Automatticians: You can use our internal self-serve MC tool to give yourself access to those builds if needed.

@wpmobilebot

wpmobilebot commented Jun 24, 2026

Contributor
📲 You can test the changes from this Pull Request in Jetpack by scanning the QR code below to install the corresponding build.
App Name: Jetpack
Configuration: Release-Alpha
Build Number: 32833
Version: PR #25688
Bundle ID: com.jetpack.alpha
Commit: 61f521b
Installation URL: 622jp38sj26b0
Automatticians: You can use our internal self-serve MC tool to give yourself access to those builds if needed.

@jkmassel jkmassel added Tooling Build, Release, and Validation Tools [Type] Enhancement labels Jun 24, 2026
@jkmassel jkmassel self-assigned this Jun 24, 2026
@jkmassel jkmassel force-pushed the jkmassel/xcstrings-glotpress-pipeline branch 8 times, most recently from 1518c74 to 42e2159 Compare June 25, 2026 00:23
@jkmassel jkmassel added this to the 27.1 milestone Jun 25, 2026
@jkmassel jkmassel marked this pull request as ready for review June 25, 2026 00:47
@jkmassel jkmassel requested a review from a team as a code owner June 25, 2026 00:47
Comment thread fastlane/lanes/plural_strings_helper.rb Outdated
next unless plural # skip non-plural catalog entries

other = plural.dig('other', 'stringUnit', 'value')
comment = body['comment']

Contributor

If I understand it correctly, this same "comment" will be used in all plural variants, right? So, the translator can only find out the meaning of the variant from the suffix like |==|plural.other. I'm not sure if that's obvious to the translators. Maybe we can manually add a "Plural category: ..." at the end of the comment?

Contributor Author

Addressed, but I inadvertently force-pushed it, sorry 🤦‍♂️

Comment thread fastlane/lanes/localization.rb Outdated
# GlotPress project alongside everything else.
paths_to_merge = MANUALLY_MAINTAINED_STRINGS_FILES.dup
run_plural_step('forward') do
generate_plural_strings_for_glotpress

Contributor

What do you think about removing the plural-categories.json from the git repo and calling refresh_plural_categories before generate_plural_strings_for_glotpress? If refresh_plural_categories is too slow to call locally during development, I guess we can introduce an env var or something to skip it locally. But on CI, in the real translation process, we'd want to run the refresh_plural_categories for real to get the correct supported locales?

Contributor Author

WDYT about b4df4fa? It'll generate the category list for us as needed without the committed cache

Contributor

Neat! I was thinking about a similar solution. It seems wasteful to build the full project for the "plural category" mapping. Instead, we can programmatically create a blank Xcode project, add the exact same locale support as the app, and build that blank project. I was not aware that you can simply use a Swift package to do that, which would be much simpler than manipulating an Xcode project.

Comment thread fastlane/lanes/localization_plurals.rb Outdated
# `WordPress/Resources` is an explicitly-referenced (non-synchronized) group, so a catalog placed
# there is NOT a target member and would be skipped by `-exportLocalizations`.
PLURALS_CATALOG = File.join(PROJECT_ROOT_FOLDER, 'WordPress', 'Classes', 'Plurals.xcstrings')
PLURALS_FLAT_STRINGS = File.join(PROJECT_ROOT_FOLDER, 'WordPress', 'Resources', 'Plurals.strings') # transient merge input (not committed)

Contributor

Nitpick: could this file be put into a temp dir or something?

Contributor Author

Done in 61f521b

Contributor

When adding strings that support plural, in addition to the Swift code, do we need to manually update this file, too?

Contributor Author

Yeah, it actually kind of goes the other way – you add strings to the catalogue then use the generated symbols in code: https://developer.apple.com/documentation/xcode/using-generated-localizable-symbols-in-your-code.

But I think that's once we're fully using the catalogues, in the interim there's more bookkeeping. We don't have any real plural strings yet, so right now this is just groundwork.

Contributor

Gotcha. I'm a little bit behind on this 😅. I was only aware of .stringsdict files for plural support.

Does that mean we'll have similar files in the Swift package modules? Which means the "forward" and "reverse" processes will need to handle multiple sources of the original localizable strings?

@oguzkocer

Contributor

Since @crazytonyli already reviewed it and infra folks will also be taking a look at it, I'll refrain from reviewing it at this time. Having said that, please let me know if my input is needed or could provide value and I'll review it as soon as I can.

@jkmassel jkmassel force-pushed the jkmassel/xcstrings-glotpress-pipeline branch from 42e2159 to 7bef459 Compare June 25, 2026 19:33
jkmassel added 2 commits June 25, 2026 13:36
Plurals are authored in WordPress/Classes/Plurals.xcstrings (English one/other)
and carried through the main app GlotPress project as flat
`<key>|==|plural.<cldr-category>` originals — the same id
`xcodebuild -exportLocalizations` uses — so every locale, including Welsh (6
forms), is covered.

Forward (no build): catalog → flat `.strings` originals for the category union,
merged into Localizable.strings (like MANUALLY_MAINTAINED_STRINGS_FILES) so they
upload with the app strings; the lane fails loudly if a plural is missing its
English `other`. Reverse (no build): the flat keys are read back out of the
downloaded Localizable.strings and folded into the catalog JSON via a committed
per-locale CLDR category map — each cell `human ?? AI ?? English`, with AI /
English-fallback cells flagged `needs_review`. The map is regenerated from
Apple's exporter by `refresh_plural_categories`, the one build-backed lane.
`.strings` reading is delegated to the release toolkit's
`read_strings_file_as_hash` (plutil); the flat keys stay in Localizable.strings
as harmless, unused-at-runtime entries.

Wired into generate_strings_file_for_glotpress / download_localized_strings: runs
in parallel with the app-strings pipeline before cutover (failsafe — a plural
error is logged, never raised), with nothing consuming the result at runtime yet.
PluralStrings is pure Ruby and unit-tested.

Replaces genstrings with Apple's `xcstringstool extract`/`sync` to generate
Localizable.xcstrings from source with no app build, keeping AppLocalizedString
and its call sites unchanged. Each extract chunk gets its own output directory
so same-basename `.stringsdata` (e.g. the two NSDate+Helpers.swift) don't
overwrite each other and silently drop strings.

`sync` leaves a key's translations untouched when its English source value
changes, so the lane reconciles them to `needs_review` afterward, walking
device/width variations as well as flat units.

Adds a "Verify String Catalog Coverage" CI step that runs genstrings over the
same files and fails if the catalog is missing any key, comparing on a
format-canonical form. The catalog is generated as an artifact, not wired into
the runtime build.
@jkmassel jkmassel force-pushed the jkmassel/xcstrings-glotpress-pipeline branch from 7bef459 to ad2a555 Compare June 25, 2026 19:36
jkmassel added 2 commits June 25, 2026 13:47
Removes the committed plural-categories.json (and refresh_plural_categories). The reverse now derives the per-locale CLDR category map fresh from Apple's exporter at fold time — from a throwaway one-plural Swift package, not the app, so it's ~4s with no app build/bootstrap and can't lag the ship-locale list or Apple's CLDR. The forward needs no map: it emits the full CLDR set (the union over any real locale set is always all six; over-emitting is harmless, the reverse folds only what each locale uses). Addresses review feedback about the committed map going stale.
`generate_plural_strings_for_glotpress` now returns the serialized `.strings`
text instead of writing `WordPress/Resources/Plurals.strings`. The merge in
`generate_strings_file_for_glotpress` owns a `Dir.mktmpdir`, writes the returned
string there, and merges from it — so the transient originals never touch the
working tree. Drops the now-unused `PLURALS_FLAT_STRINGS` constant and its
`.gitignore` entry; run standalone, the lane just prints a sample.
@jkmassel jkmassel requested a review from crazytonyli June 25, 2026 20:22
jkmassel added a commit that referenced this pull request Jun 25, 2026
…view`

Adds a `lint_swiftui_preview_strings` fastlane lane and a "Lint SwiftUI
Preview Strings" Buildkite step so the cleanup can't regress: a preview
shipping a translatable `Text("literal")` now fails CI with the exact
file:line and the fix (wrap in `Text(verbatim:)`).

Reuses Apple's extractor as its own oracle — `xcstringstool extract
--SwiftUI-Text` tags every extracted string's `visibility`, and #Preview /
PreviewProvider literals come out as `visibility: "preview"`. No source
parsing or brace-tracking, and tautologically consistent with what
extraction actually pulls. Self-contained (no dependency on the #25688
catalog lane). Green on the current tree; verified it fails on an injected
preview literal.

Documents the `verbatim:` rule and the gate in docs/localization.md.
jkmassel added a commit that referenced this pull request Jun 25, 2026
…view`

Adds a `lint_swiftui_preview_strings` fastlane lane and a "Lint SwiftUI
Preview Strings" Buildkite step so preview strings can't regress: a `#Preview`
shipping a translatable `Text("literal")` now fails CI with the exact file:line
and the fix (wrap in `Text(verbatim:)`).

Reuses Apple's extractor as its own oracle — `xcstringstool extract
--SwiftUI-Text` tags every extracted string's `visibility`, and #Preview /
PreviewProvider literals come out as `visibility: "preview"`. No source parsing
or brace-tracking, and tautologically consistent with what extraction pulls.
Self-contained (no dependency on the #25688 catalog lane). Green on the current
tree; verified it fails on an injected preview literal.

Documents the `verbatim:` rule and the gate in docs/localization.md.
Labels

Tooling Build, Release, and Validation Tools [Type] Enhancement

5 participants