Localization: String Catalog pipeline (plurals + catalog generation) #25688
jkmassel wants to merge 4 commits into
Generated by 🚫 Danger

| App Name | WordPress |
| --- | --- |
| Configuration | Release-Alpha |
| Build Number | 32833 |
| Version | PR #25688 |
| Bundle ID | org.wordpress.alpha |
| Commit | 61f521b |
| Installation URL | 7f9cjfiqfuo2o |

| App Name | Jetpack |
| --- | --- |
| Configuration | Release-Alpha |
| Build Number | 32833 |
| Version | PR #25688 |
| Bundle ID | com.jetpack.alpha |
| Commit | 61f521b |
| Installation URL | 622jp38sj26b0 |
1518c74 to 42e2159
next unless plural # skip non-plural catalog entries

other = plural.dig('other', 'stringUnit', 'value')
comment = body['comment']
If I understand it correctly, this same "comment" will be used in all plural variants, right? So, the translator can only find out the meaning of the variant from the suffix like |==|plural.other. I'm not sure if that's obvious to the translators. Maybe we can manually add a "Plural category: ..." at the end of the comment?
Addressed, but I inadvertently force-pushed it, sorry 🤦♂️
# GlotPress project alongside everything else.
paths_to_merge = MANUALLY_MAINTAINED_STRINGS_FILES.dup
run_plural_step('forward') do
  generate_plural_strings_for_glotpress
What do you think about removing the plural-categories.json from the git repo and calling refresh_plural_categories before generate_plural_strings_for_glotpress? If refresh_plural_categories is too slow to call locally during development, I guess we can introduce an env var or something to skip it locally. But on CI, in the real translation process, we'd want to run the refresh_plural_categories for real to get the correct supported locales?
WDYT about b4df4fa? It'll generate the category list for us as needed without the committed cache
Neat! I was thinking about a similar solution. It seems wasteful to build the full project for the "plural category" mapping. Instead, we can programmatically create a blank Xcode project, add the exact same locale support as the app, and build that blank project. I was not aware that you can simply use a Swift package to do that, which would be much simpler than manipulating an Xcode project.
# `WordPress/Resources` is an explicitly-referenced (non-synchronized) group, so a catalog placed
# there is NOT a target member and would be skipped by `-exportLocalizations`.
PLURALS_CATALOG = File.join(PROJECT_ROOT_FOLDER, 'WordPress', 'Classes', 'Plurals.xcstrings')
PLURALS_FLAT_STRINGS = File.join(PROJECT_ROOT_FOLDER, 'WordPress', 'Resources', 'Plurals.strings') # transient merge input (not committed)
Nitpick: could this file be put into a temp dir or something?
When adding strings that support plurals, in addition to the Swift code, do we need to manually update this file, too?
Yeah, it actually kind of goes the other way – you add strings to the catalogue then use the generated symbols in code: https://developer.apple.com/documentation/xcode/using-generated-localizable-symbols-in-your-code.
But I think that's once we're fully using the catalogues; in the interim there's more bookkeeping. We don't have any real plural strings yet, so right now this is just groundwork.
Gotcha. I'm a little bit behind on this 😅. I was only aware of the .stringsdict files for plural support.
Does that mean we'll have similar files in the Swift package modules? Which means the "forward" and "reverse" processes will need to handle multiple sources of the original localizable strings?
Since @crazytonyli already reviewed it and infra folks will also be taking a look at it, I'll refrain from reviewing it at this time. Having said that, please let me know if my input is needed or could provide value and I'll review it as soon as I can.
42e2159 to 7bef459
Plurals are authored in WordPress/Classes/Plurals.xcstrings (English one/other) and carried through the main app GlotPress project as flat `<key>|==|plural.<cldr-category>` originals (the same id `xcodebuild -exportLocalizations` uses), so every locale, including Welsh (6 forms), is covered.

Forward (no build): catalog → flat `.strings` originals for the category union, merged into Localizable.strings (like MANUALLY_MAINTAINED_STRINGS_FILES) so they upload with the app strings; the lane fails loudly if a plural is missing its English `other`.

Reverse (no build): the flat keys are read back out of the downloaded Localizable.strings and folded into the catalog JSON via a committed per-locale CLDR category map. Each cell is `human ?? AI ?? English`, with AI / English-fallback cells flagged `needs_review`. The map is regenerated from Apple's exporter by `refresh_plural_categories`, the one build-backed lane.

`.strings` reading is delegated to the release toolkit's `read_strings_file_as_hash` (plutil); the flat keys stay in Localizable.strings as harmless, unused-at-runtime entries.

Wired into generate_strings_file_for_glotpress / download_localized_strings: runs in parallel with the app-strings pipeline before cutover (failsafe: a plural error is logged, never raised), with nothing consuming the result at runtime yet. PluralStrings is pure Ruby and unit-tested.
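For readers following the reverse fold, here is a minimal sketch of the `human ?? AI ?? English` cell rule described above. It is not the PR's `PluralStrings` code: the method names, the `ai_lookup` hook, and the assumption that an untranslated flat key comes back missing or empty are all illustrative.

```ruby
# Resolve one catalog cell as `human ?? AI ?? English`.
# `human` and `ai` are nil (or empty) when that tier has no translation.
def resolve_plural_cell(human:, ai:, english:)
  return { 'state' => 'translated', 'value' => human } if human && !human.empty?

  # AI and English-fallback cells are flagged so a reviewer sees them.
  { 'state' => 'needs_review', 'value' => ai || english }
end

# Fold the flat keys for one locale back into per-category cells.
# `flat`       - downloaded Localizable.strings as a Hash (assumed shape)
# `english`    - English values by category, must include 'other'
# `categories` - the locale's CLDR categories, e.g. %w[one few many other]
# `ai_lookup`  - hypothetical hook; the default mirrors the stubbed AI tier
def fold_locale(key, flat:, english:, categories:, ai_lookup: ->(_key, _cat) {})
  categories.to_h do |cat|
    cell = resolve_plural_cell(
      human: flat["#{key}|==|plural.#{cat}"],
      ai: ai_lookup.call(key, cat),
      english: english.fetch(cat) { english.fetch('other') }
    )
    [cat, { 'stringUnit' => cell }]
  end
end
```

Only the categories passed in are emitted, so a locale never gains cells for categories its CLDR rules do not use.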
Replaces genstrings with Apple's `xcstringstool extract`/`sync` to generate Localizable.xcstrings from source with no app build, keeping AppLocalizedString and its call sites unchanged.

Each extract chunk gets its own output directory so same-basename `.stringsdata` (e.g. the two NSDate+Helpers.swift) don't overwrite each other and silently drop strings.

`sync` leaves a key's translations untouched when its English source value changes, so the lane reconciles them to `needs_review` afterward, walking device/width variations as well as flat units.

Adds a "Verify String Catalog Coverage" CI step that runs genstrings over the same files and fails if the catalog is missing any key, comparing on a format-canonical form. The catalog is generated as an artifact, not wired into the runtime build.
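The `needs_review` reconciliation can be pictured roughly as below. This is a sketch, not the lane's implementation: it assumes the catalog is the parsed `.xcstrings` JSON (with `strings`, `localizations`, `stringUnit`, and nested `variations`), and that the previous and fresh English values are available as plain key-to-value hashes. Both method names are hypothetical.

```ruby
# Mark every translated stringUnit under `node` as needs_review, walking
# nested `variations` (device, width, plural) as well as the flat unit.
# Returns the number of units flagged.
def flag_units!(node)
  return 0 unless node.is_a?(Hash)

  flagged = 0
  unit = node['stringUnit']
  if unit.is_a?(Hash) && unit['state'] == 'translated'
    unit['state'] = 'needs_review'
    flagged += 1
  end
  (node['variations'] || {}).each_value do |by_case|
    by_case.each_value { |child| flagged += flag_units!(child) }
  end
  flagged
end

# For each key whose English source changed, flag its non-source
# localizations. `previous_english` / `fresh_english` map key => value
# (assumed inputs, e.g. from the old catalog and a fresh extraction).
def reconcile_changed_sources!(catalog, previous_english:, fresh_english:)
  source = catalog.fetch('sourceLanguage', 'en')
  catalog.fetch('strings', {}).sum do |key, entry|
    next 0 if previous_english[key] == fresh_english[key]

    entry.fetch('localizations', {}).sum do |locale, localization|
      locale == source ? 0 : flag_units!(localization)
    end
  end
end
```

Returning the count of flagged units gives the lane something to log, which is useful while the catalog is still a throwaway artifact and the expected count is zero.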
7bef459 to ad2a555
Removes the committed plural-categories.json (and refresh_plural_categories). The reverse now derives the per-locale CLDR category map fresh from Apple's exporter at fold time — from a throwaway one-plural Swift package, not the app, so it's ~4s with no app build/bootstrap and can't lag the ship-locale list or Apple's CLDR. The forward needs no map: it emits the full CLDR set (the union over any real locale set is always all six; over-emitting is harmless, the reverse folds only what each locale uses). Addresses review feedback about the committed map going stale.
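As an illustration of the forward step after this change, the sketch below expands each plural entry into one flat original per CLDR category. It builds on the `next unless plural` / `plural.dig('other', 'stringUnit', 'value')` lines quoted in the review; everything else is assumed: the method name, filling undistinguished categories with the English `other` value, and the exact "Plural category: ..." comment wording suggested in review.

```ruby
CLDR_CATEGORIES = %w[zero one two few many other].freeze

# Expand the plural entries of a parsed .xcstrings Hash into flat originals
# keyed `<key>|==|plural.<category>`. Returns { flat_key => { value:, comment: } }.
def flat_plural_originals(catalog)
  source = catalog.fetch('sourceLanguage', 'en')
  catalog.fetch('strings', {}).each_with_object({}) do |(key, body), out|
    plural = body.dig('localizations', source, 'variations', 'plural')
    next unless plural # skip non-plural catalog entries

    other = plural.dig('other', 'stringUnit', 'value')
    raise ArgumentError, "Plural '#{key}' is missing its English `other`" if other.nil? || other.empty?

    CLDR_CATEGORIES.each do |cat|
      # Assumption: English fills categories it does not distinguish with `other`.
      value = plural.dig(cat, 'stringUnit', 'value') || other
      # Assumption: the category is appended to the translator comment.
      comment = [body['comment'], "Plural category: #{cat}"].compact.join(' ')
      out["#{key}|==|plural.#{cat}"] = { value: value, comment: comment }
    end
  end
end
```

With two seed keys this yields 2 × 6 = 12 originals, matching the count the lane reports in the testing instructions.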
`generate_plural_strings_for_glotpress` now returns the serialized `.strings` text instead of writing `WordPress/Resources/Plurals.strings`. The merge in `generate_strings_file_for_glotpress` owns a `Dir.mktmpdir`, writes the returned string there, and merges from it — so the transient originals never touch the working tree. Drops the now-unused `PLURALS_FLAT_STRINGS` constant and its `.gitignore` entry; run standalone, the lane just prints a sample.
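The temp-dir ownership can be sketched as follows. The helper name and its `basename:` parameter are hypothetical; only `Dir.mktmpdir` and the write-then-merge order come from the commit message.

```ruby
require 'tmpdir'

# Write transient `.strings` text into a fresh temp dir, yield the file path,
# and let Dir.mktmpdir remove the directory when the block exits.
# Returns whatever the block returns.
def with_transient_strings_file(contents, basename: 'Plurals.strings')
  Dir.mktmpdir('plural-originals') do |dir|
    path = File.join(dir, basename)
    File.write(path, contents)
    yield path
  end
end
```

The merge has to happen inside the block, because the file no longer exists once the block returns. That is what keeps the originals out of the working tree.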
…view`
Adds a `lint_swiftui_preview_strings` fastlane lane and a "Lint SwiftUI
Preview Strings" Buildkite step so the cleanup can't regress: a preview
shipping a translatable `Text("literal")` now fails CI with the exact
file:line and the fix (wrap in `Text(verbatim:)`).
Reuses Apple's extractor as its own oracle — `xcstringstool extract
--SwiftUI-Text` tags every extracted string's `visibility`, and #Preview /
PreviewProvider literals come out as `visibility: "preview"`. No source
parsing or brace-tracking, and tautologically consistent with what
extraction actually pulls. Self-contained (no dependency on the #25688
catalog lane). Green on the current tree; verified it fails on an injected
preview literal.
Documents the `verbatim:` rule and the gate in docs/localization.md.
…view`
Adds a `lint_swiftui_preview_strings` fastlane lane and a "Lint SwiftUI
Preview Strings" Buildkite step so preview strings can't regress: a `#Preview`
shipping a translatable `Text("literal")` now fails CI with the exact file:line
and the fix (wrap in `Text(verbatim:)`).
Reuses Apple's extractor as its own oracle — `xcstringstool extract
--SwiftUI-Text` tags every extracted string's `visibility`, and #Preview /
PreviewProvider literals come out as `visibility: "preview"`. No source parsing
or brace-tracking, and tautologically consistent with what extraction pulls.
Self-contained (no dependency on the #25688 catalog lane). Green on the current
tree; verified it fails on an injected preview literal.
Documents the `verbatim:` rule and the gate in docs/localization.md.
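A rough sketch of the filtering step, for orientation only. It assumes the extractor output has already been parsed into an array of hashes with `file`, `line`, `value`, and `visibility` fields; that record shape is an assumption, as are the method names. Only the `visibility: "preview"` tag and the `Text(verbatim:)` fix come from the commit message.

```ruby
PreviewHit = Struct.new(:file, :line, :value)

# Keep only the extracted strings tagged as coming from a preview,
# sorted so the CI output is stable.
def preview_string_violations(records)
  records
    .select { |r| r['visibility'] == 'preview' }
    .map { |r| PreviewHit.new(r['file'], r['line'], r['value']) }
    .sort_by { |hit| [hit.file, hit.line] }
end

# One file:line message per violation, including the suggested fix.
def format_violation(hit)
  "#{hit.file}:#{hit.line}: translatable preview string #{hit.value.inspect}; wrap it in Text(verbatim:)"
end
```

A lane built this way would fail when `preview_string_violations` returns a non-empty list and print one formatted line per hit.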


Moves WordPress-iOS localization toward String Catalogs, in two stages: plurals through GlotPress, and a build-free `genstrings` replacement. The plural round-trip is wired into the release flow (failsafe), running alongside the app-strings pipeline; catalog generation runs as a build-free CI coverage gate. Nothing is read at runtime yet; the cutover is a separate migration.

Summary
- A `.strings` file can't represent CLDR plural categories; a String Catalog can. Plurals ride the main app GlotPress project (no separate project), and the round-trip needs no app build.
- Replaces `genstrings` with Apple's `xcstringstool`, keeping `AppLocalizedString` and its ~379 call sites unchanged, with a CI gate proving the catalog loses no string `genstrings` finds.

Stage 1: Plurals
`fastlane/lanes/localization_plurals.rb`, `plural_strings_helper.rb`, `WordPress/Classes/Plurals.xcstrings`

Plurals are authored in `Plurals.xcstrings` (English `one`/`other`) and carried through the main app GlotPress project as flat strings keyed `<key>|==|plural.<cldr-category>` (the same id Apple's `xcodebuild -exportLocalizations` uses), so no plural-rules map is needed and every locale is covered, including Welsh (6 forms), which the gettext/`.stringsdict` path can't.

- Forward (no build): the flat originals are merged into `Localizable.strings` (like `MANUALLY_MAINTAINED_STRINGS_FILES`) so they upload with the app strings. One original per CLDR category: the full set (zero/one/two/few/many/other), unconditionally; English fills the categories it doesn't itself distinguish. No locale list is read here: the union of categories over any real locale set is always all six (Arabic/Welsh use them all), and over-emitting is harmless, because the reverse folds only the categories each locale actually needs. Fails loudly if a plural is missing its English `other`.
- Reverse (no build): the flat keys are read back out of the downloaded `Localizable.strings` and folded straight into the catalog JSON. Each cell is `human ?? AI ?? English source`; AI / English-fallback cells are flagged `needs_review`. The flat keys stay in `Localizable.strings` as harmless, unused-at-runtime entries, like the merged `infoplist.*` keys.
- Per-locale CLDR category map (`ja` → `other`; `pl` → `one/few/many/other`). The reverse derives that map fresh at fold time from Apple's exporter, but run against a throwaway one-plural Swift package, not the app, so it's a few-second `xcodebuild -exportLocalizations` over an SPM stub with no app build or bootstrap. The categories are a property of the locale's CLDR, not of our strings, so the stub yields the same per-locale sets as exporting the whole app. Nothing is committed and nothing is cached, so the map can't lag the ship-locale list (33 today) or Apple's CLDR. (Apple's shipped CLDR gives `es`/`fr`/`it`/`pt` = `one, other`, no `many`, which is exactly why this isn't hand-authored.)
- `.strings` reading is delegated to the release toolkit's `read_strings_file_as_hash` (plutil), with no hand-rolled parsing. `PluralStrings` is pure Ruby and unit-tested.

The AI tier is a stub today (`nil` → English fallback); the hook and `needs_review` flagging are in place for when it's wired.

The two entries in `Plurals.xcstrings` (`blogging.reminders.weeklyCount`, `editor.textCounter.wordCount`) are seed fixtures to exercise the round-trip end-to-end. They aren't referenced by any call site yet (the legacy formatters still do their own count-based logic), so real plurals get wired (or these seeds replaced) at the cutover.

Stage 2: Catalog generation
`fastlane/lanes/localization_catalog.rb`, `catalog_helper.rb`

- `generate_strings_catalog`: `xcstringstool extract --legacy-localizable-strings --modern-localizable-strings -s AppLocalizedString` → `sync` into `Localizable.xcstrings`. No app build. `-s AppLocalizedString` is the same flag `genstrings` uses today, so the custom routine and its ~379 call sites are untouched. Each extract chunk gets its own output dir, so same-basename `.stringsdata` don't overwrite each other.
- `xcstringstool sync` doesn't reconcile a key whose English source value changed; the lane re-derives current English from a fresh extraction and flips changed keys' translations to `needs_review` (walking device/width variations, not just flat units). This is staged for cutover: with the catalog gitignored and regenerated fresh each run, there are no persisted translations to re-flag yet.
- Coverage gate (`verify_strings_catalog`, CI): runs `genstrings` over the same source and fails on any key the catalog is missing, proving the build-free extraction loses nothing. Both sides read via `read_strings_file_as_hash`.

Testing instructions
All checks below are build-free: no full `rake dependencies`, just the gems (`bundle install`) and Xcode (for `xcstringstool`/`genstrings`). No app build, SPM resolve, Gutenberg clone, or secrets. Step 1 writes only a gitignored artifact and step 3 writes nothing, so `git status` stays clean.

1. Build-free catalog extraction (Stage 2). Generate the catalog from source:

`bundle exec fastlane ios generate_strings_catalog`

✅ `Generated Localizable.xcstrings with 3923 keys (…)`, ~1 minute. The `genstrings` replacement with no app build; writes the (gitignored) `WordPress/Resources/Localizable.xcstrings`.

2. Coverage gate (Stage 2). Prove the build-free extraction loses nothing vs. `genstrings`:

`bundle exec fastlane ios verify_strings_catalog`

✅ `Localizable.xcstrings covers all 3923 genstrings keys.` (otherwise it fails non-zero and lists the missing keys; this is the gate CI runs).

To prove the gate isn't vacuous, delete any entry from the just-generated `Localizable.xcstrings` and re-run: it exits 1 naming the gap (`MISSING from catalog: "…"`). Re-running step 1 regenerates the key and the gate passes again (the catalog is a throwaway artifact).

3. Plural forward (Stage 1). Expand the seed plurals into the GlotPress flat originals:

`bundle exec fastlane ios generate_plural_strings_for_glotpress`

✅ `Generated 12 flat plural originals from 2 catalog keys.` followed by a printed `Sample:`. The lane writes nothing; it returns the originals as a string, and at code freeze `generate_strings_file_for_glotpress` writes them to a temp file and merges them into `Localizable.strings`. Each seed plural expands to one `"<key>|==|plural.<category>"` entry per CLDR category: the full set (zero/one/two/few/many/other), English filling the categories it doesn't itself distinguish.

Not covered above (and why): the plural reverse fold (`download_localized_plurals`) needs a GlotPress round-trip, so it's exercised by the unit suite and runs at code freeze. Its per-locale CLDR map is derived inline from a throwaway one-plural Swift package (a few-second `xcodebuild -exportLocalizations` over an SPM stub, no app workspace or bootstrap), so there's nothing committed to verify here.

Status
- … `genstrings` (steps 1–2)
- `PluralStrings` / `CatalogHelper` unit suites pass (local harness, pure Ruby); lanes load; rubocop clean
- … `generate_strings_file_for_glotpress` / `download_localized_strings`, failsafe (logged, never raised)
- … `verify-strings-catalog.sh`): hard-fails on any gap
- … `Localizable.strings`

Notes
- `Plurals.xcstrings` lives in a synchronized folder (`WordPress/Classes/`) so it auto-joins the target (compiles to `.stringsdict`) and is visible to `-exportLocalizations`; `WordPress/Resources/` is explicitly-referenced (non-synced).
- … `-exportLocalizations` is a few-second SPM export with no Mac Catalyst build, sidestepping the binary-dependency slice problems a full-app export hits (e.g. the Zendesk xcframework's missing `maccatalyst` slice).

Not in this PR (deliberate)
- … `Localizable.strings`, the catalog as the committed backing store, unifying `Plurals.xcstrings` into the one catalog.
- … `human ?? AI ?? English`), but the service is a stub.
- … `Localizable.xcstrings`, `Plurals.strings`): CI-generated, gitignored, not committed.
- … `gutenberg_path:` param) and other-table catalogs.