Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 6 additions & 2 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -14,11 +14,15 @@ _None_

### Bug Fixes

_None_
- `StringsFileValidationHelper.find_duplicated_keys` now parses *unquoted* keys and values (valid `.strings` syntax, common in `InfoPlist.strings`) instead of raising `Invalid character`, matching the character set `plutil` accepts for unquoted strings (alphanumerics plus `_ . - $ : /`). This lets `ios_lint_localizations`' `check_duplicate_keys` work on `InfoPlist.strings`-style files. [#741]
- `StringsFileValidationHelper.find_duplicated_keys` now also accepts comments placed *between* the tokens of a statement (e.g. `"key" /* note */ = "value";`), which `plutil` allows, instead of raising `Invalid character` on the `/`. [#741]
- `ios_lint_localizations`' `check_duplicate_keys` no longer crashes the lane on a file that parses as a property list but isn't a flat `.strings` (e.g. a nested-dictionary value); it now warns via `UI.important` and skips that file. [#741]
- `L10nHelper.merge_strings` now applies the key prefix via a comment-aware tokenizer, so it correctly prefixes unquoted keys containing `. - $ : /`, lines with an *unquoted* value (e.g. `CFBundleName = WordPress;`), and keys sitting behind an inter-token `.strings` comment (e.g. `CFBundleName /* note */ = WordPress;`) — matching the grammar `plutil` accepts. Previously the line-based matcher left those keys written to the merged file without the prefix while still bookkeeping them *with* it, leaving the output inconsistent with the reported keys (which could resurface the very collisions the prefix avoids and break downstream key extraction). [#741]

### Internal Changes

_None_
- Centralized `.strings` duplicate-key detection behind `StringsFileValidationHelper.scan_for_duplicate_keys`, which detects the file format and returns a `[:scanned | :unsupported_format | :unscannable, payload]` tri-state that callers map to their own policy (warn-and-skip vs fail-closed). `ios_lint_localizations` now uses it. [#741]
- `L10nHelper.strings_file_type` (and `StringsFileValidationHelper.scan_for_duplicate_keys`) accept `assume_valid:` to skip a redundant `plutil -lint` when the caller has already parsed the file. [#741]

## 14.7.0

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -56,15 +56,20 @@ def self.find_duplicated_keys(params)
language = File.basename(File.dirname(file), '.lproj')
path = File.join(params[:input_dir], file)

file_type = Fastlane::Helper::Ios::L10nHelper.strings_file_type(path: path)
if file_type == :text
duplicates = Fastlane::Helper::Ios::StringsFileValidationHelper.find_duplicated_keys(file: path)
duplicate_keys[language] = duplicates.map { |key, value| "`#{key}` was found at multiple lines: #{value.join(', ')}" } unless duplicates.empty?
else
status, payload = Fastlane::Helper::Ios::StringsFileValidationHelper.scan_for_duplicate_keys(file: path)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice appraoch.

case status
when :scanned
duplicate_keys[language] = payload.map { |key, value| "`#{key}` was found at multiple lines: #{value.join(', ')}" } unless payload.empty?
when :unsupported_format
UI.important <<~WRONG_FORMAT
File `#{path}` is in #{file_type} format, while finding duplicate keys only make sense on files that are in ASCII-plist format.
Since your files are in #{file_type} format, you should probably disable the `check_duplicate_keys` option from this `#{action_name}` call.
File `#{path}` is in #{payload} format, while finding duplicate keys can only occurr on files that are in ASCII-plist format.
Since your files are in #{payload} format, you should probably disable the `check_duplicate_keys` option from this `#{action_name}` call.
WRONG_FORMAT
when :unscannable
UI.important <<~UNSCANNABLE
Could not check `#{path}` for duplicate keys: #{payload.strip}
The file parses as a property list but isn't a flat `.strings` file the duplicate-key scanner understands, so duplicate detection was skipped for it.
UNSCANNABLE
end
end

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -15,18 +15,24 @@ class L10nHelper
# Returns the type of a `.strings` file (XML, binary or ASCII)
#
# @param [String] path The path to the `.strings` file to check
# @param [Boolean] assume_valid Skip the `plutil -lint` validity check when the caller has already
# confirmed the file parses (e.g. via `read_strings_file_as_hash`), avoiding a redundant
# `plutil` invocation. Only the format detection (`file`) then runs.
# @return [Symbol] The file format used by the `.strings` file. Can be one of:
# - `:text` for the ASCII-plist file format (containing typical `"key" = "value";` lines)
# - `:xml` for XML plist file format (can be used if machine-generated, especially since there's no official way/tool to generate the ASCII-plist file format as output)
# - `:binary` for binary plist file format (usually only true for `.strings` files converted by Xcode at compile time and included in the final `.app`/`.ipa`)
# - `nil` if the file does not exist or is neither of those format (e.g. not a `.strings` file at all)
#
def self.strings_file_type(path:)
def self.strings_file_type(path:, assume_valid: false)
return :text if File.empty?(path) # If completely empty file, consider it as a valid `.strings` files in textual format

# Start by checking it seems like a valid property-list file (and not e.g. an image or plain text file)
_, status = Open3.capture2('/usr/bin/plutil', '-lint', path)
return nil unless status.success?
# Start by checking it seems like a valid property-list file (and not e.g. an image or plain text file).
# A caller that has already parsed the file can skip this redundant check via `assume_valid: true`.
unless assume_valid
_, status = Open3.capture2('/usr/bin/plutil', '-lint', path)
return nil unless status.success?
end

# If it is a valid property-list file, determine the actual format used
format_desc, status = Open3.capture2('/usr/bin/file', path)
Expand Down Expand Up @@ -93,15 +99,13 @@ def self.merge_strings(paths:, output_path:)
all_keys_found += string_keys

tmp_file.write("/* MARK: - #{File.basename(input_file)} */\n\n")
# Read line-by-line to reduce memory footprint during content copy
read_utf8_lines(input_file).each do |line|
unless prefix.nil? || prefix.empty?
# The `/u` modifier on the RegExps is to make them UTF-8
line.gsub!(/^(\s*")/u, "\\1#{prefix}") # Lines starting with a quote are considered to be start of a key; add prefix right after the quote
line.gsub!(/^(\s*)([A-Z0-9_]+)(\s*=\s*")/ui, "\\1\"#{prefix}\\2\"\\3") # Lines starting with an identifier followed by a '=' are considered to be an unquoted key (typical in InfoPlist.strings files for example)
end
tmp_file.write(line)
end
# Add the prefix to every key. We tokenize via `StringsFileValidationHelper.prefix_keys` rather than
# matching keys with a line-based regex, so that keys are found regardless of where `.strings` comments
# sit (e.g. `CFBundleName /* note */ = WordPress;`) and `key = value`-looking text inside a comment is
# left alone — keeping the written keys consistent with the (`plutil`-derived) keys bookkept above.
lines = read_utf8_lines(input_file)
lines = Fastlane::Helper::Ios::StringsFileValidationHelper.prefix_keys(lines: lines, prefix: prefix)
lines.each { |line| tmp_file.write(line) }
tmp_file.write("\n")
end
tmp_file.close # ensure we flush the content to disk
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -5,32 +5,83 @@ module Helper
module Ios
class StringsFileValidationHelper
# context can be one of:
# :root, :maybe_comment_start, :in_line_comment, :in_block_comment,
# :maybe_block_comment_end, :in_quoted_key,
# :after_quoted_key_before_eq, :after_quoted_key_and_equal,
# :in_quoted_value, :after_quoted_value
State = Struct.new(:context, :buffer, :in_escaped_ctx, :found_key)
# :root, :maybe_comment_start, :maybe_comment_or_value, :in_line_comment, :in_block_comment,
# :maybe_block_comment_end, :in_quoted_key, :in_unquoted_key,
# :after_quoted_key_before_eq, :after_quoted_key_and_eq,
# :in_quoted_value, :in_unquoted_value, :after_quoted_value
#
# `resume_context` holds the context to return to once a comment ends. Comments are valid not only at
# the top level but also *between* the tokens of a statement (e.g. `"key" /* note */ = "value";`), so a
# comment must resume the state it interrupted rather than always dropping back to `:root`.
State = Struct.new(:context, :buffer, :in_escaped_ctx, :found_key, :resume_context)

# Characters allowed in an *unquoted* string — a key or a value. Unquoted strings are valid
# `.strings` syntax (the old-style ASCII property-list format) and are common in `InfoPlist.strings`
# (e.g. `CFBundleName = WordPress;`). `plutil` accepts alphanumerics plus `_ . - $ : /` in an
# unquoted string, so we match the same set: this scanner only ever runs on input `plutil` has
# already accepted, and matching its grammar keeps a file it parses from tripping the scanner.
# An unquoted key runs until the first whitespace or `=`; an unquoted value until whitespace or `;`.
UNQUOTED_STRING_CHARACTER = %r{[a-zA-Z0-9_.$:/-]}u

# Enter a comment from an inter-token position, remembering where to resume once it ends.
# (`state.context` is still the originating state when a transition lambda runs — it's only reassigned
# to the lambda's return value afterwards — so this captures the state the comment interrupts.)
ENTER_COMMENT = lambda do |state, _c|
state.resume_context = state.context
:maybe_comment_start
end

# A `/` between `=` and the value is ambiguous: it can start a comment (`/* … */` or `// …`) or be the
# first character of an unquoted value (e.g. a path like `/usr/bin`). Defer the decision by one character
# via `:maybe_comment_or_value`.
ENTER_COMMENT_OR_VALUE = lambda do |state, _c|
state.resume_context = state.context
:maybe_comment_or_value
end

# Restore the context a comment interrupted (defaults to `:root`), then clear the saved context.
RESUME_AFTER_COMMENT = lambda do |state, _c|
resume = state.resume_context || :root
state.resume_context = :root
resume
end

TRANSITIONS = {
root: {
/\s/u => :root,
'/' => :maybe_comment_start,
'"' => :in_quoted_key
'"' => :in_quoted_key,
# An unquoted key, e.g. `CFBundleName = "…";` as used by `InfoPlist.strings`.
UNQUOTED_STRING_CHARACTER => lambda do |state, c|
state.buffer.write(c)
:in_unquoted_key
end
},
maybe_comment_start: {
'/' => :in_line_comment,
/\*/u => :in_block_comment
},
# Reached only from `:after_quoted_key_and_eq`, where a leading `/` might begin a comment or an
# unquoted value. A `*` or `/` confirms a comment; anything else means the `/` was the value's first
# character (values aren't buffered, so we just continue scanning the value).
maybe_comment_or_value: {
/\*/u => :in_block_comment,
'/' => :in_line_comment,
/./mu => lambda do |state, _c|
state.resume_context = :root
:in_unquoted_value
end
},
in_line_comment: {
"\n" => :root,
"\n" => RESUME_AFTER_COMMENT,
/./u => :in_line_comment
},
in_block_comment: {
/\*/ => :maybe_block_comment_end,
/./mu => :in_block_comment
},
maybe_block_comment_end: {
'/' => :root,
'/' => RESUME_AFTER_COMMENT,
/./mu => :in_block_comment
},
in_quoted_key: {
Expand All @@ -44,19 +95,48 @@ class StringsFileValidationHelper
:in_quoted_key
end
},
in_unquoted_key: {
# The key ends at the first whitespace or `=`. Whitespace still expects an `=` next;
# an `=` moves straight on to the value.
/[\s=]/u => lambda do |state, c|
state.found_key = state.buffer.string.dup
state.buffer = StringIO.new
c == '=' ? :after_quoted_key_and_eq : :after_quoted_key_before_eq
end,
UNQUOTED_STRING_CHARACTER => lambda do |state, c|
state.buffer.write(c)
:in_unquoted_key
end
},
after_quoted_key_before_eq: {
# A comment may sit between the key and the `=` (e.g. `"key" /* note */ = "value";`).
'/' => ENTER_COMMENT,
/\s/u => :after_quoted_key_before_eq,
'=' => :after_quoted_key_and_eq
},
after_quoted_key_and_eq: {
# A `/` here may start a comment or an unquoted value (which can contain `/`); disambiguate one char
# later. This entry must precede `UNQUOTED_STRING_CHARACTER` below, which also matches `/`.
'/' => ENTER_COMMENT_OR_VALUE,
/\s/u => :after_quoted_key_and_eq,
'"' => :in_quoted_value
'"' => :in_quoted_value,
# An unquoted value, e.g. `CFBundleName = WordPress;` as used by `InfoPlist.strings`.
UNQUOTED_STRING_CHARACTER => :in_unquoted_value
},
in_quoted_value: {
'"' => :after_quoted_value,
/./mu => :in_quoted_value
},
in_unquoted_value: {
# The value ends at the first whitespace or the terminating `;`. Its contents are irrelevant
# to duplicate-key detection, so — unlike a key — we don't buffer it.
';' => :root,
/\s/u => :after_quoted_value,
UNQUOTED_STRING_CHARACTER => :in_unquoted_value
},
after_quoted_value: {
# A comment may sit between the value and the terminating `;` (e.g. `"key" = "value" /* note */;`).
'/' => ENTER_COMMENT,
/\s/u => :after_quoted_value,
';' => :root
}
Expand All @@ -70,7 +150,7 @@ class StringsFileValidationHelper
def self.find_duplicated_keys(file:)
keys_with_lines = Hash.new { |h, k| h[k] = [] }

state = State.new(context: :root, buffer: StringIO.new, in_escaped_ctx: false, found_key: nil)
state = State.new(context: :root, buffer: StringIO.new, in_escaped_ctx: false, found_key: nil, resume_context: :root)

# Using our `each_utf8_line` helper instead of `File.readlines` ensures we can also read files that are
# encoded in UTF-16, yet process each of their lines as a UTF-8 string, so that `RegExp#match?` don't throw
Expand Down Expand Up @@ -106,6 +186,73 @@ def self.find_duplicated_keys(file:)

keys_with_lines.keep_if { |_, lines| lines.count > 1 }
end

# Detects the file format and, when applicable, scans for duplicate keys — in one step, so
# callers don't each re-implement the "`:text`-only" gate. `find_duplicated_keys` only
# understands the flat ASCII-plist syntax; an xml/binary plist can't be tokenized by it (though
# `plutil` collapses any duplicate to its last value when parsing those anyway).
#
# @param [String] file The path to the `.strings` file to inspect.
# @param [Boolean] assume_valid Forwarded to `strings_file_type`: skip the redundant `plutil -lint`
# when the caller has already confirmed the file parses.
# @return [Array] A `[status, payload]` pair, one of:
# - `[:scanned, { key => [lines] }]` — a `:text` file we tokenized (hash empty if none).
# - `[:unsupported_format, format]` — not a `:text` file (`:xml`, `:binary`, or `nil`); not scanned.
# - `[:unscannable, error_message]` — a `:text` file the tokenizer couldn't read.
# Each caller decides how to react (warn-and-skip, fail closed, …) from this one source of truth.
def self.scan_for_duplicate_keys(file:, assume_valid: false)
format = Fastlane::Helper::Ios::L10nHelper.strings_file_type(path: file, assume_valid: assume_valid)
return [:unsupported_format, format] unless format == :text

[:scanned, find_duplicated_keys(file: file)]
rescue StandardError => e
[:unscannable, e.message]
end

# Rewrites `.strings` lines so every key carries `prefix`, leaving comments, values, whitespace and
# formatting untouched. A quoted key gets the prefix inside its quotes (`"key"` → `"<prefix>key"`); an
# unquoted key is wrapped in quotes (`key` → `"<prefix>key"`). Because it tokenizes the file the same way
# `find_duplicated_keys` does, it is comment-aware: a key sitting behind an inter-token comment (e.g.
# `key /* note */ = value;`) is still prefixed, and `key = value`-looking text *inside* a comment is left
# alone — a distinction a line-based regex can't reliably make.
#
# @param [Array<String>] lines The file's lines, already decoded to UTF-8 (e.g. via `L10nHelper.read_utf8_lines`).
# @param [String] prefix The prefix to insert before every key. A nil/empty prefix returns `lines` unchanged.
# @return [Array<String>] The rewritten lines.
def self.prefix_keys(lines:, prefix:)
return lines if prefix.nil? || prefix.empty?

state = State.new(context: :root, buffer: StringIO.new, in_escaped_ctx: false, found_key: nil, resume_context: :root)
lines.map do |line|
rewritten = +''
line.each_char do |c|
# Escaped characters only occur inside quoted strings or comments — never around a key boundary —
# so they're copied through verbatim (mirroring `find_duplicated_keys`' global escape handling).
if state.in_escaped_ctx || c == '\\'
state.in_escaped_ctx = !state.in_escaped_ctx
rewritten << c
next
end

previous_context = state.context
(_, transition) = TRANSITIONS[previous_context].find { |regex, _| c.match?(regex) } || [nil, nil]
raise "Invalid character `#{c}` found (current context: #{previous_context})" if transition.nil?

state.context = transition.is_a?(Proc) ? transition.call(state, c) : transition

if previous_context == :root && state.context == :in_quoted_key
rewritten << c << prefix # opening `"` of a quoted key — the prefix goes inside the quotes
elsif previous_context == :root && state.context == :in_unquoted_key
rewritten << '"' << prefix << c # first char of an unquoted key — open a quote + prefix, then the char
elsif previous_context == :in_unquoted_key && state.context != :in_unquoted_key
rewritten << '"' << c # the unquoted key just ended — close the quote, then the delimiter
else
rewritten << c
end
end
rewritten
end
end
end
end
end
Expand Down
Loading