Resource files

This section explains, in general terms, how LingoHub imports and exports resource files. Documentation specific to each supported resource file type can be found in the child pages.
Those pages explain the options LingoHub provides for managing each type of resource file.

Parsing the file name

LingoHub first tries to understand the filename of the uploaded file. The following three file name formats can be used:

  1. <basename><locale-separator><locale>.<extension> (e.g., “activerecord_en-US.yml”, “public.de_AT.xml”)
  2. <locale>.<extension> (e.g., “en.yml”, “de-AT.strings”)
  3. <basename>.<extension> (e.g., “Localizable.strings”)

LingoHub will then try to extract the <basename> and the <locale> of this filename.

For 1) it will try to find the <locale> information. If this is preceded by a <locale-separator> (which can be either “.” or “_”), the part before this locale separator is interpreted as the basename.

For 2) it will detect that there is no <basename> information, just the <locale> information.

For 3) LingoHub will not find any <locale> information and therefore will not know which language the imported translations belong to. The import process will stop here and you will be asked to provide the language information in the “Resource Imports” view.
There is one exception to this rule: some resource file formats, such as Rails I18n YAML, XLIFF, and CSV, carry the locale information in the file content, and LingoHub will extract the <locale> from there.
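
The three file name formats can be told apart with a small pattern match. The following is only a minimal sketch of that idea, not LingoHub's actual parser: the names parse_filename, ParsedName, and LOCALE are hypothetical, and the locale pattern (two lowercase letters, optionally followed by a two-letter uppercase region) is an assumption made for illustration.

```python
import re
from typing import NamedTuple, Optional

# Assumed locale shape: "en", "en-US" or "en_US". The real list of
# supported locales is broader than this illustrative pattern.
LOCALE = r"[a-z]{2}(?:[-_][A-Z]{2})?"


class ParsedName(NamedTuple):
    basename: Optional[str]
    locale: Optional[str]
    extension: str


def parse_filename(filename: str) -> ParsedName:
    """Split a file name into basename, locale and extension."""
    stem, _, extension = filename.rpartition(".")

    # Format 1: <basename><locale-separator><locale>, separator "." or "_"
    match = re.fullmatch(rf"(.+)[._]({LOCALE})", stem)
    if match:
        return ParsedName(match.group(1), match.group(2), extension)

    # Format 2: <locale> only, no basename
    if re.fullmatch(LOCALE, stem):
        return ParsedName(None, stem, extension)

    # Format 3: <basename> only; the locale has to come from elsewhere
    return ParsedName(stem, None, extension)


for name in ("activerecord_en-US.yml", "en.yml", "Localizable.strings"):
    print(name, "->", parse_filename(name))
```

A pattern this simple will also treat a basename that happens to end in two lowercase letters after a separator (for example “my_id.yml”) as carrying a locale, which is one reason a real implementation would check candidates against the list of supported locales.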

When using LingoHub SCM integration

We recommend using our SCM integration (GitHub, BitBucket) to synchronize the resource files in your repository with your LingoHub project.

If you choose to do so, LingoHub has the full path information of your resource files. If LingoHub is unable to extract the locale information from the filename, it will try to find the locale information in a path segment, e.g.:

  • resources/en/strings.xml
  • root/en-US/Localizable.strings
  • root/en.lproj/Localizable.strings
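
The path-based fallback can be pictured as a search through the directory names of the path. This is a minimal sketch under assumptions: the function name locale_from_path is hypothetical, the locale pattern is the same illustrative one as before, and checking the directory closest to the file first is a design choice of this sketch, not documented behaviour.

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# Assumed locale shape, optionally followed by the iOS ".lproj" suffix.
LOCALE = r"[a-z]{2}(?:[-_][A-Z]{2})?"
SEGMENT = re.compile(rf"({LOCALE})(?:\.lproj)?")


def locale_from_path(path: str) -> Optional[str]:
    """Return the first path segment that looks like a locale, if any."""
    directories = PurePosixPath(path).parts[:-1]  # skip the file name
    for segment in reversed(directories):  # nearest directory first
        match = SEGMENT.fullmatch(segment)
        if match:
            return match.group(1)
    return None


for path in (
    "resources/en/strings.xml",
    "root/en-US/Localizable.strings",
    "root/en.lproj/Localizable.strings",
):
    print(path, "->", locale_from_path(path))
```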

Locale information

The language part of the <locale> information in the filename has to be compliant with ISO 639-1. An optional region (e.g., “US” in “en-US”) follows the region separator.
For convenience, we also accept “_” as an alternative region separator, so you can use either “en-US” or “en_US”.
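
Treating both separators as equivalent amounts to normalizing one into the other before comparing. A minimal sketch, with hypothetical helper names and with “-” chosen arbitrarily as the canonical form:

```python
def normalize_locale(locale: str) -> str:
    """Return the locale with "-" as the region separator."""
    return locale.replace("_", "-")


def same_locale(first: str, second: str) -> bool:
    """Compare two locales regardless of which separator they use."""
    return normalize_locale(first) == normalize_locale(second)


print(same_locale("en-US", "en_US"))
```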

The list of supported locales can be found here.

Parsing the file content

Determining the character encoding

Most resource file types do not specify a character encoding for their files, or recommend one only as a best practice. When a file is imported, LingoHub therefore has to determine its character encoding.
This is done by trying to parse the file using several encodings. If parsing succeeds, LingoHub takes the encoding used as the correct one. If LingoHub is unable to detect the charset, the import will fail. This usually means that the file is corrupt in some way.
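
The trial-and-error approach can be sketched as a loop over candidate encodings. This is an illustration only: the function name detect_encoding and the candidate list are assumptions (the encodings LingoHub actually tries are not named here), and the sketch only decodes the bytes, whereas the real import also parses the decoded content.

```python
from typing import Sequence, Tuple

# Hypothetical candidate list, tried in order. An encoding such as
# ISO-8859-1 accepts any byte sequence, so it could only ever go last.
CANDIDATE_ENCODINGS = ("utf-8", "utf-16")


def detect_encoding(
    data: bytes, candidates: Sequence[str] = CANDIDATE_ENCODINGS
) -> Tuple[str, str]:
    """Return (encoding, text) for the first candidate that decodes cleanly."""
    for encoding in candidates:
        try:
            return encoding, data.decode(encoding)
        except UnicodeDecodeError:
            continue  # wrong guess, try the next candidate
    raise ValueError("unable to detect the character encoding")


encoding, text = detect_encoding("welcome=Ümlauts".encode("utf-16"))
print(encoding, text)
```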

When exporting a file, LingoHub will use the most common character encoding for the given resource file type. This can be overridden in the project settings in the Import/Export tab.

Extracting segments

After LingoHub has found the locale information, it parses the content of the file and extracts the title of each translation as well as its content. How the content is imported can be customized in the Import/Export tab of the project settings.

The title extraction is different for every file format. Most of the time it is clear how to extract the title, e.g., for hierarchical formats like YAML or XLIFF the nested keys are concatenated (chained together) to form a unique key.
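
For a hierarchical format, chaining the keys together means walking the tree and joining the keys on the way down to each value. A minimal sketch, assuming the parsed file is available as nested dictionaries; the function name flatten and the “.” separator are assumptions for illustration.

```python
from typing import Any, Dict, Iterator, Tuple


def flatten(
    tree: Dict[str, Any], prefix: str = "", separator: str = "."
) -> Iterator[Tuple[str, Any]]:
    """Yield (title, content) pairs, chaining nested keys into one title."""
    for key, value in tree.items():
        title = f"{prefix}{separator}{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Descend and keep extending the chained key
            yield from flatten(value, title, separator)
        else:
            yield title, value


parsed = {"welcome": {"title": "Hello", "body": "Nice to see you"}, "bye": "Goodbye"}
for title, content in flatten(parsed):
    print(title, "=", content)
```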

If the import setting ‘Text segment keys should be unique per’ is set to “Project”, the title has to be unique throughout the whole LingoHub project. If a title is present in one resource file, it must not be present in another file with a different basename. An example can be found below:

  • You upload the file “public.en.properties” containing the title “welcome”.
  • You upload the file “general.en.properties” containing the title “welcome”.
  • LingoHub assumes that the key has been moved from the file “public” to the file “general” and updates its database accordingly.
  • If you download the file “general.en.properties” the segment “welcome” will be present, but it will not be present in “public.en.properties” anymore.

If the import setting ‘Text segment keys should be unique per’ is set to “Resource File”, the title has to be unique only within a resource file. With this option, it is possible to have more than one segment with the same title in a LingoHub project. An example can be found below:

  • You upload the file “public.en.properties” containing the title “welcome”.
  • You upload the file “general.en.properties” containing the title “welcome”.
  • Both resource files “public” and “general” contain a segment with the title “welcome”. LingoHub handles them as two different segments that might have different content.
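
The difference between the two settings comes down to what identifies a segment: the title alone, or the title together with the resource file. The following toy model replays both examples. It is a sketch under assumptions, not LingoHub's data model; the class SegmentStore and the setting values "project" and "resource_file" are hypothetical names.

```python
from typing import Dict, List, Tuple


class SegmentStore:
    """Toy model of segment identity under the two uniqueness settings."""

    def __init__(self, unique_per: str) -> None:
        if unique_per not in ("project", "resource_file"):
            raise ValueError("unique_per must be 'project' or 'resource_file'")
        self.unique_per = unique_per
        # Maps a segment's identity to the basename of the file that holds it
        self._segments: Dict[Tuple[str, ...], str] = {}

    def import_title(self, basename: str, title: str) -> None:
        if self.unique_per == "project":
            key = (title,)  # same title in another file: the segment moves
        else:
            key = (basename, title)  # same title in another file: new segment
        self._segments[key] = basename

    def titles_in(self, basename: str) -> List[str]:
        return sorted(k[-1] for k, owner in self._segments.items() if owner == basename)


for setting in ("project", "resource_file"):
    store = SegmentStore(setting)
    store.import_title("public", "welcome")
    store.import_title("general", "welcome")
    print(setting, store.titles_in("public"), store.titles_in("general"))
```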

If you want to learn more about text segment key uniqueness, be sure to read our guide.

Segments can also be deactivated. An example can be found below:

  • You upload the file “public.en.properties” containing the title “welcome”.
  • You upload the file “public.en.properties” again, but now the title “welcome” is not defined in the file anymore.
  • LingoHub assumes that the segment “welcome” was deleted during your development process.
  • LingoHub will deactivate this segment.
  • The translators will not see it any longer in the editor (therefore they will not translate a segment that is obsolete).
  • If you export this file again the segment will not be present.
  • Importing a file that contains the title again will reactivate the segment with its full history. Any prior translation effort will also be restored.

By default, the deactivation of a segment is only triggered by uploading a file in the source language. This can be customized by changing the import settings in the Import/Export tab.
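
The deactivation life cycle described above can be modelled with a flag on each segment, so that nothing is deleted and earlier translations survive. This is a simplified sketch for a single source-language file; the names Segment, import_source_titles, and export_titles are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Segment:
    title: str
    active: bool = True
    translations: Dict[str, str] = field(default_factory=dict)


def import_source_titles(segments: Dict[str, Segment], titles: List[str]) -> None:
    """Apply one upload of the source-language file to the stored segments."""
    present = set(titles)
    for title in present:
        # New titles are created; known titles are (re)activated as they are,
        # which keeps whatever translations were stored before
        segments.setdefault(title, Segment(title)).active = True
    for title, segment in segments.items():
        if title not in present:
            segment.active = False  # deactivated, not deleted


def export_titles(segments: Dict[str, Segment]) -> List[str]:
    """Only active segments are written to an exported file."""
    return sorted(title for title, segment in segments.items() if segment.active)


segments: Dict[str, Segment] = {}
import_source_titles(segments, ["welcome", "bye"])
segments["welcome"].translations["de"] = "Willkommen"
import_source_titles(segments, ["bye"])             # "welcome" is gone
print(export_titles(segments))
import_source_titles(segments, ["welcome", "bye"])  # "welcome" is back
print(segments["welcome"].translations)
```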

Handling of content

In every resource file format, the content is escaped and quoted, either because the specification of the format requires it or because the character encoding in use cannot represent certain characters.
While importing, LingoHub reverses these rules on the file content it reads to create human-readable strings.

LingoHub performs this transformation to show the translators content like “A text with “Ümlauts” & ελληνικά characters” instead of “A text with \”&#x00DC;mlauts\” &amp; &#x03B5;&#x03BB;&#x03BB;&#x03B7;&#x03BD;&#x03B9;&#x03BA;&#x03AC; characters”.
Translators can hardly read the latter text, and they cannot be expected to enter the correct escaping for their translations.

This also means the originally used escaping of the uploaded file might be lost. If you import a file to LingoHub and export it again, it might differ because of alternatives used for escaping. The exported files will still always be syntactically correct according to the specifications of the resource file format.
Example: The uploaded file contained &#x00DC; for the character “Ü” – LingoHub will export it using &Uuml;. The escaping changed, but the result will still be a correct resource file.
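
The effect can be reproduced with Python's standard html module, used here only as a stand-in for the format-specific rules LingoHub applies. The helper names to_readable and to_stored are hypothetical, and the sketch covers character references only, not the backslash-escaped quotes from the example above.

```python
import html


def to_readable(stored: str) -> str:
    """Resolve character references so translators see plain text."""
    return html.unescape(stored)


def to_stored(readable: str) -> str:
    """Escape only what the format requires when writing the file again."""
    return html.escape(readable, quote=False)


original = "&#x00DC;mlauts &amp; more"
readable = to_readable(original)
exported = to_stored(readable)

print(readable)              # what the translator works with
print(exported)              # valid, but escaped differently than the upload
print(exported == original)  # the original escaping is not preserved
```

Both “&#x00DC;” and “&Uuml;” resolve to the same character, which is why a file exported with a different escaping still carries the same content.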

Handling of comments

If the resource file specification supports comments, LingoHub will additionally import these comments. Some resource file types (like XLIFF, resx/w …) have a clear definition of how to specify comments for a segment. For other, mostly line-based, formats (like properties, iOS strings …) LingoHub takes the following approach:

  • The first comment in the file will be interpreted as a header comment. This will be stored along with the file metadata to export it again.
  • All following comments will be associated with the next segment and will be shown as information for the translators in our editor.
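
For a line-based format, this approach can be sketched as a single pass over the lines. The sketch below is an illustration under assumptions, not LingoHub's parser: it handles only “#” comments and “key=value” lines in the style of a properties file, it treats the first run of consecutive comment lines as the header comment, and the function name parse_properties is hypothetical.

```python
from typing import List, Optional, Tuple

ParsedSegment = Tuple[str, str, str]  # (title, content, comment)


def parse_properties(text: str) -> Tuple[Optional[str], List[ParsedSegment]]:
    """Return the header comment and the segments with their comments."""
    header: Optional[str] = None
    header_done = False
    block: List[str] = []    # current run of consecutive comment lines
    pending: List[str] = []  # comments waiting for the next segment
    segments: List[ParsedSegment] = []

    def close_block() -> None:
        nonlocal header, header_done
        if not block:
            return
        if not header_done:
            header = "\n".join(block)  # first comment in the file
            header_done = True
        else:
            pending.extend(block)      # belongs to the next segment
        block.clear()

    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            block.append(stripped[1:].strip())
            continue
        close_block()
        if "=" in stripped:
            key, _, value = stripped.partition("=")
            segments.append((key.strip(), value.strip(), "\n".join(pending)))
            pending.clear()
    close_block()
    return header, segments


sample = "\n".join([
    "# File header",
    "",
    "# Greeting shown on the start page",
    "welcome=Welcome",
    "bye=Goodbye",
])
print(parse_properties(sample))
```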

Comments may also contain LingoChecks for each segment.