Hello there, thank you for your support! Please read the guidelines and requirements below carefully. If you have any questions, please contact your project manager :)

Production Guideline

1. Background knowledge

  • Query translation projects contain search queries from users of one target sub-site of a multilingual website. For instance, queries from the AliExpress Spanish website come from Spanish-speaking users by default, so the targeting language is Spanish.
  • Search queries usually consist of product names (maxi dress), brand names (Zara) or broader product categories (construction tools, camping, dog grooming). The purpose of each query is to find a product.

2. Task purpose

Receive high-quality search query translations to improve the conversion rate from the search result list to product detail pages.

The translator is expected to provide:

  1. Language identification result
  2. Source spelling correction result
  3. Correct translation

Only native speakers of the source language should participate in this project.

3. Workflow

Search Query-Master Guidelines - Figure 1

* If the source query cannot be understood, please mark the “source language” as 0, even if the query itself is within the language scope.
If “source language” is 0, please also clear the pre-filled “target”.
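
The rule above can be sketched in a few lines of Python. This is only an illustration: the `QueryRow` record and its field names are hypothetical stand-ins, not the platform's actual data model.

```python
from dataclasses import dataclass, replace

UNKNOWN = "0"  # value the guideline uses for "query cannot be understood"


@dataclass(frozen=True)
class QueryRow:
    # Hypothetical record; the field names are illustrative only.
    source: str
    source_language: str
    target: str


def apply_unknown_rule(row: QueryRow) -> QueryRow:
    """Clear the pre-filled target when the source language is marked 0."""
    if row.source_language == UNKNOWN:
        return replace(row, target="")
    return row
```

For instance, a row whose source is the fragment коф and whose source language is marked 0 would come back with an empty target, while any other row is returned unchanged.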

4. Explanation of the fields

Input fields:

| Field | Explanation |
| --- | --- |
| My Sources | User’s search query to translate |

Output fields:

| Field | Explanation |
| --- | --- |
| Source Language | Result of machine language identification; change it based on the human-identified result |
| Corrected Source | Result of source spelling correction |
| Target | Pre-filled with the MT result; please correct it |

5. Instructions

5.1 Identifying query source language

The translator is asked to identify the source language and enter the result in the “Source Language” field.
Only 3 options are allowed for each language scope. For example, when handling queries from Russian to English:
Search Query-Master Guidelines - Figure 2

Special case handling and explanation

| Case ID | Case | srcText examples | Source Language | Comments |
| --- | --- | --- | --- | --- |
| 1 | All numbers | 123456 | English | |
| 2 | Not a complete word, just a series of syllables | коф | 0 | |
| 3 | A language other than the assigned source languages | נעליים | 0 | |
| 4 | Brand name | Samsung | English | |
| 5 | The query is a mix of languages; the translator should identify the language based on the user’s intention | е-мейл: vince@alibaba.com | Russian | |
| 6 | The query is a mix of the targeting language and English, and the English part IS a brand name; select the website targeting language as the query language type | Xiaomi рюкзак | Russian | |
| 7 | The query is a mix of the targeting language and English, and the English part IS NOT a brand name | лепин brick | Russian | |
| 8 | Transliteration | tufli lodochki dlia zhenshchin | Russian | The query is a Latinized form of туфли-лодочки для женщин (women’s pumps) |
| 9 | The query is a proper name that is valid in both EN and the source language; identify it as English | a. Tulum (Mexican place, valid in TR and EN); b. cicciobello bobo (an Italian brand and product, valid in IT and EN); c. Max Verstappen (a Dutch racing driver, valid in NL and EN) | EN | |

Special notes:
If the current query may belong to several languages at once, for instance Spanish and Portuguese, or Spanish (Spain) and Spanish (South America), select the language type according to the website information (srcLang). For example, if the provided queries come from the Spain website, select Spanish (Spain) as the query language type.

5.2 Correcting source text

The following issues are regarded as “correctable source errors”:

  • spelling issues
  • spacing issues

There is no need to correct “agreement”, “capitalization” and “punctuation” issues, as these kinds of issues have NO impact on search results.

Example: levis\ levis\ en-US
In this case, the source query “levis\” needs to be changed to “levis”, since search results for “levis\” and “levis” are different.

NOTES:

  1. To check the spelling of foreign names, places, etc., please refer to authoritative online resources.
  2. If the source text contains abbreviations or acronyms whose meaning is unclear without further context, mark “0” in the “source language” field.
  3. Cacography should be treated as an ordinary spelling issue.

5.3 Correcting the MT translation

What translation error types need fixing?

To answer that question, one must understand how the search engine works.
First, when a user types in a query in a language other than English, the query is translated into English via MT, and the translation is then matched against product keywords, titles or specifics. Products whose keywords, titles or specifics match the translation of the query are displayed to the user.

Take AE (AliExpress) as an example:
Search Query-Master Guidelines - Figure 3

Second, the search engine does not look at the syntax of the original query; all it needs is the semantics.
Hence, the “need fix” query translation error types are mistranslation, addition, omission and untranslated, all of which refer to translation errors in meaningful words. All the fluency error types (word order, word form, function words, capitalization) can be ignored, as these errors do not influence search results. You can try entering some queries with fluency errors to see this for yourself.
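
As a rough illustration of this split, the sketch below maps a set of error-type labels to a needFix value. It assumes the errors in a translation have already been labeled by a human; the function name and label spellings are hypothetical, and the preposition exceptions described in the notes of this section are not modeled.

```python
# Error types named in the guideline.
NEED_FIX_TYPES = {"mistranslation", "addition", "omission", "untranslated"}
FLUENCY_TYPES = {"word order", "word form", "function words", "capitalization"}


def needs_fix(error_types):
    """Return "Y" if any meaning-level error is present, otherwise "N"."""
    labels = {label.strip().lower() for label in error_types}
    unknown = labels - NEED_FIX_TYPES - FLUENCY_TYPES
    if unknown:
        # Refuse to guess about labels the guideline does not define.
        raise ValueError(f"unrecognized error types: {sorted(unknown)}")
    return "Y" if labels & NEED_FIX_TYPES else "N"
```

A translation with only word order and capitalization errors would be marked "N", while one that also contains an omission would be marked "Y".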

NOTES:

  1. If a query is an abnormal input whose words do not semantically relate to each other, provide a literal translation in complete alignment with the source, regardless of the grammatical form of each word. In this case you can leave the source grammar as it is, but correct its spelling and typography.

e.g.

  • 1 8rc escala dropships -> 1 8 rc scale dropships
  • 1 ligero lámparas incandescentes colgante -> 1 light incandescent lamps pendant
  • ovall cosméticas -> Oval cosmetic
  2. The translator should be careful with prepositions. Search results for “iPhone 6S phone” are iPhone 6S phones, whereas search results for “for iphone 6 s phone” are all kinds of accessories for the iPhone 6S. When the search results for the source query and for its machine translation differ in this way, the translation should be fixed by deleting the added preposition “for”.

  3. RU names, brand names, etc. should be Latinized. For example, Имран Захаев should be translated as Imran Zakhaev.

  4. Omitting a tautology in the target translation is acceptable, and so is keeping it in the target when it is present in the source.

Ex:
Source: шокер электрический для удара током для самообороны
Target: electric shocker for self-defence
needFix: N
Explanation: Though the target translation omits для удара током (for electric impact), this does not influence search results, as “electric shocker for self-defence” already covers the meaning of the omitted part.

  5. Mistranslation of prepositions and other function words counts as a fluency error and does not need fixing.

Ex:
Source: брелок на сумку (bag keychain)
Target: keychain on bag
needFix: N
Explanation: Though it is not grammatical to say “keychain on bag”, this translation of the query leads to the same search results as “keychain for bag” or “bag keychain”, so it does not need fixing.

  6. Omission of a preposition between two nouns is acceptable and does not need fixing.

Ex:
Source: подарок маме (gift for mom)
Target: gift mom
needFix: N
Explanation: Though the preposition “for” is omitted in the target translation, search results for “gift mom” and “gift for mom” are the same, hence the provided translation does not need fixing.

BUT when a query starts with a preposition followed by a product item or a brand, omission of the preposition is not acceptable and should be fixed.

Ex:
Source: для Самсунга (for Samsung)
Target: Samsung
needFix: Y
Explanation: The target translation needs fixing, since search results for “for Samsung” are various accessories, whereas results for “Samsung” are mobile phones and tablets.

6. QA for the search query

6.1 Quality control process and requirement

  1. All queries will be checked by third-party LSPs to ensure quality.
  2. After the third-party LSP check, 5% of the delivery file will be randomly selected for checking by an Alibaba internal linguist. The error rate (problematic query translations / all checked queries) should be no more than 1%.
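
The sampling and threshold check can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual QA tooling: the function names are hypothetical, and rounding the 5% sample up to at least one query is an assumption the guideline does not specify.

```python
import math
import random


def sample_for_internal_qa(query_ids, fraction=0.05, seed=None):
    """Randomly pick a fraction of the delivered queries for the internal check.

    Assumption: the sample size is rounded up, with at least one query
    whenever the delivery is non-empty.
    """
    ids = list(query_ids)
    if not ids:
        return []
    k = max(1, math.ceil(len(ids) * fraction))
    return random.Random(seed).sample(ids, k)


def error_rate(problematic, checked):
    """Error rate = problematic query translations / all checked queries."""
    if checked <= 0:
        raise ValueError("at least one query must be checked")
    return problematic / checked


def passes_qa(problematic, checked, threshold=0.01):
    """True when the error rate is no more than the 1% threshold."""
    return error_rate(problematic, checked) <= threshold
```

Under these assumptions, a delivery of 200 queries yields a sample of 10, and a check that finds 1 problematic translation among 100 checked queries sits exactly at the 1% limit and still passes.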

6.2 Field explanation
Search Query-Master Guidelines - Figure 4
Language: enter the correct identified language
Corrected Source: enter the corrected source
New Target: enter the corrected translation
Issue type / Severity: enter the relevant values whenever a “New Target” is entered

Important:
If the “Source Language”, “Corrected Source” and original “Target” results are correct, please leave “Language”, “Corrected Source” and “New Target” blank.

Production Method

This production will be carried out online through Alibaba Translate’s platform.

Turn-Around-Time


We require a 24-hour turnaround time (TAT) for production and a 24-hour TAT for QA on business days.


Search Query-Master Guidelines - Figure 5