ZOJ Problem Set - 4022
An honorific is a title that conveys esteem or respect for position or rank when used in addressing or referring to a person. Sometimes, the term "honorific" is used in a more specific sense to refer to an honorary academic title. It is also often conflated with systems of honorific speech in linguistics, which are grammatical or morphological ways of encoding the relative social status of speakers.
In Japanese, "-san" (sometimes pronounced as "han" in Kansai dialect) is the most commonplace honorific and is a title of respect typically used between equals of any age. Although the closest analogs in English are the honorifics "Mr.", "Miss", "Ms.", or "Mrs.", "-san" is almost universally added to a person's name; "-san" can be used in formal and informal contexts and for any gender. Because it is the most common honorific, it is also the one most often used to convert common nouns into proper ones.
In Korean, "-ssi" is the most commonly used honorific among people of approximately equal speech level. It is attached at the end of the full name, such as Gim Cheolsu-ssi, or simply after the first name, Cheolsu-ssi, if the speaker is more familiar with the person. Appending "-ssi" to the surname, for instance Gim-ssi, can be quite rude, as it indicates that the speaker considers himself to be of a higher social status than the person he is speaking to.
In this problem, you are given a series of Japanese or Korean names and must add an honorific to each name. You should append "-san" to each Japanese name, and append "-ssi" to each Korean name.
In the input file, Japanese names are given in the Kunrei-shiki romanization system, and Korean names are given in the Revised Romanization system. Each name consists of two capitalized words composed of Latin letters. The first word is the surname (family name), and the second word is the forename (given name).
The test data contains 10000 randomly generated names. train.txt (can be downloaded below) is the training data for you, and its format is described in the input section. The training data has only 6000 names, covering 60% to 70% of the surnames and forenames. Please pay attention to how well your solution generalizes.
Since there may be some rare cases that are confusing even for a human, your solution only needs to achieve an accuracy of at least 99% on the test.
For the formal test, there are multiple test cases. The first line of input contains an integer \(T\) (\(T = 10000\)), indicating the number of test cases. For each test case:
The first and only line contains two words (the first character of each word is capitalized), which form either a Japanese name or a Korean name. The length of each word will not exceed 20.
For train.txt, the first line contains an integer \(T\) (\(T = 6000\)), indicating the number of training samples. For each training sample:
The first and only line contains two words (the first character of each word is capitalized), which form either a Japanese name or a Korean name, followed by the correct honorific for this sample. The length of each word (without the honorific) will not exceed 20.
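The training format described above could be parsed along the following lines. This is a minimal sketch that assumes each labeled line looks like the sample output (for example "Fuzii Mina-san"), with the honorific joined to the forename by a hyphen; the function name parse_training and the inline demo text are illustrative and not part of the problem.

```python
from typing import List, Tuple


def parse_training(text: str) -> List[Tuple[str, str, str]]:
    """Parse train.txt-style text into (surname, forename, honorific) triples.

    Assumption: each labeled line has the form "Surname Forename-honorific".
    """
    lines = text.strip().splitlines()
    count = int(lines[0])  # first line holds the number of samples
    samples = []
    for line in lines[1:count + 1]:
        surname, rest = line.split()
        # Split on the last hyphen so the honorific is separated from the forename.
        forename, honorific = rest.rsplit("-", 1)
        samples.append((surname, forename, honorific))
    return samples


# Tiny hand-made demo built from names in the sample, not the real train.txt.
demo = "3\nFuzii Mina-san\nSong Junggi-ssi\nBang Mina-ssi\n"
for sample in parse_training(demo):
    print(sample)
```

Note that "Mina" appears with both labels in the demo, which is exactly the shared-forename situation the problem warns about.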
For each test case, output the name followed by the honorific. You should append "-san" to each Japanese name, and append "-ssi" to each Korean name. Your output must be correct for at least 99% of the test cases.
Sample Input
7
Fuzii Mina
Nakamoto Yuuta
Song Junggi
Hirai Momo
Seonu Jeonga
Gong Yu
Bang Mina
Sample Output
Fuzii Mina-san
Nakamoto Yuuta-san
Song Junggi-ssi
Hirai Momo-san
Seonu Jeonga-ssi
Gong Yu-ssi
Bang Mina-ssi
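The input and output handling can be sketched separately from the classification itself. In the sketch below, solve takes a predicate that decides whether a name is Japanese; the predicate used in the demo is a hypothetical stand-in that only knows the three Japanese surnames from the sample, so it illustrates the format and nothing more.

```python
from typing import Callable


def honorific_for(is_japanese: bool) -> str:
    """Return the suffix required by the problem statement."""
    return "-san" if is_japanese else "-ssi"


def solve(text: str, is_japanese: Callable[[str, str], bool]) -> str:
    """Read T names from text and return one labeled name per line."""
    tokens = text.split()
    t = int(tokens[0])
    out = []
    for i in range(t):
        surname, forename = tokens[1 + 2 * i], tokens[2 + 2 * i]
        suffix = honorific_for(is_japanese(surname, forename))
        out.append(f"{surname} {forename}{suffix}")
    return "\n".join(out)


# Hypothetical stand-in predicate: knows only the sample's Japanese surnames.
SAMPLE_JAPANESE_SURNAMES = {"Fuzii", "Nakamoto", "Hirai"}


def sample_predicate(surname: str, forename: str) -> bool:
    return surname in SAMPLE_JAPANESE_SURNAMES


sample_input = (
    "7\nFuzii Mina\nNakamoto Yuuta\nSong Junggi\nHirai Momo\n"
    "Seonu Jeonga\nGong Yu\nBang Mina\n"
)
result = solve(sample_input, sample_predicate)
print(result)
```

In a real submission the text would come from standard input (for example sys.stdin.read()) and the predicate would be replaced by a model built from train.txt.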
Both the training data and the test data are generated as follows: for each case, a fair coin is tossed to decide whether it is a Japanese name or a Korean name. If it is a Japanese name, then a surname and a forename are picked randomly from the Japanese surname list and the Japanese forename list respectively; if it is a Korean name, then a surname and a forename are picked randomly from the Korean surname list and the Korean forename list respectively.
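The generation procedure just described can be imitated locally, for example to build extra validation sets from the names found in train.txt. This is a sketch under assumptions: the four lists below are tiny placeholders taken from the sample, the function name generate_names is illustrative, and uniform sampling within each list is assumed because the statement only says names are "picked randomly".

```python
import random
from typing import List, Sequence, Tuple


def generate_names(
    n: int,
    jp_surnames: Sequence[str],
    jp_forenames: Sequence[str],
    kr_surnames: Sequence[str],
    kr_forenames: Sequence[str],
    seed: int = 0,
) -> List[Tuple[str, str, str]]:
    """Generate n (surname, forename, label) triples with a fair coin per name."""
    rng = random.Random(seed)
    names = []
    for _ in range(n):
        if rng.random() < 0.5:  # fair coin: Japanese
            names.append((rng.choice(jp_surnames), rng.choice(jp_forenames), "san"))
        else:  # Korean
            names.append((rng.choice(kr_surnames), rng.choice(kr_forenames), "ssi"))
    return names


# Placeholder lists taken from the sample; the real lists are much larger.
jp_s, jp_f = ["Fuzii", "Nakamoto", "Hirai"], ["Mina", "Yuuta", "Momo"]
kr_s, kr_f = ["Song", "Seonu", "Gong", "Bang"], ["Junggi", "Jeonga", "Yu", "Mina"]

generated = generate_names(10, jp_s, jp_f, kr_s, kr_f)
for surname, forename, label in generated:
    print(f"{surname} {forename}-{label}")
```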
In the final test data, the Japanese surname and forename lists contain about 480 surnames and 2800 forenames respectively, and the Korean surname and forename lists contain about 200 surnames and 2800 forenames respectively.
In the downloadable training data, each Japanese or Korean surname or forename list is a random subset of the corresponding list in the final test data.
There are no common surnames between the Japanese surname list and the Korean surname list, so a generated Japanese name and a generated Korean name will never be the same. However, there are a very small number of common forenames between the Japanese forename list and the Korean forename list, so please be careful with these cases.
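Because the two surname lists are disjoint, a surname seen in the training data fixes the label, and some generalizing model is only needed for unseen surnames. The sketch below is one possible way to combine the two ideas, not the author's intended solution: it looks up the surname first and otherwise falls back to a character-bigram naive Bayes score with add-one smoothing. The class name NameClassifier and the bigram fallback are assumptions, and no accuracy is claimed for it.

```python
import math
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Sample = Tuple[str, str, str]  # (surname, forename, label), label is "san" or "ssi"


def bigrams(word: str) -> List[str]:
    """Character bigrams of a lowercased word with start/end markers."""
    padded = "^" + word.lower() + "$"
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


class NameClassifier:
    def __init__(self, samples: Iterable[Sample]) -> None:
        self.surname_label: Dict[str, str] = {}
        self.counts = {"san": Counter(), "ssi": Counter()}
        for surname, forename, label in samples:
            self.surname_label[surname] = label  # surnames never overlap
            for word in (surname, forename):
                self.counts[label].update(bigrams(word))
        self.totals = {lab: sum(c.values()) for lab, c in self.counts.items()}
        # Vocabulary size for add-one smoothing (+1 for unseen bigrams).
        self.vocab = len(set(self.counts["san"]) | set(self.counts["ssi"])) + 1

    def _score(self, label: str, words: Tuple[str, str]) -> float:
        """Smoothed log-likelihood of the name's bigrams under one label."""
        counts, total = self.counts[label], self.totals[label]
        return sum(
            math.log((counts[bg] + 1) / (total + self.vocab))
            for word in words
            for bg in bigrams(word)
        )

    def predict(self, surname: str, forename: str) -> str:
        known = self.surname_label.get(surname)
        if known is not None:
            return known  # exact surname match decides the label
        words = (surname, forename)
        if self._score("san", words) >= self._score("ssi", words):
            return "san"
        return "ssi"


# Demo trained on the seven sample names only; real use would load train.txt.
training = [
    ("Fuzii", "Mina", "san"), ("Nakamoto", "Yuuta", "san"),
    ("Song", "Junggi", "ssi"), ("Hirai", "Momo", "san"),
    ("Seonu", "Jeonga", "ssi"), ("Gong", "Yu", "ssi"), ("Bang", "Mina", "ssi"),
]
clf = NameClassifier(training)
print(clf.predict("Bang", "Momo"))  # known Korean surname, shared-looking forename
print(clf.predict("Fuzii", "Yu"))   # known Japanese surname
```

Checking the surname first means that a forename shared by both languages, such as "Mina" in the sample, cannot flip the label whenever the surname is known.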
Right-click the link and select "Save target as" or "Save link as" to download the file. If the file name looks strange, rename it to train.txt.
Author: ZHOU, Yuchen
Source: The 18th Zhejiang University Programming Contest Sponsored by TuSimple