People who don’t think in computer languages rarely need to parse the everyday language around them thoroughly – usually a broad understanding is enough – a lot like maths, really. I still remember taking my Linux laptop into a cafe and being unable to connect to their free wifi, despite having been given the password: “trotters cafe”. I watched the exchange between my wireless adapter and the modem, read the logs, and saw that I was unable to connect because the password was actually a passphrase. The software that drove my wireless adapter would stop reading the passphrase at the end of the first word.
As far as the developers of the wireless adapter’s software (the “driver”) were concerned, a password was a type of word, and traditionally words are demarcated by spaces; the developers of the modem’s software had been somewhat looser in their use of language – although I would suggest this was lazy rather than deliberate. I can imagine a dev thinking “it does ask for a password”. Finally, the cafe was using two words as its password (hence, a passphrase) – what do they care, so long as it works (…for most people, i.e. Windows and OSX users)?
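I never saw the driver’s source, so the following is only a guess at the kind of parsing that produces this behaviour. Both function names are made up for illustration: the first treats the credential as a single whitespace-delimited word, the second keeps the whole line.

```python
def read_password_naive(raw: str) -> str:
    """Hypothetical parser that treats the credential as a single word."""
    tokens = raw.split()  # splits on any run of whitespace
    return tokens[0] if tokens else ""


def read_passphrase(raw: str) -> str:
    """Keeps internal spaces; only strips the trailing line ending."""
    return raw.rstrip("\r\n")


print(read_password_naive("trotters cafe"))  # trotters
print(read_passphrase("trotters cafe\n"))    # trotters cafe
```

The naive version hands “trotters” to the network, which is exactly the kind of mismatch I saw in the logs.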
Recently there has been an issue with the introduction of the new social networking site Google+. Google is insisting that users join Google+ under their “real names”, and this has led to what has become known as the Nymwars. There are plenty of compelling political reasons why this is a flawed policy. jwz, Netscape and Mozilla founder and writer of most of your screen savers, is pleasingly brief on the whole problem of real names; for the more esoteric, there is the obtuse political reasoning of the Neoist art movement of the 80s and 90s, who used multiple-use names such as Luther Blissett.
Which brings me to this interesting page on the W3C site – Personal names around the world. This W3C page sums up pretty well why the problem is so difficult from a more practical perspective:
intended audience: HTML content authors (using editors or scripting), script developers (PHP, JSP, etc.), schema developers (DTDs, XML Schema, RelaxNG, etc.), Web project managers, and anyone who is involved in the design of forms, databases and ontologies that capture people’s names.
Their background makes the most obvious point – that it’s up to the developers at the end of the day:
People who create web forms, databases, or ontologies are often unaware how different people’s names can be in other countries. They build their forms or databases in a way that assumes too much on the part of foreign users. This article will first introduce you to some of the different styles used for personal names, and then some of the possible implications for handling those on the Web.
This article doesn’t provide all the answers – the best answer will vary according to the needs of the application, and in most cases, it may be difficult to find a ‘perfect’ solution. It attempts to mostly sensitize you to some of the key issues by way of an introduction. The examples and advice shown relate mostly to Web forms and databases. Many of the concepts are, however, also worth considering for ontology design, though we won’t call out specific examples here.
There are a couple of key scenarios to consider.
1. You are designing a form in a single language (let’s assume English) that people from around the world will be filling in.
2. You are designing a form in one language but the form will be adapted to suit the cultural differences of a given locale when the site is translated.
In reality, you will probably not be able to localize for every different culture, so even if you rely on approach 2, some people will still use a form that is not intended specifically for their culture.
The obvious question this raises follows from Google’s own admission that this is all about the delivery of content (names and profiles) to clients (advertisers) in return for money. Surely Google is aware that not everyone uses English, and that their user base will span the planet? It doesn’t seem like the best policy for gathering the greatest amount of content.
In any case, here are just some of the issues that need to be taken into account when designing the name field in a software application:
In the Icelandic name Björk Guðmundsdóttir, Björk is the given name. The second part of the name indicates the father’s (or sometimes the mother’s) name, followed by -sson for a male and -sdóttir for a female, and is more of a description than a family name in the Western sense. Björk’s father, Guðmundur, was the son of Gunnar, so is known as Guðmundur Gunnarsson.
Icelanders prefer to be called by their given name (Björk), or by their full name (Björk Guðmundsdóttir). Björk wouldn’t normally expect to be called Ms. Guðmundsdóttir. Telephone directories in Iceland are sorted by given name.
In the Chinese name 毛泽东 (Mao Ze Dong) the family name is Mao, i.e. the first name when reading (left to right). The given name is Dong. The middle character, Ze, is a generational name, and is common to all his siblings (such as his brothers and sister, 毛泽民 (Mao Ze Min), 毛泽覃 (Mao Ze Tan), and 毛泽紅 (Mao Ze Hong)).
Spanish-speaking people will commonly have two family names. For example, María-Jose Carreño Quiñones may be the daughter of Antonio Carreño Rodríguez and María Quiñones Marqués.
You would refer to her as Señorita Carreño, not Señorita Quiñones.
Some names change form depending on the gender of the bearer. For example, the wife of Борис Николаевич Ельцин (Boris Nikolayevich Yeltsin) is Наина Иосифовна Ельцина (Naina Iosifovna Yeltsina) – note how the husband’s names end in consonants, while the wife’s names (even the patronymic from her father) end in a.
Americans often write their name with a middle initial, for example, John Q. Public. Often forms designed in the USA assume that this is common practice, whereas even in the UK, where people may indeed have (one or more) middle names, this is often seen as a very American approach. People in Korea, who typically do have 3 names but who don’t usually initialise them, may be confused about how to deal with such forms. Bear in mind, also, that many people who do use an initial in their name may use it at the beginning.
It would be wrong to assume that members of the same family share the same family name. There is a growing trend in the West for wives to keep their own name after marriage, but there are other cultures, such as China, where this is the normal approach. In some countries the wife may or may not take the husband’s name. If the Malay girl Zaiton married a man named Isa, she may remain Mrs. Zaiton, or she may choose to become Zaiton Isa, in which case you might refer to her as Mrs. Isa.
You should also not simply assume that name adoption goes from husband to wife. Sometimes men take their wife’s name on marriage. It may be better, in these cases, for a form to say ‘Previous name’ than ‘Maiden name’ or ‘née’.
The order of the parts of a name varies too. For example, Velikkakathu Sankaran Achuthanandan is a Kerala name from Southern India, usually written V. S. Achuthanandan, which follows the order familyName-fathersName-givenName.
In Vietnam, names such as Nguyễn Tấn Dũng follow the order familyName-middleName-givenName. Although this seems similar to the Chinese example above, even in a formal situation this Prime Minister of Vietnam is referred to using his given name, i.e. Mr. Dũng, not Mr. Nguyễn.
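To make the point concrete, here is a rough sketch of why “honorific plus last word” cannot be hard-coded. It covers only three of the examples above, and the `convention` labels and the function name are my own invention, not anything from the W3C article. Real names within each culture vary far more than three rules can capture.

```python
from typing import Sequence


def formal_address(parts: Sequence[str], convention: str, honorific: str = "") -> str:
    """Return a form of address for a name given as parts in written order.

    `convention` is a hypothetical label invented for this sketch.
    """
    if convention == "icelandic":
        # The patronymic is not a family name: use the full name, no honorific.
        return " ".join(parts)
    if convention == "spanish":
        # Two family names come last; the first of the two is the one used.
        return f"{honorific} {parts[-2]}".strip()
    if convention == "vietnamese":
        # The family name is written first, but the given name (last) is used.
        return f"{honorific} {parts[-1]}".strip()
    raise ValueError(f"no rule for convention {convention!r}")


print(formal_address(["María-Jose", "Carreño", "Quiñones"], "spanish", "Señorita"))
print(formal_address(["Nguyễn", "Tấn", "Dũng"], "vietnamese", "Mr."))
```

Each rule picks a different position in the name, which is why a single “surname” column cannot drive the salutation on its own.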
Ideographic characters in Japanese names can typically be pronounced in more than one way. In some cases this makes it difficult for people to know exactly how to pronounce a name, and also causes problems for automatic sorting and retrieval of names, which is typically done on the basis of how the name is pronounced. For example, the family name of 東海林賢蔵 (i.e. the first three ideographic characters on the left) may be transcribed or pronounced as either Tōkairin or Shōji.
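One common way around this, sketched here under my own assumptions, is to store the reading alongside the written form and sort on the reading. The class and field names are hypothetical, and I use the romanised readings from the example above; a real system would more likely store kana and use locale-aware collation rather than plain string comparison.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class FamilyName:
    written: str  # the ideographic form, as the person writes it
    reading: str  # how this particular person pronounces it


def sort_by_reading(names: Iterable[FamilyName]) -> List[FamilyName]:
    # Simplistic: compares case-folded strings, not a proper collation.
    return sorted(names, key=lambda n: n.reading.casefold())


records = [
    FamilyName("東海林", "Tōkairin"),
    FamilyName("東海林", "Shōji"),
]
for name in sort_by_reading(records):
    print(name.reading, name.written)
```

The two records are identical in their written form, so only the stored reading lets them sort into different places.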
The article goes on to note the implications for field design: whether or not to split names into separate fields, strategies for splitting them, the handling of hyphenation and other punctuation, and localization of the database at the back end. It even goes as far as sorting and honorifics. It’s a comprehensive overview of the issues that face developers when constructing software that requires names.
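As a minimal sketch of the “don’t split” option, assuming the application never needs the parts separately, a record might look something like this. The class and field names are mine, not the W3C’s; `previous_name` follows the earlier suggestion to avoid ‘Maiden name’.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NameRecord:
    full_name: str                       # stored exactly as the person writes it
    call_me: Optional[str] = None        # what they want to be called, if given
    previous_name: Optional[str] = None  # rather than 'maiden name'

    def __post_init__(self) -> None:
        # Only check for emptiness; no assumptions about spaces, scripts or order.
        self.full_name = self.full_name.strip()
        if not self.full_name:
            raise ValueError("full_name must not be empty")

    def greeting_name(self) -> str:
        # Fall back to the full name rather than guessing at a 'first name'.
        return self.call_me or self.full_name


print(NameRecord("Björk Guðmundsdóttir", call_me="Björk").greeting_name())
print(NameRecord("John Q. Public").greeting_name())
```

The design choice is to ask people what they want to be called instead of deriving it from the position of words in their name.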