Each computer attached to the Net is assigned a unique Internet Protocol number, a difficult-to-recall string of up to 12 digits. The IP address helps ensure that Web information reaches its intended destination. Since names or phrases are easier to remember, Web scientists came up with the domain name system, where an IP address is matched to a unique name, such as upi.com. Special computers, called DNS servers, keep lists of the names and their corresponding numbers, making the task of Web surfing a lot easier.
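As a rough illustration of the list a DNS server keeps, the Python sketch below pairs names with numbers in a small table. It is a toy, not how real DNS software is built: the table, the `resolve` helper and the addresses are all invented for this example, and the addresses come from a block reserved for documentation.

```python
# Toy stand-in for the list a DNS server keeps: names on the left,
# numeric addresses on the right. The addresses are made up and come
# from the 192.0.2.0/24 block reserved for documentation.
NAME_TABLE = {
    "upi.com": "192.0.2.10",
    "example.org": "192.0.2.20",
}


def resolve(name):
    """Return the address recorded for name, or None if it is not listed."""
    return NAME_TABLE.get(name)


print(resolve("upi.com"))       # the address stored for upi.com in the toy table
print(resolve("nowhere.test"))  # None, because the name is not in the table
```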
John Klensin, AT&T's vice president of Internet architecture and a longtime Internet scientist, spoke about DNS at a New York meeting with AT&T Bell Labs specialists. He said he has spent a "frightening" amount of time the past couple of years considering the system's international implications.
"The latest manifestation of this has been the worldwide panic over the lack of ability to write domain names in other than Roman-based characters," Klensin said. "We somehow slipped over a boundary between these (names) as simple identifiers and as words and mnemonics and trademarks." The difficulty lies in creating a set of matching rules to ensure a certain string of letters translates or "maps" to only one IP address, regardless of case. Early computer navigation systems were case sensitive, which led to people ending up at unintended destinations. Today, whether people type yahoo.com, Yahoo.com or YAHOO.COM, they still get to the search engine site. But even that system encounters problems before leaving English, Klensin said. "You have people wanting British and American spellings to match, people who think it's ridiculous to have to remember whether 'joe's-pizza.com' is spelled with or without a hyphen," he said. "As we move towards other character sets, these problems turn from very irritating and confusing to total breakdown."
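The case-blind matching described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ascii_fold` and `same_name` helpers and the pizza and spelling names are hypothetical, and the rule shown folds only the letters A to Z, which is roughly what a matching rule limited to English letters amounts to.

```python
def ascii_fold(name):
    """Lowercase only the letters A-Z, leaving every other character alone."""
    return "".join(
        chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch
        for ch in name
    )


def same_name(a, b):
    """Treat two names as the same if they differ only in ASCII letter case."""
    return ascii_fold(a) == ascii_fold(b)


print(same_name("yahoo.com", "YAHOO.COM"))            # True
print(same_name("joes-pizza.com", "joespizza.com"))   # False
print(same_name("colour.example", "color.example"))   # False
```

The first pair matches because only case differs. The hyphen and the British and American spellings do not, which is the kind of mismatch Klensin describes.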
In French, for example, an accented 'a' loses the accent mark when it is uppercase -- except in Quebecois French. In the caseless Web, this less-than-absolute rule means that if someone typed a domain name with an accented 'a', the DNS would have no way to decide whether the French-Canadian site or the European French site was the intended destination. There also are cases where geographically adjacent languages, such as Swedish and German, have dissimilar written characters that mean the same thing. Then there are the Asian languages, where even attempts at a common character set yield different meanings in each country. Again, the worldwide DNS has no way of accommodating these circumstances. "This complex, expansive set of problems isn't the identifier problem we were dealing with 20 years ago," Klensin said. "It's language and culture, things about which people are extremely sensitive and go to war over." Other countries are concerned that their populations do not speak English, Klensin said. They are painfully aware that educating everyone in a second language is nearly impossible, particularly where literacy in the mother tongue is low.
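The accent problem can be made concrete with a short Python sketch. The two uppercasing helpers and the two sample names are invented for illustration, and each helper is a simplified stand-in for one of the conventions described above rather than a real rule from any standard. Under the convention that drops accents, two different lowercase names collapse into one uppercase form; under the one that keeps them, they stay apart.

```python
import unicodedata

PLAIN = "la.example"          # hypothetical name without an accent
ACCENTED = "l\u00e0.example"  # the same name with an accented 'a'


def strip_accents(text):
    """Remove combining accent marks after splitting letters from their accents."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def upper_dropping_accents(text):
    """Stand-in for a convention where capitals lose their accent marks."""
    return strip_accents(text).upper()


def upper_keeping_accents(text):
    """Stand-in for a convention where capitals keep their accent marks."""
    return unicodedata.normalize("NFC", text).upper()


# Dropping accents makes the two names indistinguishable in uppercase.
print(upper_dropping_accents(PLAIN) == upper_dropping_accents(ACCENTED))  # True
# Keeping accents leaves them distinct.
print(upper_keeping_accents(PLAIN) == upper_keeping_accents(ACCENTED))    # False
```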
Solving this dilemma will mean changing how we describe the names of things on the Web, he said. In doing so, computer scientists will have to alter the interface humans use to communicate with their PCs. The Internet Engineering Task Force is working on the problem, but Klensin said the group might come up with only a part of the final solution or solve the wrong problem.
"The solution will require more rules about how you use language, which might be incomprehensible and insulting to most of the users of those languages," Klensin said. "We're going to end up with more search-based technologies ... in ways which put some of these choices back in the hands of human beings."
For example, navigating the Web could become a series of steps quite similar to using a search engine. Type a domain name in any language and the computer would find possible matches and ask the user if that is what he really wanted, he said. The idea of browsing through links of information is not going to go away, however.
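A search-style lookup of this kind might look something like the Python sketch below. It is a guess at the general shape, not a description of anything Klensin or the IETF proposed: the list of known names, the `suggest` helper and the similarity cutoff are assumptions, and the matching relies on the standard library's `difflib.get_close_matches`.

```python
import difflib

# Hypothetical registry of known names, invented for this example.
KNOWN_NAMES = [
    "joes-pizza.example",
    "joespizza.example",
    "upi.example",
]


def suggest(typed, limit=3):
    """Return up to `limit` known names that resemble what the user typed."""
    # Lowercasing first means differences in case never count against a match.
    return difflib.get_close_matches(
        typed.lower(), KNOWN_NAMES, n=limit, cutoff=0.8
    )


# The user would be shown the candidates and asked which one was meant.
for candidate in suggest("JoesPizza.example"):
    print("Did you mean", candidate + "?")
```

Here the choice between the hyphenated and unhyphenated names is handed back to the person typing instead of being settled by a fixed matching rule.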