Tuesday, 30 November 2010

Truly international Python

Let's recall Guido's old Computer Programming for Everybody (CP4E) proposal.

Now that Python is established, it's high time to push it into education, especially as a first programming language. I think that in the modern world this means pre-school.

Most of the world's children don't learn English before school, so we need a truly localized Python.

Some might recall a Python derivative demo with Unicode variable names (link anyone?).
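For reference, Python 3 itself accepts non-ASCII identifiers (PEP 3131), so user-defined names can already be localized while keywords stay English. A minimal sketch; the function and variable names are made up for illustration:

```python
# Python 3 accepts non-ASCII identifiers (PEP 3131); keywords remain English.
# All names below are hypothetical examples, not part of any library.
def площадь(ширина, высота):
    """Return the area of a rectangle (names are Russian for area, width, height)."""
    return ширина * высота

版本 = (1, 2, 3)  # "version" in Chinese, used as an ordinary variable name

print(площадь(3, 4))
```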

I think we ought to go further. For example, consider an imaginary language, Pig Latin:

"""This does that""" --> """Thiso acto thato""" # docstrings
__version__ = (1,2,3) --> __versio__ = (1,2,3) # variable names
import time --> importo chrono # standard module names
def foo(): pass --> defo foo(): passo # Python keywords
"foo".upper() --> "foo".uppero() # standard library
raise Xx("undefined") --> raisio Xx("indifinito") # errors
#!/usr/bin/python --> #!/usr/bin/pythono # executable name
#!/usr/bin/python --> #!/usero/binaro/pythono # name and path

Of course there are concerns for many languages:

  • Each language needs to establish stable translations for keywords, basic types, standard modules, methods in standard modules, etc.
  • Some languages don't natively separate words with spaces
  • Some languages have different punctuation rules, e.g. a comma as the decimal separator
  • Some languages use different quotation marks
  • RTL languages write words right-to-left, yet some (all?) write numbers left-to-right
  • Hopefully no one has to recreate a 10,000-based digit separator system ;-)

Anyhow, supporting particular languages is not the job of core Python. What is needed is:

  • the concept that this is needed, and
  • a base from which a particular localization can evolve
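One way such a base could look is a thin translation layer in front of the normal compiler. The sketch below is only an illustration under assumptions: the `PIG_LATIN` table and the `to_python` helper are hypothetical, and it handles only keywords and module names from the Pig Latin example, not method names, docstrings or error messages. It uses the standard `tokenize` module, so words inside string literals are left alone.

```python
import io
import tokenize

# Hypothetical translation table for the made-up "Pig Latin" dialect above.
# A real localization would need a stable, agreed-upon table per language.
PIG_LATIN = {
    "defo": "def",
    "passo": "pass",
    "importo": "import",
    "raisio": "raise",
    "chrono": "time",
}

def to_python(source, table=PIG_LATIN):
    """Translate localized NAME tokens back into standard Python source."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    result = []
    for tok in tokens:
        if tok.type == tokenize.NAME and tok.string in table:
            # Replace a localized word with its standard Python spelling.
            result.append((tokenize.NAME, table[tok.string]))
        else:
            # Strings, numbers, operators and unknown names pass through.
            result.append((tok.type, tok.string))
    # With (type, string) pairs, untokenize does not preserve the original
    # spacing, but the resulting source is still valid Python.
    return tokenize.untokenize(result)

localized = "importo chrono\ndefo foo(): passo\n"
namespace = {}
exec(compile(to_python(localized), "<pig latin>", "exec"), namespace)
```

After this runs, `namespace` holds the standard `time` module and a function `foo`, just as if the English source had been executed. A per-language table plus a loader like this is roughly the kind of hook a localization could build on; error messages and tracebacks would still come out in English.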

Here is a fun example of what Python might look like in Google-Translate Simplified Chinese. Blame Google, not me, as I know very little about this language.

"""This does that""" --> """这是""" # docstrings
__version__ = (1,2,3) --> __版本__ = (1,2,3) # variable names
import time --> 进口 时间 # standard module names
def foo(): pass --> 业 美孚(): 通过 # Python keywords
"foo".upper() --> “富” 上层() # standard library
raise Xx("undefined") --> 提高。二十(“未定义”) # errors
#!/usr/bin/python --> #!/usr/bin/蛇 # executable name
#!/usr/bin/python --> #!/用户/二进制/蛇 # name and path

I'm tracking this here and will update the post with the feedback I receive:



Artem Goutsoul said...

Hmm... I think kids should first learn English, and then a computer language. Or they could do it in parallel. They'll need English anyway in the modern world, definitely more likely than any particular computer language.

However, I agree, the problem of localizing computer languages is fascinating :)

Dima Tisnek said...

Carl Johnson and Stephen Turnbull responded with the old consensus: "keep syntax in English, allow user-defined names in Unicode".

Dima Tisnek said...

Alexander Belopolsky commented that learners prefer a short unfamiliar syntax to a long familiar one. I think this is quite valid.

He also offered an example of a fully localized but infinitely ugly language, RAPIRA.