The .NET Addict's Blog

Human Language Grammar vs. Computer Language Parsing... Oranges and Oranges?

posted Fri 04 Aug 06

I'm sure there are more grammar types, but of the human languages that I've encountered, there are two main kinds of word order: SOV and SVO, or Subject-Object-Verb and Subject-Verb-Object. I'm about to start drawing so many parallels and analogies that it's going to hurt, so brace yourself.

You can think of SOV as the Postfix notation, or Reverse Polish Notation (RPN), of the human language world, and SVO as the standard Infix notation that we're all familiar with. In mathematics, Postfix notation supplies the operands first and the operator last, as in 2 2 +. Infix notation puts the operator between the operands, as in 2 + 2. This is where I'm going to start flinging some serious analogies.
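For the programmers in the audience, here is a rough sketch of how an Infix expression can be rearranged into Postfix. It uses Dijkstra's shunting-yard algorithm, which is my choice for illustration and not something the notation itself requires. The function name infix_to_postfix and the PRECEDENCE table are made up for this example, and it assumes tokens are separated by spaces.

```python
# Operator precedence; all four operators are treated as left-associative.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def infix_to_postfix(expression):
    """Convert space-separated infix tokens to a postfix (RPN) string."""
    output = []    # operands and operators in their final postfix order
    pending = []   # operators and "(" still waiting for their operands
    for token in expression.split():
        if token in PRECEDENCE:
            # Flush operators that bind at least as tightly as this one.
            while (pending and pending[-1] != "("
                   and PRECEDENCE[pending[-1]] >= PRECEDENCE[token]):
                output.append(pending.pop())
            pending.append(token)
        elif token == "(":
            pending.append(token)
        elif token == ")":
            # Flush everything back to the matching "(".
            while pending and pending[-1] != "(":
                output.append(pending.pop())
            if not pending:
                raise ValueError("unbalanced parentheses")
            pending.pop()  # discard the "(" itself
        else:
            output.append(token)  # operands go straight to the output
    while pending:
        op = pending.pop()
        if op == "(":
            raise ValueError("unbalanced parentheses")
        output.append(op)
    return " ".join(output)


print(infix_to_postfix("2 + 2"))          # 2 2 +
print(infix_to_postfix("( 2 + 3 ) * 4"))  # 2 3 + 4 *
```

Notice that the parentheses in the second expression vanish: in Postfix, the order of the tokens alone carries the grouping.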

Assume that the mathematical operators like + are, in fact, verbs. Then assume that the operands are subjects, objects, particles, and the other connective tissue that typically makes up a standard communicative sentence. I think this may be why I dislike the English language so much. I've always felt that it was cluttered, confusing, disorganized, and just plain hard to parse. With SOV languages (Postfix), you get all the details you need up front, and the verb, which you can often infer or assume anyway, trails along like a minor detail.

If you were interrogating someone, you might ask questions in the following order:

Who is involved?
What was it done to?
What happened?

Postfix SOV sentences give you the answer to those questions in that order. From the Wikipedia entry on RPN:

  • Calculations proceed from left to right.
  • There are no brackets or parentheses, as they are unnecessary.
  • Operands precede their operator. They are removed as the operation is evaluated.
  • When an operation is performed, the result becomes an operand itself (for later operators).
  • There is no hidden state: no need to wonder if an operator was entered or not.
  • For a computer, it is easier and faster to evaluate mathematical expressions written in RPN than mathematical expressions written in infix notation
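To make those bullet points concrete, here is a minimal sketch of an RPN evaluator built on a stack. The names eval_rpn and OPS are invented for this example, and it assumes space-separated tokens and only the four basic arithmetic operators.

```python
import operator

# Map each operator token to a two-argument function.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_rpn(expression):
    """Evaluate a space-separated RPN expression such as "2 2 +"."""
    stack = []
    for token in expression.split():   # left to right, no parentheses
        if token in OPS:
            if len(stack) < 2:
                raise ValueError(f"not enough operands for {token!r}")
            right = stack.pop()        # operands are removed as they are used
            left = stack.pop()
            # The result goes back on the stack as an operand for later operators.
            stack.append(OPS[token](left, right))
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


print(eval_rpn("2 2 +"))      # 4.0
print(eval_rpn("2 3 + 4 *"))  # 20.0
```

The whole thing is one pass over the tokens with one stack, which is roughly what the "easier and faster for a computer" claim comes down to.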

Now, take all of these computer advantages and run them by someone who thinks logically (like a programmer). I personally find it far more efficient to communicate in SOV languages than I do in SVO languages. Could the same linear efficiency that makes Postfix notation easier for computers to use also make it easier for the human brain to parse and respond to information? Constructing a reply to a question formed in Postfix notation is, in my opinion, much easier and faster to accomplish in an SOV language.

Let's take a look at a couple sample sentences:

I want beer -  Subject = I, Verb = want, Object = beer.
Yo quiero cerveza - Spanish, Subject = I/Yo, Verb = want/quiero, Object = beer/cerveza

And now some Postfix/SOV sentences (disclaimer: I am fluent in neither of these languages, so there are probably errors below):

main biiyur chaahiye - I beer want (Hindi)
watashi wa biiru ga hoshii desu - I beer want (Japanese)
私は ビール が ほしい です

Note how the most important information after the Subject, BEER, comes right at the beginning of the sentence. The verb relating to beer (want) comes at the end of the sentence, where I think it belongs.

I'm not trying to say that any one language is better or worse than any other. The point I wanted to make with this blog post is that, not only am I a lifeless geek, but I find the human brain fascinating, and I find it interesting how some human languages read more like easy-to-parse computer instructions and other human languages read like something a computer would definitely not like. Another thought I had - do people who grow up speaking more parse-friendly (logical) languages think more logically, or is that complete garbage?

Anyway, draw your own conclusions ... I'm off to parse...er...read...some Japanese ;)





1. kalpna left...
Mon 21 Aug 06 2:55 pm

good article, kevin. It is "mujhe beer chahiye"(hindi) :) Kalpna


2. Tim Bedford left...
Tue 08 Jan 08 9:44 am

Interesting. I've always considered SVO to be the more logical ordering. Subject performs an action on an article. I think the main advantage of postfix notation is the ease of parsing by machine, particularly that no parentheses are required.

An area that I would expect to be interesting is the use of language in the legal profession. Here I would expect legal clauses to be more compact in SOV languages than in SVO ones. In English it is notably hard to write concise logic statements.