“How many languages do you know?” is a question often asked of linguists. “Know” is a terribly imprecise word, which could potentially cover any state of knowledge from “vaguely aware of” to “native speaker”. And even in the latter case there is a wide variety of knowledge. I am a native speaker of English and an avid reader, yet I was taken aback a few years ago to see the word “shroff” in a parking garage in China. It looked like it could be English. It certainly wasn’t Mandarin Chinese. And yet it seemed like a nonsense word to me. The online Oxford English Dictionary claims to have entries for 600,000 English words, which is probably more than any one person can make use of in a lifetime, let alone use precisely according to the given set of definitions. What is of interest to most linguists is not so much the specific rules of any one language or set of languages, but rather the relationships, commonalities, and differences between the various languages, and what they tell us about the nature of language.

I haven’t gone around claiming to be a computer programmer long enough to know if “How many programming languages do you know?” is a question commonly asked of programmers. I suspect it is asked much less of programmers than of linguists, although programming languages are comparatively simple. I suppose it might be asked of computer science professors or designers of new programming languages, but for the most part we expect computer programmers to know one, two, maybe three programming languages well, and that’s about it. Likewise, we don’t expect the average non-linguist human to know more than three languages well.

Natural Language Paradigms

Anyone who learns another language finds that different languages have different ways of expressing the same or similar ideas. It’s not just a case of learning a different set of vocabulary words for the same concepts. For a native speaker of English, the word order of the clause “I read the book” might seem perfectly logical. “I” comes first because I am the one performing the action, then “read” because it’s the action, and lastly “the book” because it is the object that is being acted upon. Yet such an argument would be tautological. Why does the subject need to go at the front of the clause, the object at the end, and the verb in between? Why can’t I say “read I the book”, or “I the book read”, or “the book read I”?

As you learn more about different languages, you learn that different languages don’t just have different vocabularies and different phonemes: they also have different preferences for word order, different ways to express familial relationships, different kinds of pronouns, different verb tenses, and so on. The more linguists have studied the various languages of the world, the smaller the set of “universal” features has become. However, with all languages there appears to be a tension between brevity and precision, with different languages handling different concepts differently. A concept that might be simple to express in Language A might require more complex phrasing in Language B, whereas for a different concept the reverse might be true.

Language Categories

The natural languages of the world are divided into “families”, “branches”, and individual “languages” (which might be further divided into “dialects” or “varieties”, “registers” and “jargons”, and even “idiolects”). Most of these divisions are based on a careful analysis of similar (and possibly related) languages. In the field of linguistics this type of analysis is known as comparative linguistics, which is a specialized branch of historical linguistics. However, there are ways to classify languages other than supposed historical relatedness. Word order is one way. Another common way is by morphology. There are different ways to classify languages by morphology, but I learned three general morphological classifications: analytical, inflected, and agglutinative.

Analytical languages (which include Modern English, Mandarin Chinese, and Hmong) tend to rely on word order to express the different grammatical functions of a sentence. Inflected languages (which include Latin, Sanskrit, German, and Old English) rely more on inflections (usually suffixes) to differentiate grammatical functions, making word order more flexible than in analytical languages. Agglutinative languages (which include Japanese, Korean, Mongolian, and various Turkic and Native American languages) also use affixes (again often suffixes) to differentiate grammatical functions. The difference between inflection and agglutination is that an inflection is an affix that represents several concepts at once, whereas in agglutination each affix represents a single concept. For example, the Latin verb suffix –ō includes the concepts of “first person” and “singular”, whereas in an agglutinating language the two concepts would be represented by two distinct affixes. These categories are helpful to demonstrate the different grammatical possibilities available to natural languages, but many languages don’t wholly fit in one category. Modern English, for example, although usually regarded as an analytical language, retains some vestiges of inflection.

Programming Language Paradigms

One purpose of some (although perhaps not all) university computer science programs is to expose students to different programming paradigms. Not being a computer science major, I never had that benefit until recently. I knew that goto was considered harmful. I knew that it was considered a good idea to break your code up into functions. I knew that object-oriented programming (whatever it was) was the future. And I knew that learning Lisp would make me a better programmer. However, I didn’t have a good understanding of the differences between the paradigms. None of the languages I had actually done any programming in (QBasic, System/370 Assembler, C, C++, C Shell, KornShell, Python, Ruby, and Falcon) required me to understand any programming paradigms except the procedural paradigm (and perhaps the unstructured paradigm). As I surveyed the current state of programming, I paid special attention to some of the up-and-coming languages that seemed to be at a similar level of development and popularity as Python and Ruby were when I first played with them around the turn of the millennium. I noticed that some of them (most notably Scala) were “functional” programming languages, and also that “functional programming” was invading established programming languages such as Java and JavaScript. It seemed to me that perhaps functional programming (whatever it was) was now the future.

So I embarked on a journey of discovery. Beginning about a year ago, I started studying object-oriented programming at a local community college, first in C++ and then in Java. Last summer I joined a local CodeNewbie meetup shortly before it went defunct. I had read one or two articles about the use of FizzBuzz to test for basic programming skills in job interviews, and I decided to see if I could meet that challenge, at first in a few languages I was already familiar with, such as Python, Ruby, and the POSIX Shell; and then in some newer languages I was interested in, such as Go, Scala, Julia, and Chapel. After becoming comfortable with a few procedural algorithms, I began to leverage that knowledge to explore other languages. With Scala I ran into a brick wall. It was simple enough for me to write procedural code in Scala, but I knew there was another way, and I only had a slight inkling of what that way might look like. So I decided to learn Lisp–at least well enough to solve FizzBuzz. Then came Racket, Erlang, Prolog, and Standard ML, after which I had the knowledge to write Scala in a more functional style. My FizzBuzz menagerie has grown to include numerous programs in twenty-seven different languages, and I still have languages in the queue (Pyret and maybe Elm), as well as others I feel I ought to tackle eventually (Elixir, Haskell, Mercury, R, Rebol, Forth, and APL). However, I can’t say I’m unsatisfied with the size of my collection, and it’s high time I focused on depth rather than breadth. But before I move on entirely, I would like to take some time to reflect upon what I’ve learned.
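To give a rough idea of the difference between a procedural FizzBuzz and a more functional one, here is a minimal sketch in Python. It is not one of the programs from the collection described above, and the function names (fizzbuzz_procedural, fizzbuzz_word, fizzbuzz_functional) are invented purely for illustration.

```python
def fizzbuzz_procedural(limit):
    """Procedural style: a loop, branching statements, and a list that is mutated."""
    lines = []
    for n in range(1, limit + 1):
        if n % 15 == 0:
            lines.append("FizzBuzz")
        elif n % 3 == 0:
            lines.append("Fizz")
        elif n % 5 == 0:
            lines.append("Buzz")
        else:
            lines.append(str(n))
    return lines


def fizzbuzz_word(n):
    """Pure function: maps one number to its FizzBuzz word, with no side effects."""
    word = ("Fizz" if n % 3 == 0 else "") + ("Buzz" if n % 5 == 0 else "")
    # An empty string is falsy, so fall back to the number itself.
    return word or str(n)


def fizzbuzz_functional(limit):
    """Functional style: map a pure function over a range instead of looping."""
    return list(map(fizzbuzz_word, range(1, limit + 1)))


if __name__ == "__main__":
    print("\n".join(fizzbuzz_functional(15)))
```

Both versions produce the same list of lines. The procedural one spells out each step and updates a list as it goes, while the functional one describes the result as a transformation applied to every number in a range.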

Language Categories

Just as with natural languages, programming languages can be divided into innumerable categories. For some reason I like to divide them into binary categories. The first categories are unstructured and structured. Unstructured languages include most assembly languages, some forms of BASIC, and possibly others: basically any language where the use of goto is common practice. In structured programming, programs are divided into smaller blocks of code. I divide structured programming languages into imperative and declarative languages. Without defining the two categories in precise detail, imperative languages tend to describe how to accomplish a task using control flow statements, whereas declarative languages tend to describe what task to accomplish without specifying how it is to be accomplished. The imperative category may be subdivided further into procedural and object-oriented languages, and the declarative category may be subdivided into functional and logic languages. Well-known procedural programming languages include ALGOL, C, Pascal, and Ada. Object-oriented programming languages include Simula, Smalltalk, C++, Java, C#, Python, and Ruby. Functional programming languages include Lisp, Scheme, Clojure, Standard ML, OCaml, Erlang, Scala, and Haskell. Logic programming languages include Prolog and Mercury. In fact, many programming languages are “multi-paradigm”. That is, you can write code using different paradigms in the same language. However, as far as I am aware, all languages have a preferred paradigm, with the exception of Oz, which was specifically designed to teach different paradigms. Feel free to quibble over my categorization of programming paradigms, or my placement of certain languages in a particular paradigm. Categorization is not an exact science, as far as I am aware.
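As a small illustration of what “multi-paradigm” can mean in practice, the sketch below solves one task (summing the squares of the even numbers in a list) three ways in Python. The task and all of the names in it (sum_even_squares_procedural, EvenSquareAccumulator, and so on) are assumptions chosen for the example, not anything prescribed by the categories above.

```python
from functools import reduce


def sum_even_squares_procedural(numbers):
    """Procedural: say how to do it, step by step, updating a running total."""
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total += n * n
    return total


class EvenSquareAccumulator:
    """Object-oriented: the running total lives inside an object as state."""

    def __init__(self):
        self.total = 0

    def add(self, n):
        # The object decides for itself whether the number counts.
        if n % 2 == 0:
            self.total += n * n


def sum_even_squares_object_oriented(numbers):
    accumulator = EvenSquareAccumulator()
    for n in numbers:
        accumulator.add(n)
    return accumulator.total


def sum_even_squares_functional(numbers):
    """Functional: compose filter, map, and reduce without mutating anything."""
    evens = filter(lambda n: n % 2 == 0, numbers)
    squares = map(lambda n: n * n, evens)
    return reduce(lambda a, b: a + b, squares, 0)


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    print(sum_even_squares_procedural(data))
    print(sum_even_squares_object_oriented(data))
    print(sum_even_squares_functional(data))
```

All three functions return the same number for the same input. The first two are imperative in the sense used above, since they spell out the control flow, while the third leans toward the declarative side by describing the result as a chain of transformations.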

In future posts I intend to go more in depth into the various programming paradigms, with examples in actual code, as well as to delve into differences and similarities between various languages that may exist outside of the usual paradigms.

