I’m trying to find the existing research on the topic Paul Graham discusses in this article (regarding the relative merits of programming languages in footnote 3 and surrounding text) and which EY touches on here (regarding Turing tarpits).
Basically, within the realm of Turing-complete languages, there are significant differences in how easy it is to write a program that implements specific functionality. That is, if you want to write a program that takes a bunch of integers and returns the sum of their squares, it’s possible to do it in machine code, assembly, Brainfuck, BASIC, C, Java, Python, or Lisp, but it’s much easier (and more concise, intuitive, etc.) in some than in others.
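For what it’s worth, here is that toy program in Python; the same computation would take dozens of instructions in assembly, which is exactly the kind of gap I’d like to see quantified:

```python
# Sum of squares: a one-liner in Python, dozens of instructions in
# assembly, an unreadable blob in Brainfuck -- the same function, with
# very different ease of expression.
def sum_of_squares(xs):
    return sum(x * x for x in xs)
```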
What’s more, Graham speculates there’s a ranking of languages in which programmers too comfortable in one language “don’t get” the usefulness of the features in languages above it. So BASIC addicts might not appreciate what they could do with recursion or structured programming (i.e., by abolishing goto statements); C addicts might not appreciate what they could do with functions as first-class objects; and Python addicts might not appreciate what they could do with Lisp macros.
I’m interested in research that formalizes (or deconstructs) this intuition and attempts to come up with a procedure for comparing programming languages in this respect. The concept is similar to “expressive power”, but that’s not exactly the same thing: all of these languages can express the same programs, but some can span a broader array of programs using fewer symbols (because their constructs combine and accumulate meaning faster).
Also, the theory of Kolmogorov complexity holds that “when compressing data, the choice of language doesn’t matter except to the extent that it adds a data-independent constant equal to the length of a program that converts between the two languages”; however, this still allows that some programs (before applying the converter) are necessarily longer, and I’ve heard of results showing that some Turing-complete formalisms require an exponential blow-up in program size over, e.g., C. (This is the case with Wolfram’s tiny Turing-complete language in A New Kind of Science.)
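That “data-independent constant” claim is the invariance theorem; in the usual notation, with U and V two universal description languages and K_U(x) the length of the shortest U-program that outputs x, it reads:

```latex
% c_{UV} is roughly the length of a V-interpreter for U; it depends on
% the two languages but not on the data x.
K_V(x) \le K_U(x) + c_{UV} \quad \text{for all strings } x .
```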
tl;dr: I want to know what research is out there (or how to find it) on rigorously evaluating programming languages with respect to the relative ease and brevity of writing programs, and in what senses Python is better than, e.g., assembly.
If “enlightening people about better programming languages” ever becomes a higher priority than “enlightening people about the superior status of X-language users”, I think a good strategy would be to explain those possible insights in the simplest possible form, without asking people to learn the guru’s favorite programming language first.
For example, to show people the benefits of recursion, I would have to find a nice example where recursion is the best way to solve the given problem; but the problem also should not feel like an artificial one created merely for the sake of demonstrating the usefulness of recursion.
I can use recursion to calculate the factorial of N… but I can use a for loop to achieve the same effect (without risking stack overflow). Actually, if my math happens to be limited to 64-bit integers, I could just use a precomputed table of the first few results, because the factorials of greater numbers would overflow anyway. This is why using factorial as an example of the usefulness of recursion is not very convincing. A good example would be, e.g., recursively parsing and then evaluating a mathematical expression. The idea is that even if some concept is powerful, your demonstration of it may be wrong. But providing no demonstration, or a very esoteric one, is also wrong.
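To make that concrete, here is a minimal sketch of such a recursive-descent evaluator in Python (the grammar and function names are my own invention; a real parser would add error handling):

```python
import re

def tokenize(text):
    # Integers, operators, and parentheses; whitespace is skipped.
    return re.findall(r"\d+|[-+*/()]", text)

def evaluate(tokens):
    """Recursive-descent evaluation of:
       expr   -> term  {(+|-) term}
       term   -> factor {(*|/) factor}
       factor -> number | '(' expr ')'
    The recursion in factor() -> expr() handles arbitrary nesting."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        if peek() == "(":
            take()              # consume '('
            value = expr()      # recurse into the sub-expression
            take()              # consume ')'
            return value
        return int(take())

    def term():
        value = factor()
        while peek() in ("*", "/"):
            op = take()
            value = value * factor() if op == "*" else value // factor()
        return value

    def expr():
        value = term()
        while peek() in ("+", "-"):
            op = take()
            value = value + term() if op == "+" else value - term()
        return value

    return expr()
```

Note that the iterative versions (the while loops) handle left-associative chains, while recursion handles the nesting that a loop cannot naturally express.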
Alternatively, we could hold annual competitions between proponents of different programming languages. For example, they would have to program a Minesweeper game and would be rated on (a) the speed of finishing the product and (b) the elegance of their code.
At this point, proponents of the X language may object that coding Minesweeper is not a fair way of comparing languages, because their language is better for some other tasks. But this argument cuts both ways. If my goal is to write a Minesweeper game, then perhaps Paul Graham’s (or anyone else’s) opinion about the best programming language is not relevant to me, here and now. Perhaps call me again when my ambitions change. Or tell me in advance what kinds of projects I should do in the X language, and what kinds of projects will work rather easily in any language.
This.
Have you dug around the publications of Alan Kay’s Viewpoints Research Institute? They’re trying to really push the envelope with the expressive power of highly customized programming languages.
ETA: A LtU discussion thread about a recent VPRI progress report.
I haven’t heard of any studies in that direction, although a few people try to find something, like “how long are the programs, comparatively?” etc., similar to this quickly googled IEEE paper.
I assume that because
programming languages are used by humans, and
most of a programming language’s quality is based upon its actual effects on humans in the target group, and
for real work we value long-term effects
such a study is infeasible, i.e., too much effort and too expensive. And it’s probably not that much related to programming-language features (as most popular languages converge to some degree, especially on the most important axes). Also, “fewer symbols/lexemes/special-purpose constructs” is an advantage only under certain conditions, meaning the question asked may very well already determine the answer.
That’s the short version. The full paper is here. I found it while looking for a similar comparison that I remembered seeing mentioned several times when I had been interested in Common Lisp and it turned out to be a follow-up to that. Oh, and those things actually looked at time spent programming, so they didn’t measure only silly things like program length.
Why is program length a silly thing?
It was excessive to call it “silly”, but program length still seems a very imperfect way to measure the ease of writing programs in a given language. It would be better to directly measure things like programming time or number of bugs.
It’s silly when you’re measuring it in “lines of code”, because “line” is a somewhat arbitrary construct, for which “chunks of text delimited by newlines” is a worse approximation than most people think. (Quick proof: in many languages, stripping out all the newlines yields an equivalent program, so that all programs are effectively one-liners.)
Then it’s a good thing I didn’t measure it that way, or use that term anywhere in this thread! Whenever I did refer to measures of program length, it was with constructions such as:
some can span a broader array of programs using fewer symbols (due to how they combine and accumulate meaning faster)
Thanks for digging that up! Also, I did not want to imply that only silly things are measured, but rather that the most interesting questions are still unanswered due to various constraints.
Do you mean only programming languages or programming languages plus the commonly available standard libraries? Some programming languages are very simple and powerful (LISP, Haskell), and some provide a large standard library and other tools to make it easy and straightforward to get things done in specific problem spaces (MATLAB, VB).
The most powerful and concise language I can imagine would be the language of set theory with a magic automated solver: define the domain of a problem space and a predicate for solutions, and the result is the set of solutions. A standard library built on such a language would consist mostly of useful definitions for converting real-world problems into mathematics and back again.
I think most programming languages try to approximate this to some degree, but the programmer is needed to fill in the algorithmic steps between the problem space and the solution space. The fewer explicit steps the programmer has to specify to write a function, the more concise and powerful the language. The fewer definitions and functions necessary to convert real-world input into a mathematical model and back to real-world output, the more concise and powerful the standard library.
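As a toy illustration (entirely my own sketch), Python’s set comprehensions already are this “magic solver” for finite, enumerable domains: you supply a domain and a predicate, and you get back the set of solutions. The magic part that’s missing is handling infinite or astronomically large domains without brute force:

```python
def solve(domain, predicate):
    """Return the set of all elements of `domain` satisfying `predicate`.
    Brute force: only works for finite, enumerable domains."""
    return {x for x in domain if predicate(x)}

# Pythagorean triples with legs up to 20, found purely by specification:
triples = solve(
    ((a, b, c) for a in range(1, 21)
               for b in range(a, 21)
               for c in range(b, 30)),
    lambda t: t[0] ** 2 + t[1] ** 2 == t[2] ** 2,
)
```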
Graham addresses the point you’re making about the difference between a language being powerful and it having a large standard library. As he says in the link, by his metric two languages are “at the same level” if they differ only by one of them lacking a library function that the other has. His test for showing that language A is “above” B is the question: “To do a task in B, do you have to write an interpreter for A (or a language on its level)?” For example, adding recursion to BASIC requires going beyond writing one more library function.
After thinking about this for a (long) while, I think the most powerful languages would then be the ones that are internally extensible and allow modification of their own syntax and semantics. For instance, BASIC would become a far more powerful language than C or LISP if it had a function for altering the interpreter itself. Certain versions of BASIC effectively had this by allowing functions to be written in machine language and then executed. Theoretically, the machine-language snippets could extend the interpreter to add structured-programming features or first-class functions. The same could potentially be done with C by including the Tiny C Compiler in the source (as both compiled functions and C strings containing the source) and then reflexively recompiling the compiler to add features.
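Here is that idea in miniature (a hedged sketch; all names are invented for illustration): a tiny prefix-notation interpreter that exposes its own dispatch table, so a “program” can install new primitives and thereby extend the language from within:

```python
def make_interpreter():
    # The interpreter's entire semantics lives in this dispatch table.
    ops = {
        "+": lambda a, b: a + b,
        "*": lambda a, b: a * b,
    }

    def run(expr):
        # An expression is either a number or a tuple (op, arg, arg, ...).
        if isinstance(expr, (int, float)):
            return expr
        op, *args = expr
        return ops[op](*(run(a) for a in args))

    def extend(name, fn):
        # The language modifying itself: install a new primitive.
        ops[name] = fn

    return run, extend

run, extend = make_interpreter()
extend("sq", lambda a: a * a)   # the "program" grows the language
```

Because the whole semantics is a visible table rather than opaque hardware, extensions are safe in exactly the sense discussed below: nothing about the execution environment is hidden from the extending code.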
What is most interesting to me is that making a language that can modify itself requires a complete definition of its execution environment in order to be safely extensible. The most powerful languages would have to fully and formally define their semantics as well as their syntax, and make both accessible within the language, so that extension of the syntax could safely extend the semantics to match. In other words, BASIC with assembly language is not enough if you don’t know everything about the hardware and the interpreter you start with. From my CS-student days I seem to recall reading that few programming languages have rigorously defined semantics (“This feature is implementation-specific” and “This behavior is undefined” appear far too often in most specifications). The closest thing I can find is the Gödel machine, but that’s defined directly in terms of a Turing machine, afaict.
I don’t have any answers, but I’m also interested.