Comments | timwi: About the fixation of syntax to concepts

timwi

About the fixation of syntax to concepts

Apr 19, 2010 13:04

Coders seem to be unable to distinguish the concepts of a programming language from its syntax. One coder might argue for Python on the grounds that it has certain innovative language features, but another might “refute” this by arguing about the shortcomings of whitespace-based syntax. The two are talking about entirely separate things.

Comments 4

pne April 19 2010, 12:29:04 UTC

"I would like to see the next step in the evolution of programming to be one where everyone can use any syntax of their choosing without impacting anyone else. In this world, source code would be parsed into an abstract syntax tree, and it would be this syntax tree that would be shared between developers and checked into source control, so everyone can see everyone else’s work in one’s own preferred syntax."

I would argue that this is impossible unless you only allow syntax that is isomorphic to the AST.

In general, multiple pieces of syntax can map to the same bit of code; for example, in C you might have if (boolvar) dostuff(); or if (boolvar) {dostuff();} (presence or absence of braces around a single instruction); in Perl you might have $count++ if $a > 10 or if ($a > 10) { $count++; } (postfix conditional or block conditional... with optional semicolon in the latter case), etc. etc ( ... )
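The many-to-one mapping described above can be seen with a small sketch in Python, using the standard `ast` module. The snippets and the helper name `same_ast` are illustrative assumptions, not part of the original discussion; the point is only that two surface forms of the same conditional yield one tree, so the tree alone cannot say which form the author typed.

```python
import ast


def same_ast(source_a: str, source_b: str) -> bool:
    """Return True if both snippets parse to structurally identical ASTs."""
    # ast.dump omits line/column attributes by default, so layout is ignored.
    return ast.dump(ast.parse(source_a)) == ast.dump(ast.parse(source_b))


# Two spellings of the same conditional (hypothetical names).
one_line = "if flag: do_stuff()\n"
block_form = "if (flag):\n    do_stuff()\n"
# A genuinely different program, for contrast.
different = "if not flag: do_stuff()\n"

print(same_ast(one_line, block_form))  # True
print(same_ast(one_line, different))   # False
```

A tool that regenerates source from such a tree has to pick one spelling for everyone, which is the sense in which the syntax would need to be isomorphic to the tree.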

timwi April 19 2010, 15:14:13 UTC

I agree that I have glossed over certain details. Let me just say that your example is a bad one because I can’t really imagine that many people would want to maintain a distinction between “if (boolvar) dostuff();” and “if (boolvar) {dostuff();}” - you’d generally prefer either always one or always the other. But I agree there is a more general issue. A better example might be that I might want to insert blank lines to group statements into logical blocks. With current programming languages, the AST doesn’t reflect those blank lines. Come to think of it, current ASTs don’t include comments either ( ... )
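As a rough illustration of the point about blank lines and comments, here is a sketch using Python's standard `ast` and `tokenize` modules (`ast.unparse` needs Python 3.9 or later). The sample source is made up for the example. Round-tripping through the AST drops both the grouping blank lines and the comments, while the token stream still carries the comments.

```python
import ast
import io
import tokenize

# Hypothetical source with a comment line, blank lines, and a trailing comment.
source = (
    "total = 0\n"
    "# group 1: accumulate\n"
    "\n"
    "total += 1\n"
    "\n"
    "total += 2  # trailing note\n"
)

# Round-trip through the AST: parse, then regenerate source text.
roundtrip = ast.unparse(ast.parse(source))

# The token stream, by contrast, still contains the comments.
comments = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type == tokenize.COMMENT
]

print(roundtrip)
print(comments)
```

So a shared tree format that is meant to preserve this kind of layout would have to store more than what Python's own AST keeps.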

pne April 19 2010, 15:47:53 UTC

Perhaps a better example would be "syntactic sugar" -- for example, in Perl, while (<>) { ... } is equivalent to while (defined($_ = <>)) { ... }, and in Java, java.util.Collection<Integer> varlist = ...; for (Integer var : varlist) { var = var + 1; } is short for something like java.util.Collection varlist = ...; java.util.Iterator $tmpiter = varlist.iterator(); while ($tmpiter.hasNext()) { Integer var = (Integer) $tmpiter.next(); var = new Integer(var.intValue() + 1); } Sometimes you'd want to see autoboxing and auto-unboxing; sometimes you'd like to see explicit casts resulting from generics; sometimes you'd like to see "foreach" with explicit iterators -- but often, you might not want to see those things. (After all, part of the reason for syntactic sugar is to make code more readable by encapsulating a common idiom into a shorter form ( ... )

timwi April 19 2010, 17:40:50 UTC

The meanings of “while (<>)” and “while (defined($_ = <>))” may be the same, but the parse tree certainly isn’t. I used the term AST assuming it is the same thing as parse tree, but if the two terms are used slightly differently, maybe I should edit the post and call it parse tree. I’m thinking of the result of the very first step in a compiler, which is a simple parse according to a meaning-agnostic grammar - no substitutions of syntactic sugar or expansion of fully-qualified names. It is probably the case that many compilers do such expansions/substitutions at the same time as parsing, but there’s no reason why it can’t be separated and you could still have a parse tree that contains the import/using clauses and that distinguishes the for/foreach loop from its expanded iterator pattern. There is no “automatic retranslation”, most certainly not from “object code” (compiled binaries).
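The distinction drawn here, same meaning but different tree, can be sketched in Python as an analogue of the Perl and Java examples. The two snippets and the helper `top_level_node_types` are assumptions made for illustration. A `for` loop and its hand-expanded iterator form behave alike, yet the parser records them as different node types, so a stored tree does distinguish the sugar from its expansion.

```python
import ast

# The sugared loop (hypothetical names).
sugared = "for item in items:\n    use(item)\n"

# Roughly what the loop means, written out with an explicit iterator.
expanded = (
    "it = iter(items)\n"
    "while True:\n"
    "    try:\n"
    "        item = next(it)\n"
    "    except StopIteration:\n"
    "        break\n"
    "    use(item)\n"
)


def top_level_node_types(source: str) -> list:
    """Return the AST node type names of the top-level statements."""
    return [type(node).__name__ for node in ast.parse(source).body]


print(top_level_node_types(sugared))   # ['For']
print(top_level_node_types(expanded))  # ['Assign', 'While']
```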