Sunday, June 13, 2010

Children of Prolog

Recently, I have become interested in functional programming, largely because of the remarkable success of Erlang in producing robust web applications. Erlang seems to have 'solved' concurrent programming, which is one of the growing challenges in programming - especially as we move into a post-Web 2.0 world. Erlang was initially implemented in Prolog.
Today, I listened to Joseph M. Hellerstein's keynote address at ACM PODS 2010, The Declarative Imperative: Experiences and Conjectures in Distributed Logic. The slides are also available. He uses Dedalus, a derivative of Datalog. Datalog is, syntactically, a subset of Prolog. Hellerstein also addresses the concurrency conundrum, a.k.a. the Programming Crisis, by retelling children's classics:

“The sky is falling! The sky is falling! Computers won’t get any faster unless programmers learn to write parallel code!” squawked Chicken Licken.

Henny Penny clucked in agreement: “Worse, there is Cloud Computing on the horizon, and it requires programmers to write parallel AND distributed code!”

This is more fun than your average academic talk. Of course, he then goes on to provide his solution to the programming crisis. He shows how Datalog has been used to solve a series of practical problems in computer science.
The solutions Hellerstein expounds rely upon declarative programming in Datalog variants. He describes practical code for distributed crawlers, networking protocols, query optimizers, distributed file systems and a map-reduce scheduler. What is as impressive as the breadth of topics is his claim that his declarative solutions have code bases that are orders of magnitude smaller than existing, imperative solutions.
His code samples describe solving problems with a time dependence that evolves in steps, like a turn-based game. This was a little disconcerting to a physicist, since we like to think about time as continuous; still, it was interesting to find a computer scientist shift from 'practical' solutions to a rather abstract discussion of the meaning of time and causality. At this point, I am starting to think about the InfoQ talk by Rich Hickey on Persistent Data Structures and Managed References.
From a Prolog beginning, Ericsson developed Erlang. Erlang is having a renaissance that is leading to high-concurrency solutions for web programming. Next, we have Hellerstein using Datalog to address the programming crisis. Where are the imperative, OO solutions? Declarative logic & functional programming - are these the tools that will lead us out of the programming crisis?

Saturday, June 5, 2010

Trends in programming and Amdahl's Law

In my previous blog post, I explained trends I saw that will affect programmers. The main takeaway is that we are moving into a world of concurrent programming. If we do concurrent programming with memory shared by several threads of execution, life is going to be miserable for programmers. Fortunately, message passing via actors is a viable path to concurrency that is much easier to program.
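To make 'message passing via actors' concrete, here is a minimal Erlang sketch (Erlang because it features throughout these posts; the counter module is my own toy example, not taken from any of the talks cited here). The counter's state is owned by a single process, and other processes can only reach it by sending messages, so no locks are needed:

    -module(counter).
    -export([start/0, increment/1, value/1]).

    %% The counter's state lives only inside its own process;
    %% other processes interact with it purely by message passing.
    start() ->
        spawn(fun() -> loop(0) end).

    increment(Pid) ->
        Pid ! increment,
        ok.

    value(Pid) ->
        Pid ! {value, self()},
        receive
            {count, N} -> N
        end.

    loop(Count) ->
        receive
            increment ->
                loop(Count + 1);
            {value, From} ->
                From ! {count, Count},
                loop(Count)
        end.

Because only the counter process ever touches Count, any number of client processes can call counter:increment/1 concurrently without coordinating through shared memory.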
In any discussion of concurrency, the iron law is Amdahl's Law. If you have a process in which some fraction, s, of the steps are sequential and the rest, 1-s, can be done in parallel, then the most you can gain from concurrency over N processors is to reduce the time to execute from 1 to s + (1-s)/N. So basically, the time to execute a program is limited by s as much as by N. The hardware determines N and the software determines s. So what determines s? Basically, s is the fraction of the algorithm that you spend queueing stuff. If I go to the grocery store, I can shop in parallel with everyone else. It takes a little longer to shop in a busy store, but only a little longer. But when it is time to check out, I have to queue up to pay.
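To put numbers on that formula, here is a tiny Erlang helper (the module is mine, written just for this post) that computes the maximum speedup 1/(s + (1-s)/N):

    -module(amdahl).
    -export([speedup/2]).

    %% Maximum speedup for a program whose sequential fraction is S,
    %% run on N cores: the inverse of the reduced execution time.
    speedup(S, N) ->
        1 / (S + (1 - S) / N).

For example, amdahl:speedup(0.1, 16) returns 6.4, and even with unlimited cores a 10% sequential fraction caps the speedup at 10.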
What is exciting is that techniques like 'no shared state' and actors for communication can greatly reduce s, so we can write software that efficiently takes advantage of concurrent hardware (i.e., big N). If we look at a standard sort of web application, the bottlenecks are things like writing to the database. The first thing to do is to replace a single server with a cluster of identical servers, so that a given client can access any one of the servers and get the job done.
But there turns out to be a better plan - sharding. In sharding, you still have a cluster of N servers, but the servers no longer hold identical data. If I have a large database, I would still create N servers, but I would shard the data so that any given server holds only a fraction of it. This is much like partitioning data on a RAID array, and the more sophisticated designs take 'k-safety' into account. Reed-Solomon erasure codes exhibit this behavior, as do Tornado codes: the data is divided into n+k partitions so that any set of n partitions can reconstruct the entire data set, which means you can lose k partitions before you lose any data. This idea is central to VoltDB, but they implement it by keeping k+1 copies of each block. It seems that a more efficient design could be built using erasure codes (see exercise 11.9 in David MacKay's Information Theory, Inference, and Learning Algorithms).
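As a rough sketch of replication-style k-safety (this is my own illustration using a simple hash-and-successors placement rule, not VoltDB's actual scheme), here is how one might pick the k+1 servers that hold copies of a key:

    -module(shard).
    -export([owner/2, replicas/3]).

    %% owner(Key, N): hash Key onto one of N shards (numbered 0..N-1).
    owner(Key, N) ->
        erlang:phash2(Key, N).

    %% replicas(Key, N, K): the K+1 shards holding copies of Key,
    %% namely the primary shard and its K successors around the ring.
    replicas(Key, N, K) ->
        Primary = owner(Key, N),
        [(Primary + I) rem N || I <- lists:seq(0, K)].

With N = 8 and K = 2, every key lives on three shards, so any two shards can fail before data is lost; an erasure-coded layout would buy the same k-safety with less than (k+1)-fold storage overhead.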


Wednesday, June 2, 2010

Trends in programming

There is a dizzying wealth of new options for web programmers. I am trying to understand why there are so many changes and which of those changes I should be choosing to learn. I have found that the answer to this sort of question often comes from looking backward and trying to judge current trends in an historical context.
Hardware: Multicore is the future
This is an inevitable trend, as carefully explained in The Future of Computing by Anant Agarwal to MIT freshmen. The power consumption of a CPU scales as the cube of the voltage, so trying to improve performance by brute force is a losing game. Instead, it is much better to build smaller, lower-power cores. Prof. Agarwal is among those laying the foundation for a future of RISC cores arranged in a tiled multicore architecture. As a programmer, you can get much better performance if you can leverage concurrency. We can leave the hardware issues to the likes of MIT, but the programming challenges will affect everyone who programs.
To better understand hardware and how it affects software, listen to A Crash Course in Modern Hardware by Cliff Click of Azul. Azul has systems with 864 processor cores, so they are on the front lines of this revolution. He explains how modern computers, with caches and pipelining, are not particularly well described by the von Neumann architecture. The entire talk seems like an explanation of how hardware engineers have been addressing the von Neumann bottleneck between CPU and memory. Even here, we find that each physical core is running concurrent processes.
So, we are seeing horizontal scaling of computers in clusters, with each computer having more cores and with each core handling more concurrent tasks. I can spot a trend - concurrent processing is the future. But I HATE threading and locks.
Functional Programming and Concurrency
Erlang was developed for high reliability in Ericsson telephone switches. Joe Armstrong describes how they tried to develop Systems that Never Stop and why that led to Message Passing Concurrency in Erlang. Armstrong came up with a list of six laws for reliable systems (a small Erlang sketch of the first three follows the list). The laws/requirements are:
  1. isolation (the probabilities of isolated systems failing are independent, so processes need to be isolated)
  2. concurrency (how many concurrent processes can you support? You need at least two systems to make a non-stop system, and even a two-computer system is concurrent, so you can't avoid concurrency)
  3. failure detection (if you can't detect a failure, you can't fix it, and the thing that crashed can't be the thing that detects the failure)
  4. fault identification (You need to figure out why there was a failure if you want to fix it.)
  5. live code upgrade (systems that should never stop need live code upgrade)
  6. stable storage (keeping multiple distributed copies means no backups are needed; must keep crash logs)
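Here is the small sketch promised above, covering the first three laws (the watchdog module and its output format are my own, not Armstrong's code): a worker runs in its own isolated process, and a separate process detects its failure through a monitor instead of sharing any state with it.

    -module(watchdog).
    -export([run/1]).

    %% run(Fun): execute Fun in an isolated, concurrent process and
    %% report how it ended. The monitor delivers a 'DOWN' message whether
    %% the worker exits normally or crashes, so failure detection lives
    %% outside the process that failed.
    run(Fun) ->
        {Pid, Ref} = spawn_monitor(Fun),
        receive
            {'DOWN', Ref, process, Pid, Reason} ->
                io:format("worker ~p terminated: ~p~n", [Pid, Reason])
        end.

Calling watchdog:run(fun() -> 1/0 end) reports the badarith crash without taking the watchdog down; restarting the failed worker is exactly the job that OTP supervisors automate in real Erlang systems.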
Erlang is a functional language, so there are no side effects to calculations; Joe Armstrong discusses the importance of functional programming in Functions + Messages + Concurrency = Erlang. There really is no point in arguing about whether he is right - Erlang is the proof. It is reliable: Ericsson's AXD 301 switch has two million lines of code, and BT operates these switches at 'nine-nines' reliability, which allows for only about 32 ms of downtime each year; BT has run switches for six years without stopping. Erlang also allows for high concurrency, as clearly illustrated in Apache vs. Yaws: while Apache and Yaws have similar throughput for scores of concurrent clients, Apache bogs down at around 4 thousand parallel sessions, whereas the Erlang-powered Yaws can serve over 80 thousand parallel sessions on Linux with similar hardware. Apache is a well-written server - Netcraft shows that Apache was easily the #1 server on the World Wide Web in May 2010 - but Yaws beat it handily in this measure. In RESTful Services with Erlang and Yaws, Steve Vinoski examines Yaws in more detail.
But perhaps Erlang is hard to use and expensive to code? According to Ulf Wiger, Ericsson recorded a Four-fold Increase in Productivity and Quality on the AXD 301 project. Joe Armstrong is very optimistic about the future of Erlang, and I think he is right.
When I looked into this some more, I realized that Erlang was not the 'bolt from the blue' that it first seemed to me. Some of the greatest minds in programming have noticed that imperative programming has been at something of a standstill. Read John Backus's 1977 ACM Turing Award lecture, Can Programming Be Liberated from the von Neumann Style? A Functional Style and Its Algebra of Programs. This broadside attack on imperative programming is all the more remarkable because it comes from the author of FORTRAN. He goes on to explain what he sees as the advantages of functional programming. This must have been a controversial paper; Edsger Dijkstra's critique of Backus's lecture ended with:
In short, the article is a progress report on a valid effort but suffers badly from aggressive overselling of its significance, long before convincing results have been reached.
I tend to believe both Backus and Dijkstra. I resolve any apparent conflict by admitting that Backus was ahead of his time.