Monday, September 10, 2007

Evolving Beyond One-Dimensional Programming; Why Foo-C is not a Parallel Programming Environment

In my last blog entry, I introduced the spreadsheet parallel programming model that I will be evangelizing for the next several years. The goal of this entry is to contrast the spreadsheet programming model with existing attempts at C-like parallel programming environments. I argue that one-dimensional von Neumann languages are insufficient for expressing the objectives of a parallel program and that any "parallel programming" extension to C is a temporary solution to a bigger problem.

A programming objective is most generally one of two things: a functional procedure or a state modifying atomic action. In hardware, these notions respectively correspond to resolving a combinatorial circuit and latching a register. In software descriptions these correspond to evaluating a pure-function and assigning the result to the left-hand-side variable. Often these concepts are not separated in the mind of the sequential imperative programmer; each piece of code is thought of as a sequence of instructions that is executed as it is read. This programming model matches the hardware most software engineers have used for the past 60 years and so it is the dominant paradigm.

In the sequential school of thought, functions are just subroutines which return a value. Since most languages do not strictly isolate state assignment from functional value determination, the order in which functional computations and assignments are performed matters to the correctness of the program. Thus a program is generally some contortion within a language to achieve either a functional or atomic-action objective. In this sense, the programmer is devising not a solution to a problem, but an expression of the solution in some language.

Software programmers rarely create code the way we think about the problem, but rather we solve the problem in the manner that the language provides. Computer programming has predominantly been the engineering practice of projecting combinatorial functions or atomic actions into a one-dimensional execution order. The work of a parallelizing compiler may be considered a process of "re-functionalizing" the one-dimensional projection produced by the programmer. Unfortunately, sequential imperative control flow confounds a parallelizing compiler and introduces a bottleneck which cannot easily be anti-projected from one-dimension to many. Speculation seems to be the mode for improving the concurrency of our code, though parallelism is not achieved through brute-force speculation alone since speculation is an inefficient parallelization primitive when we explore too many unlikely paths.

Herein lies the key difference between software and reconfigurable hardware programming. Software is a control-centric model and reconfigurable computing is a data-centric model. Any sort of programmatic control flow in reconfigurable hardware must be explicitly managed by state-machines, whereas control flow in an instruction stream is implicit (next instruction) or managed with branching. The C programmer mangles what could have been a multiplexor into control flow case statements. An if statement only evaluates one path after the predicate resolves while a multiplexer resolves all inputs simultaneously while the predicate is being determined. Expressing this sort of parallelism is impossible in C since simultaneity is not a component of the language.

Ultimately the problem of a parallelizing compiler resolving C-like control flow is this: when do we want our conditional statements to be an "if" and when do we want them to be a muliplexer? By writing code in a von Neumann language we are forcing one-dimensionality in our code and explicitly sequentializing what could otherwise be a parallel expression.


Anonymous said...


I am new to the topics at hand hence the following questions will probably seem basic.

a. How does developing in a spreadsheet environment helps the user besides giving him a more parallel oriented view?

b. How do you see scheduling for the different cells being set in the spreadsheet framework?


Amir said...

a) a simple parallel mindset is the whole point! A spreadsheet is a better reconfigurable computing array than any FPGA hardware. My goal is to bridge that gap. Parallel programming is hard because people don't think of dynamic dataflow as a spreadsheet. Issues like vectorization, partitioning and place and route are obviously expressed in a spreadsheet and not so in a sequence of instructions. Things like non-blocking and blocking assignments which are so often confused by Verilog developers are easier to understand and visualize in a spreadsheet. Once I "got it" I started writing better RTL Verilog.

b) In my thesis I implemented scheduling in a "schedule sheet" which lists ranges of cells, their activity lists and the mode of interpretation (blocking, non-blocking, asynchronous, reading order execution). Efficiently partitioning and synchronizing distributed global memory among co-processors is a "hard problem."

Anonymous said...

Is it comparable to a data flow design tool such as Simulink (which can also generate HDL code)?

If so what are it's advantages?

Amir said...

Thanks for your comments!

From a programming perspective, dataflow is dataflow anyway you cut it. From a tool perspective it is easier to produce the block-oriented design motif of Simulink in a spreadsheet than it is to achieve the array-oriented design motif of a spreadsheet in Simulink. That is to say that Excel is more Simulink-like than Simulink is Excel-like.

Anonymous said...

I really like when people are expressing their opinion and thought. So I like the way you are writing