Tuesday, August 22, 2006

some ramblings about stuff

Starting to think out loud:

Taking advantage of parallelism is a necessary step for the advance of computing. Chip multiprocessors (CMPs) and FPGAs enable concurrent computation paradigms. CMPs allow for thread-level parallelism (TLP), where each core has its own instruction cache holding its current working set of instructions. Reconfigurable datapath arrays allow networks of fixed functions to be interconnected through routers to take advantage of instruction-level parallelism (ILP). FPGAs offer an even finer granularity of control, allowing a network of reconfigurable functions that often enables bit-level parallelism (BLP) in addition to ILP and TLP.
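
To make those granularities concrete, here is a minimal Python sketch (the data and function names are my own, purely illustrative) that counts the set bits in a block of 64-bit words three ways: one bit at a time, bit-parallel within each word as an FPGA datapath might, and thread-parallel across workers as a CMP would.

    # Illustrative sketch: the same reduction exploited at different granularities.
    from concurrent.futures import ThreadPoolExecutor

    WORDS = [0x0123456789ABCDEF] * 4096  # hypothetical input: 64-bit words

    def popcount_serial(words):
        # No parallelism: examine one bit at a time.
        total = 0
        for w in words:
            for i in range(64):
                total += (w >> i) & 1
        return total

    def popcount_blp(words):
        # Bit-level parallelism: all 64 bits of a word are reduced "at once"
        # (an FPGA would use a tree of adders; bin().count() stands in here).
        return sum(bin(w).count("1") for w in words)

    def popcount_tlp(words, threads=4):
        # Thread-level parallelism: split the words across the cores of a CMP.
        chunk = len(words) // threads
        parts = [words[i * chunk:(i + 1) * chunk] for i in range(threads - 1)]
        parts.append(words[(threads - 1) * chunk:])
        with ThreadPoolExecutor(max_workers=threads) as pool:
            return sum(pool.map(popcount_blp, parts))

    assert popcount_serial(WORDS) == popcount_blp(WORDS) == popcount_tlp(WORDS)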

The granularity of reconfigurability also has implications for data locality. If we have fine-grained control over where we may place memory and registers in our array, then we may localize our variables near the operations that use them. Since limited memory bandwidth is the root of the "von Neumann bottleneck," on-chip data locality offers a way around it.
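
A back-of-the-envelope model makes the point. Assuming some made-up throughput and bandwidth numbers, execution time is bounded by whichever is slower, the arithmetic or the off-chip traffic, so keeping operands on chip moves a kernel from memory-bound to compute-bound:

    # Toy bandwidth model: all constants below are hypothetical.
    PEAK_OPS_PER_S      = 100e9   # assumed compute throughput of the fabric
    OFFCHIP_BYTES_PER_S = 10e9    # assumed off-chip memory bandwidth

    def exec_time(ops, offchip_bytes):
        compute_time = ops / PEAK_OPS_PER_S
        memory_time  = offchip_bytes / OFFCHIP_BYTES_PER_S
        return max(compute_time, memory_time)

    ops = 1e9                                           # one billion operations
    no_locality   = exec_time(ops, offchip_bytes=8e9)   # every operand fetched off chip
    with_locality = exec_time(ops, offchip_bytes=8e7)   # ~99% of operands held near the ALUs

    print(no_locality, with_locality)  # memory-bound vs. compute-bound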

The cost of reconfigurability is the extra area required per operation, which implies a lower clock frequency and higher power consumption per function compared to a fixed implementation. Still, it is often impractical to have a fixed ASIC implementation for every computing function. Thus we are left to choose between a reconfigurable chip and a general-purpose CPU. A reconfigurable device can often leverage parallelism to achieve lower total execution time and lower total energy consumption than a general-purpose microprocessor.

It may not always be the case that a reconfigurable device wins over a general-purpose CPU. If a CPU is better suited to an application, it is wise to use it. A mixed-granularity structure incorporating reconfigurable logic within an array of microprocessors can maximize performance by alleviating speed issues for explicitly single-threaded applications that cannot leverage BLP, ILP, or TLP.

My goal is to create a self-optimizing operating system for such reconfigurable heterogeneous arrays. Compatibility with current computing paradigms is a primary objective, to minimize barriers to adoption. To maintain compatibility, the primary application will be a virtual machine server that manages the reconfigurable hardware. The operating system seeks to minimize some cost function while executing some set of processes. This cost function should be based on an economic model of computation concerned with metrics such as power consumption and execution time. If chip real estate can be used to save energy and time, then that real estate has an implied value.
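
A minimal sketch of what such a cost function might look like, assuming (my own choice) a weighted sum over energy, execution time, and occupied area, with the weights acting as prices in the economic model:

    # Hypothetical economic cost model for a set of scheduled processes.
    # The weights ("prices") and the Process fields are assumptions for illustration.
    from dataclasses import dataclass

    @dataclass
    class Process:
        name: str
        exec_time_s: float   # predicted execution time on its assigned resource
        avg_power_w: float   # predicted average power while running
        area_mm2: float      # reconfigurable real estate it occupies

    PRICE_PER_JOULE  = 1.0   # assumed weight on energy
    PRICE_PER_SECOND = 5.0   # assumed weight on latency
    PRICE_PER_MM2_S  = 0.1   # assumed weight on area-time (real estate has value)

    def cost(processes):
        total = 0.0
        for p in processes:
            energy = p.avg_power_w * p.exec_time_s
            total += (PRICE_PER_JOULE  * energy
                      + PRICE_PER_SECOND * p.exec_time_s
                      + PRICE_PER_MM2_S  * p.area_mm2 * p.exec_time_s)
        return total

    jobs = [Process("fft",  exec_time_s=0.002, avg_power_w=1.5, area_mm2=3.0),
            Process("gzip", exec_time_s=0.010, avg_power_w=0.8, area_mm2=1.0)]
    print(cost(jobs))  # the OS would compare this across candidate mappings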

The system is managed by a network of agents. Agents are capable of receiving and responding to signals and objects. Signals and objects are related in the same way that energy and mass are: sometimes it is useful to look at things as waves, and other times as particles. The agents communicate with one another subject to the constraints of some physical environment, just as the propagation of electromagnetic waves is constrained by the physical hardware.
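
As a sketch of the agent abstraction, assume (my invention, not a settled design) that an object is a payload delivered to one agent, a signal is broadcast to everyone else, and every delivery is mediated by an environment that stands in for the physical constraints:

    # Sketch of agents exchanging signals and objects through a constrained medium.
    # The class names and the delivery policy are hypothetical.

    class Environment:
        """Mediates all communication, standing in for the physical substrate."""
        def __init__(self):
            self.agents = []

        def register(self, agent):
            self.agents.append(agent)
            agent.env = self

        def deliver(self, sender, message, target=None):
            # A real environment would impose wire delay, contention, energy cost...
            receivers = [target] if target else [a for a in self.agents if a is not sender]
            for agent in receivers:
                agent.receive(sender, message)

    class Agent:
        def __init__(self, name):
            self.name, self.env = name, None

        def send(self, message, target=None):
            self.env.deliver(self, message, target)

        def receive(self, sender, message):
            print(f"{self.name} got {message!r} from {sender.name}")

    env = Environment()
    a, b = Agent("reconfig-agent"), Agent("scheduler-agent")
    env.register(a); env.register(b)
    a.send("load-estimate", target=b)   # an object handed to one agent
    b.send("overheat-warning")          # a signal broadcast to everyone else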

To understand the interactions of the network of agents with their environment, it is important to have a model of the environment. An environment consists of a set of state variables, a set of accessors, and a set of state evolution functions. State variables are the information we might wish to know about a system: for example, the temperature at a location, the configuration of a LUT, the contents of a register, or the capacitance of a MOSFET. These examples show that state variables exist with different scopes and in different domains.
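
In code, such an environment might look like the following sketch (the variable names and the thermal evolution function are hypothetical): state variables map names to values, accessors are the only way to touch them, and evolution functions advance the state one time step at a time.

    # Environment = state variables + accessors + state evolution functions.
    class EnvironmentModel:
        def __init__(self):
            self.state = {"temperature_c": 45.0, "lut_42_config": 0b1010,
                          "reg_7": 0, "mosfet_cap_f": 2e-15}
            self.evolution_fns = []

        # Accessors: the only way agents touch state variables.
        def read(self, name):
            return self.state[name]

        def write(self, name, value):
            self.state[name] = value

        # State evolution functions are applied at every step of simulated time.
        def add_evolution(self, fn):
            self.evolution_fns.append(fn)

        def step(self, dt):
            for fn in self.evolution_fns:
                fn(self, dt)

    def cool_toward_ambient(env, dt, ambient=25.0, rate=0.05):
        # Hypothetical thermal evolution: drift toward ambient temperature.
        t = env.read("temperature_c")
        env.write("temperature_c", t + (ambient - t) * rate * dt)

    env = EnvironmentModel()
    env.add_evolution(cool_toward_ambient)
    env.step(dt=1.0)
    print(env.read("temperature_c"))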

State variables that are static in scope are the physical constraints that may not be altered by the agents of our operating system: for example, the electromagnetic constants, the length of copper between two transistors in a circuit, the dopant density of the silicon, and so on. State variables that are constant in scope form a configuration layer; they hold their values during execution but may be rewritten by reconfiguration agents.
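
One way to encode that distinction (the scope labels are my own) is to tag each state variable with a scope and have the accessors reject writes to anything outside the configuration layer:

    # Hypothetical scope tags for state variables.
    from enum import Enum

    class Scope(Enum):
        STATIC   = "static"    # physical constraints: never writable by agents
        CONSTANT = "constant"  # configuration layer: writable by reconfiguration agents

    VARIABLES = {
        "copper_length_um": (Scope.STATIC,   120.0),
        "dopant_density":   (Scope.STATIC,   1e16),
        "lut_42_config":    (Scope.CONSTANT, 0b1010),
    }

    def reconfigure(name, value):
        scope, _ = VARIABLES[name]
        if scope is not Scope.CONSTANT:
            raise PermissionError(f"{name} is {scope.value}; agents may not alter it")
        VARIABLES[name] = (scope, value)

    reconfigure("lut_42_config", 0b0110)    # allowed: configuration layer
    # reconfigure("copper_length_um", 90)   # would raise: physical constraint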

It is generally desirable for things to behave predictably, so that we can constrain an environment model and adapt to the environment. However, this does not mean that we may assume absolute determinism; we should provide for semi-static and semi-constant variables that permit some randomness. This provides support for defective or faulty components.
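
Extending the same idea, and again only as a hypothetical sketch, a semi-constant variable could carry a small probability that a write silently fails, which is one simple way to model a defective configuration cell:

    # Sketch: semi-constant variables admit bounded randomness (e.g. faulty cells).
    import random

    class SemiConstant:
        def __init__(self, value, write_fault_prob=0.01):
            self.value = value
            self.write_fault_prob = write_fault_prob  # assumed defect rate

        def write(self, value):
            if random.random() < self.write_fault_prob:
                return False          # the write silently failed; caller must verify
            self.value = value
            return True

        def read(self):
            return self.value

    cell = SemiConstant(0b1010, write_fault_prob=0.05)
    ok = cell.write(0b0110)
    if not ok or cell.read() != 0b0110:
        pass  # the OS would retry elsewhere or mark the cell defective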

There should be processes that can simulate a particular environment to allow for behavior prediction. There could also be feedback from the physical environment to monitor and control the system.

Methods for optimizing the cost function include:

process partitioning
resource sharing and recycling
dynamic load balancing
garbage collection
power and frequency scaling
place and route
defragmentation

These processes are "computationally invariant": they alter only the cost of execution while the system's functionality remains the same.
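
As one concrete example, here is a rough sketch of dynamic load balancing (the resource pool and the policy are invented for illustration): migrating a process from the most loaded resource to the least loaded one changes the cost of execution, not what the processes compute.

    # Illustrative dynamic load balancer over a pool of resources.
    # Moving work changes cost (time, power) but not functionality.

    loads = {"core0": ["p1", "p2", "p3"], "core1": ["p4"], "fpga0": []}

    def rebalance(loads):
        busiest = max(loads, key=lambda r: len(loads[r]))
        idlest  = min(loads, key=lambda r: len(loads[r]))
        if len(loads[busiest]) - len(loads[idlest]) > 1:
            migrated = loads[busiest].pop()
            loads[idlest].append(migrated)
            return migrated, busiest, idlest
        return None

    print(rebalance(loads))  # e.g. ('p3', 'core0', 'fpga0')
    print(loads)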
