Friday, August 29, 2008

Megahard Corp: Open Source EDA as-a-Service

Prescience is a quality of those who act on it:

Building very-large-scale parallel computing structures requires simulation and design-cost analysis whose associated optimizations are MegaHard problems (yes, I'm re-branding NP-Hard). A Software-as-a-Service (SaaS) provider of vendor-neutral simulation, synthesis, mapping, and place-and-route tools could change the electronic design automation industry substantially.

If a supercomputer architecture could perform the MegaHard compiler tasks associated with compiling parallel systems in less time than the current tools (seconds instead of minutes or hours), many designers would gladly pay for access. If the supercomputer uses distributed algorithms, then many people may offer spare cycles to run parts of your MegaHard jobs; most developers using such a tool would gladly lend their idle cycles to perform graph permutations and compute cost functions.
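
To make that concrete, here is a minimal sketch (in Python, with an invented toy netlist and cost function, nothing vendor-specific) of the kind of embarrassingly parallel cost evaluation that donated idle cycles could absorb: score thousands of candidate placements independently and keep the cheapest.

```python
# Toy sketch: evaluate placement cost functions for many candidate
# permutations in parallel -- the kind of work donated idle cycles could do.
# All names here (placement_cost, random_candidate) are illustrative assumptions.
import random
from multiprocessing import Pool

NETS = [(0, 1), (1, 2), (2, 3), (0, 3)]              # toy netlist: pairs of cells
GRID = [(x, y) for x in range(2) for y in range(2)]  # 2x2 placement grid

def placement_cost(placement):
    """Total Manhattan wirelength of a candidate placement."""
    return sum(abs(placement[a][0] - placement[b][0]) +
               abs(placement[a][1] - placement[b][1])
               for a, b in NETS)

def random_candidate(seed):
    """Generate one random placement permutation and score it."""
    rng = random.Random(seed)
    perm = GRID[:]
    rng.shuffle(perm)
    return placement_cost(perm), perm

if __name__ == "__main__":
    with Pool() as pool:                             # stand-in for donated cycles
        best_cost, best_placement = min(pool.map(random_candidate, range(10000)))
    print("best wirelength:", best_cost)
```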

So let's motivate and create a business model for the tentatively named Megahard Corp.

Megahard Corp. (pronounced "Mega Hardcore") has the opposite business model of Microsoft Corp. Instead of selling closed source software and tools as licensed applications, Megahard provides open-source hardware IP and development tools as a service.

The writing is on the wall: we are transitioning from a personal-computing, single-core desktop world to a shared-computing, multicore/FPGA embedded world. Yet the complexity of parallel system designs, and the computational strain on the tools that compile them, is growing faster than the performance of a practical desktop computer. The decision for EDA firms to adopt the Application Service Provider (ASP) model probably made sense at some point in the last millennium: design assumptions are much different when you are connected to a massive parallel computer. Because current tools take much, much longer to compile a design than it takes to transfer the files, there is real potential to improve the compiler market by providing electronic design automation tools online.

So here's the business plan: build a supercomputer and code it to run a bunch of FPGA/multicore compilers and simulators really well. Since it's a supercomputer, it should make the problem I/O bound: we can tell people we've got a supercomputer that will take in source and spit out configurations in about as long as it takes you to upload the diff of the source and download the diff of the configuration. Since we have a really big supercomputer with lots of hardware all over the place, you can also target your application to some of our bored metal and provide your own service under our framework. (Sassafras is a good name for a SaaS framework and can have a leaf-shaped logo.)

Since we're thoroughly indoctrinated, we'll build our system as Free software and use all the existing open-source compiler architecture too. Being open source means our tool won't be made obsolete by an eventual open-source standard (read that again so you get it). Open-source frameworks allow capable users to contribute to product development, and wouldn't you know it, the entire EDA market happens to have degrees in computer science, electrical engineering, mathematics, and physics.

There's also that recurring comp.arch.fpga thread complaining about the current tools, the lack of open-source tools, and the lack of documentation of the configuration format and interface -- someone must appeal to these people because they have a real point: technology is inhibited by monopolized knowledge. It's harder for features to improve when they are part of closed-source packages because you are automatically limiting the number of developers who can improve them (a situation that worsens as you lay off more developers).

Another benefit of using a SaaS tool: your users don't have to waste time upgrading. The only way to convince anyone to sit through a software upgrade process is to announce new features that claim to make the product better; that's why users wait until the second iteration of a major release. SaaS providers can roll out new features and test the waters at a much higher iteration rate.

When I use the term open source, I mean "free" as in freedom, but it's important to point out that previous forays into open-source FPGA tools have failed and disappeared because they were merely "open source" and not "free." When I was just starting to care about these sorts of things, GOSPL was being hyped up and was eventually canceled. The code is nowhere to be found because access to the source was restricted by invite-only membership for some nonsense reason: membership exclusivity maintains the integrity of the organization. When a giant corporation disingenuously uses "open source" as a marketing term for non-free software, the project is destined (and deserves) to fail as long as the so-called "open" source is never actually provided to freedom-loving eyes.

On the other hand, free software like debit, or free hardware IP like OpenCores, won't stop being useful just because some organization stops funding it. Free-software projects never die; they just go unmaintained.

Besides, Megahard Corp. follows a SaaS model, so the profit comes from developing and maintaining a parallel supercomputer system to run the free compiler software, not from distributing that software. Good free-software projects encourage users to "help make it better" instead of "help make us richer," though improving the product is a good way to increase sales. A sufficiently better compilation service is probably more profitable than selling licenses, too -- I would certainly pay more than what I'm currently paying for proprietary software licenses if I could get access to a supercomputer that compiles my FPGA designs faster.

One major problem: if you aren't Altera, Xilinx, Lattice, Actel, Achronix, Tilera, IBM, AMD, Intel, etc., then making low-level multicore and FPGA tools requires a whole bunch of reverse engineering and "MegaHard Work" (TM). It's probably technically easier to design an FPGA from the bottom up than to reverse engineer the architecture of an existing chip and build a new tool-chain.

Open-source compilers and reverse-engineered, vendor-neutral FPGA tools have both been successful endeavors in the past. Providing hardware as a service still has a "cloudy" future, but there are plenty of recent examples where a vendor came into a market ripe for SaaS-ification and redefined it. I expect that the MegaHard problems associated with design automation make EDA ripe for a SaaS provider to change the game.

----
(Edit 9/2) If you find positions contrary to the Open Source EDA SaaS argument, please share them with me. Here's an old interview with Grant Martin, Chief Scientist of Tensilica, who argues that we should not hold our breath for Open Source EDA and specifically says:

I think the other issue with EDA is in terms of the general number of users. I don't think there's a large enough number of users in any particular sub-category of tools to really support very much open source development. Open source requires a big community involvement, plus ancillary things being built around that environment to attract companies into the effort.

EDA is still too small to make open source the general model. The idea that all tools will be open source, and that all EDA companies would evolve to a service business model, is unlikely and makes no sense.

....

The business incentives just aren't there to motivate those efforts in an open source environment.

I tend not to trust the beliefs of people who were born before the Internet (aka the over-30 crowd). I think he's missing the perhaps non-obvious point that a service business model can subvert the traditional software business model by offering a faster software service with smoother new-feature roll-outs (perhaps when he says "service business model" he thinks RedHat instead of Google Apps). I also know from my prior immersion at CSAIL that open-source software development does NOT require users or a big community to be involved in development; it only requires one indefatigable iconoclast chipping away at the status quo for his own personal reasons, which are often incongruous with the profit motive. When multiple hackers create disjoint useful tools that converge into a single product, user communities start to form to propel that framework further and potentially crack the existing market open.

I wonder how anyone can reject a service-business-model future for basically ANY computer tool. Seeing such a future no longer requires prescience, just patience. FPGA-accelerated computing will increase the number of EDA users, but none of these users will ever identify with the existing EDA market (there's far too much FUD associated with the "hardware" development they traditionally sell: programmable arrays are more than reconfigurable HDL executors).
(end edit)

Sunday, August 24, 2008

Threads Considered Harmful (for the same reason as Goto)

A month ago, Charles Leiserson wrote a post on the Multicore Blog at Cilk Arts called "The Folly of DIY Multithreading." He provides pthreads and Cilk implementations of a parallel fib() function and offers great advice in his article: "Building a concurrency platform from scratch is a mountain to climb." Professor Leiserson mentions the Therac-25 radiation therapy machine in his post; in 6.033, every MIT CS student learns how thread interlocking can lead to confuddling system errors that can kill people. Clearly threads should be considered harmful. Professor Leiserson argues that using a multithreading platform like Cilk will help you address these harmful side-effects of multithreading.
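
To ground the comparison, here's a hedged sketch of the same fib() exercise in Python rather than Leiserson's pthreads or Cilk code (the cutoff constant and the pool choice are my assumptions): the concurrency platform owns the workers, and the program only declares which calls may run in parallel.

```python
# Sketch of the parallel fib() example in Python (not Leiserson's Cilk code):
# the platform (concurrent.futures) owns the processes; we only say which
# calls may run in parallel and where plain recursion takes over.
from concurrent.futures import ProcessPoolExecutor

CUTOFF = 20  # below this, plain recursion beats the cost of spawning tasks

def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

def parallel_fib(n, executor):
    if n < CUTOFF:
        return fib(n)
    # spawn one side as a task, compute the other side in this process
    future = executor.submit(fib, n - 1)
    right = parallel_fib(n - 2, executor)
    return future.result() + right

if __name__ == "__main__":
    with ProcessPoolExecutor() as ex:
        print(parallel_fib(30, ex))   # 832040
```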

Two years ago, Edward Lee at Berkeley wrote an excellent paper called "The Problem With Threads." In it he emphasizes that determinism and composability should be the two major objectives when programming concurrent systems. He describes patterns like MapReduce as "coordination languages" and believes that this model is a fruitful route for parallel programming. He argues that systems attempting to simplify multithreading "merely chip away at the unnecessarily enormous nondeterminism of the threading model" and that the threading model is "intrinsically intractable." The final paragraph of his essay serves as a manifesto for those of us designing non-threaded frameworks for parallel computing:
If we expect concurrent programming to be mainstream, and if we demand reliability and predictability from programs, then we must discard threads as a programming model. Concurrent programming models can be constructed that are much more predictable and understandable than threads. They are based on a very simple principle: deterministic ends should be accomplished with deterministic means. Nondeterminism should be judiciously and carefully introduced where needed, and should be explicit in programs. This principle seems obvious, yet it is not accomplished by threads. Threads must be relegated to the engine room of computing, to be suffered only by expert technology providers.
Threads aren't just harmful because of non-determinism and interlocking issues. Threads are harmful for the same reason Edsger Dijkstra argued Goto is harmful, and for the same reason John Backus apologized for Fortran in his Turing Award lecture: the serial instruction-stream, control-flow style of traditional computer programming is the wrong way to approach most electronic data processing systems!

The main point of this blog is to argue that any kind of threading is a folly (even single threading) and to put forward the spreadsheet as a better hardware primitive. While hardware realities are forcing the deprecation of the single-threaded instruction stream as the primitive model, we should deprecate the rest of the von Neumann model while we're at it. We should avoid the perils of multithreading and transition to a dataflow-centric primitive computing model premised on arrays of reconfigurable dataflow units. Locality-constrained dataflow is how electrical engineers design physically realizable systems, and programmers will have to follow suit to design concurrent code. As we scale to ever more cores, multi-CPUs will start to resemble FPGAs, and addressing the inter-process communication issues will require a more physical, dataflow view of the system. (Electrical Engineers don't believe in magic. Computer Scientists don't believe in physics.)

Indeed, part of the folly of DIY dataflow graph (DFG) engines is writing your own multithreaded code to run DFGs on multicore and single-threaded executors -- not to discount the NP-Hard problems related to heterogeneous partitioning and FPGA compilation. Let your dataflow framework provider worry about using multithreading platforms to run your DFGs so you can just keep using spreadsheets.
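
As a toy illustration of what "just keep using spreadsheets" means operationally, here is a hedged Python sketch (the cell names and dependency graph are invented): cells are pure functions of other cells, evaluated in dependency order, so the answer is deterministic no matter how a runtime later schedules the work across cores.

```python
# Minimal spreadsheet-style dataflow sketch: each cell is a pure function
# of other cells; evaluating in topological order gives a deterministic
# result regardless of how a runtime might later parallelize the cells.
from graphlib import TopologicalSorter  # Python 3.9+

cells = {
    "a":    ((), lambda: 3),
    "b":    ((), lambda: 4),
    "hyp2": (("a", "b"), lambda a, b: a * a + b * b),
    "hyp":  (("hyp2",), lambda h: h ** 0.5),
}

def evaluate(sheet):
    deps = {name: set(srcs) for name, (srcs, _) in sheet.items()}
    values = {}
    for name in TopologicalSorter(deps).static_order():
        srcs, fn = sheet[name]
        values[name] = fn(*(values[s] for s in srcs))
    return values

print(evaluate(cells)["hyp"])  # 5.0
```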

Tuesday, August 12, 2008

MIT Students Use FPGAs to Hack Boston T

(edit 8/19) The judge has lifted the gag order today. (end edit)

/. points to an article about a court issuing a temporary restraining order to block a Defcon presentation by a group of MIT students who hacked the Boston T CharlieCard system. Not only can they print their own magnetic-striped cards with up to $655.36, but they can also crack the RFID cards using the now-widely-known NXP MiFare vulnerability and a "WarCart" they built (USENIX paper: Reverse-Engineering a Cryptographic RFID Tag). These students started this work for 6.857, taught by Ron Rivest. Two of them took 6.111 when I TA'ed: they attempted to build a CDMA Control Channel Traffic Analyzer for their final project. Maybe we should only allow projects that could end up getting us into lawsuits!

The Tech website has the presentation slides along with all the legal documents. The MBTA's strongest argument is based on the premise that it was unclear whether the MIT students were going to present any new vulnerabilities not already in the materials they had sent, and that such material could be a potential threat to public safety. The claim posits that Professor Rivest would not have given them an "A" for the project if their work were just magstripe printing and a repeat of the MiFare hack. It was pretty clear (to me) from the evidence that they were only presenting vulnerabilities in the CharlieCard system that they had already explained in their now-public disclosure to the MBTA, as well as demonstrating how poorly accountable the actual security system currently is.

From a free-speech standpoint, I think the temporary restraining order may have been appropriate to require the students to explain their project to the court in case they were going to present anything that could pose a safety hazard to the subway system. As far as a non-expert judge is concerned, "someone is presenting novel vulnerabilities in the Boston subway system" should be enough to merit the fullest caution of the court. However, as soon as it is clear to the judge that the financial interests of the MBTA (and not public safety) are being weighed against the free-speech rights of the MIT students, the TRO should be lifted. The result of this case will probably be the same as the Dutch NXP MiFare case, and an appeals court should lift the restraining order in favor of First Amendment rights.

(Edit 8/14): Some of the terms of the TRO are ridiculous from a free-speech standpoint -- claim (2) would imply that security is a property established by judicial process, which makes it self-contradictory to include in the request. Apparently the judge denied a reasonable appeal from the EFF to add the term "non-public" to the TRO. Apparently this judge wants a higher court to clarify that First Amendment rights outweigh the financial interests of the MBTA.

The Boston Herald reports that MBTA board member Janice Loux is calling for a security audit and has lost confidence in General Manager David Grabauskas. Here's a quote from Grabauskas that contradicts the court's action today to deny the appeal: “We just want to make very clear that we’re not interested in quashing anyone’s First Amendment rights,” Grabauskas said, declining to respond to Loux’s criticisms. “All we’re really asking is that the court bar the release of any nonpublic information that might adversely affect the CharlieCard’s security.”
(end edit)

That said, I think the students probably should have presented the CharlieCard vulnerability anyway, since they were definitely not going to present anything that an expert couldn't figure out from the data that is already out -- and now it has so much more attention because of the injunction. Apparently the MBTA also isn't familiar with the Streisand effect. This aspect only highlights a common management problem and my new favorite aphorism:

Do not trust anyone over 30: they did not grow up with the Internet

Never hire someone for a security position unless they have a solid understanding of memetic diffusion in social networks.

As for the problem of finding qualified security consultants to audit the system, I think the students just volunteered to do it*: considering they have pictures in their presentation documenting how they committed various types of fraud, it seems like a fitting punishment to make them fix Skynet. I guarantee that they are capable of building a more secure CharlieCard system on the cheap (like 1/100th of the $180M already spent). A short-term, software-only fix could probably achieve much higher security within weeks. The system should have a built-in RMS notice to encourage card swapping "to limit the amount of information Big Brother [Little Brother] can collect about T riders..." (RMS sends a message like this to the CSAIL list to set up card-swapping sessions every so often).

The MIT students have a great slide in their presentation: "Why Brute Force with an FPGA? ... Because it's Fast!" I think it's probably better to make this pitch at the High Performance on Wall Street conference than at Defcon. From their slides it seems they created an FPGA-accelerated 48-bit RFID key cracker using a Python script to generate Verilog. Now they, like h1kari from Pico Computing, are giving presentations to other hackers about the merits of FPGA-accelerated cracking (video of h1kari demoing FPGA-accelerated WEP cracking). I remember an all-nighter at the Las Vegas airport after Defcon in '06 discussing the state of FPGA programming tools with h1kari... that conference made me realize that the existence of vendor-neutral, open-source FPGA tools would be the catalyst to change the perception that FPGAs are difficult to program. It seems like reconfigurable computing has multiple fathers and a whole slew of iconoclastic sons trying desperately to get on FBI watch-lists. The FBI probably doesn't suck at this FPGA-accelerated cracking stuff either, but they find more use for FPGAs in their data warehouses, as I'm sure my readers won't confirm.
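
This isn't their script, but the general trick is worth sketching: use Python as a macro language that stamps out N identical cores, each assigned a disjoint slice of the 48-bit key space. All module and port names below (cracker_core, key_start, key_count, found, key_out) are invented for illustration.

```python
# Sketch of Python as a Verilog macro generator (module/port names invented):
# stamp out N identical cracker cores, each assigned a disjoint slice of the
# 48-bit key space, and collect their outputs on shared buses.
N_CORES = 64
KEY_BITS = 48
SLICE = (1 << KEY_BITS) // N_CORES

def core_instance(i):
    start = i * SLICE
    return (f"  cracker_core core_{i} (\n"
            f"    .clk(clk), .rst(rst),\n"
            f"    .key_start(48'h{start:012x}),\n"
            f"    .key_count(48'h{SLICE:012x}),\n"
            f"    .found(found[{i}]),\n"
            f"    .key_out(keys[{KEY_BITS * (i + 1) - 1}:{KEY_BITS * i}])\n"
            f"  );")

lines = ["module cracker_array (",
         "  input clk, input rst,",
         f"  output [{N_CORES - 1}:0] found,",
         f"  output [{N_CORES * KEY_BITS - 1}:0] keys",
         ");"]
lines += [core_instance(i) for i in range(N_CORES)]
lines.append("endmodule")
print("\n".join(lines))
```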

To understand how the RFID hack was performed, you should read the Nohl paper from UVA, which describes how the physical circuitry of a MiFare Classic RFID chip was reverse engineered. I think the technology used to determine the digital logic on the chips is more interesting than the rest of the paper about the RFID security. Once they determined the cipher for the MiFare cards, they implemented a naive brute-force cracker on 64 Virtex-5 LX50 FPGAs that takes 50 minutes. They also reveal a weakness in the random number generator that allows them to essentially eliminate the randomness, and they describe a method of generating codebooks to substantially accelerate the crack.

From the slides of their presentation, I believe the MIT students transformed the naive brute-force multivariate quadratic (MQ) cipher-cracking problem into a more efficiently implemented SAT problem (which would certainly merit an "A" for the work). FPGAs are perfect for brute-forcing SAT, and there are probably 1,000 papers on this topic since it is possibly the most obvious target for FPGA acceleration. From the slides, it seems the students replicate their cores with a Python script and probably do not contribute anything interesting to accelerating SAT except a novel way to profit from it.
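
For intuition about why SAT brute force maps so naturally onto replicated hardware, here's a toy Python sketch (the CNF formula and the slicing are invented): checking one assignment is a tiny, independent combinational function, so identical checkers can each grind through a disjoint chunk of the assignment space.

```python
# Toy brute-force SAT sketch: each candidate assignment is checked
# independently, which is why replicated hardware checkers (or spreadsheet
# rows) can each grind through a disjoint slice of the assignment space.
# The CNF below is invented: positive k means variable k is true,
# negative k means variable k is false.
CNF = [(1, -2), (-1, 3), (2, 3), (-3, -2)]
N_VARS = 3

def satisfies(assignment, cnf):
    """assignment is an int whose bit (k-1) holds the value of variable k."""
    def lit_true(l):
        bit = (assignment >> (abs(l) - 1)) & 1
        return bit == 1 if l > 0 else bit == 0
    return all(any(lit_true(l) for l in clause) for clause in cnf)

def search(start, stop):
    """One 'checker core': scan a disjoint slice of the assignment space."""
    return [a for a in range(start, stop) if satisfies(a, CNF)]

# Split the 2**N_VARS assignments across four independent slices.
hits = [h for s in range(0, 2 ** N_VARS, 2) for h in search(s, s + 2)]
print([format(h, "03b") for h in hits])   # satisfying assignments as bit strings
```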

From the comments on this year-old Hackaday thread about converting HDTV boards into crypto-crackers, FPGA-accelerated cracking seems to be a good way to get people excited about FPGA technology. It also seems the problem of explaining "what is an FPGA" is still a hurdle to widespread adoption. This is where I say that an FPGA is a hardware spreadsheet, and we call it "Accelerated Computing" to proliferate our favorite meme to replace any other nerd-word you might put in front of "Computing." I also built a spreadsheet to hack the 48-bit RFID keys; it took less than 10 minutes to design and compile to Verilog, but I'm actually afraid to post it now! Here's the anatomy of a massively parallel brute-force cracker: each row tests a unique range of keys and iterates through them until it finds a match; the columns of the sheet are "start-key," "end-key," "current-key," "next-key," "hashed-output," and "satisfies-target." Super-clever implementers will propagate constants to optimize the hashing function based on the (start, end) range for each row.
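
Here is roughly how those rows behave, rendered as a hedged Python sketch rather than the actual spreadsheet; toy_hash is a made-up stand-in for the real cipher (which I am deliberately not reproducing), and the key space is shrunk to 16 bits so the sketch finishes instantly.

```python
# Sketch of the spreadsheet cracker anatomy: each row owns a key range and
# advances its current-key every "tick" until the hashed output matches the
# target. toy_hash is an invented stand-in, NOT the MiFare cipher.
N_ROWS = 8
KEY_BITS = 16                                    # toy key space, not 48 bits

def toy_hash(key):
    return (key * 0x9E37 ^ (key >> 7)) & 0xFFFF  # placeholder mixing function

TARGET = toy_hash(0x1234)                        # pretend we sniffed this output

def make_rows():
    step = (1 << KEY_BITS) // N_ROWS
    return [{"start": i * step, "end": (i + 1) * step,
             "current": i * step, "hit": None} for i in range(N_ROWS)]

def tick(row):
    """One clock tick of one row: hash the current key, then step to next-key."""
    if row["hit"] is None and row["current"] < row["end"]:
        if toy_hash(row["current"]) == TARGET:
            row["hit"] = row["current"]
        row["current"] += 1

rows = make_rows()
while any(r["hit"] is None and r["current"] < r["end"] for r in rows):
    for r in rows:                               # every row updates "in parallel"
        tick(r)
print([hex(r["hit"]) for r in rows if r["hit"] is not None])
```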

Coming Soon: A collection of "Threads Considered Harmful" articles about the dangers of programming in parallel control flow style and the merits of dataflow design for easy-to-grok parallelism.

* The second-most important thing I learned at MIT was that things aren't done because you haven't done them. The most important thing I extracted from MIT was an observation about people's effectiveness at completing a task that Arlow and I made while running house-cleanups in our fraternity; it was extremely apparent again when I was a lab assistant watching students learn to develop FPGA applications in teams. We learned that the metric for a person's effectiveness is the product of their motivation and skill. With rare exception, the overall effectiveness of a group of people working tightly together is the product of their individual contributions. Skill can exist anywhere in the complex plane. This model explains Microsoft, Google, Congress, and many other organizations down to the T. Imaginary skill leads to "security systems," where the quotes change the meaning. Motivated negative skill leads to War (and WarXing my EZ-Pass for profit).
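
A toy rendering of that model (all numbers invented), since the punchline is easier to see with actual arithmetic: per-person effectiveness is motivation times skill, the group's effectiveness is the product over its members, and a single imaginary or negative contributor rotates or flips the whole team.

```python
# Toy rendering of the effectiveness model (every number here is invented):
# effectiveness = motivation * skill per person, group effectiveness is the
# product over members, and skill may be negative or complex.
from functools import reduce
from operator import mul

team = {
    "competent and motivated":   (0.9, 3.0),    # (motivation, skill)
    "skilled but checked out":   (0.1, 5.0),
    "imaginary skill":           (0.8, 2.0j),   # builds "security systems"
    "motivated, negative skill": (1.0, -1.5),   # starts Wars
}

effectiveness = {name: m * s for name, (m, s) in team.items()}
for name, e in effectiveness.items():
    print(f"{name}: {e}")
print("group effectiveness:", reduce(mul, effectiveness.values()))
```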