Tuesday, November 05, 2013

The Stanford Startup and the MIT Startup

Message from a Jedi to a Young Padawan
When I graduated and was considering pursuing startups, an alum from my fraternity gave me some advice. He was a successful entrepreneur and sent me a message about pursuing technology-oriented startups. He presented a maxim about an MIT company and a Stanford company building products for the same market. The Stanford company gets a product out quickly, makes money, iterates and then raises money. It uses network effects to lock in customers, or viral growth tactics to get super-linear returns on marketing investment. The MIT company seeks to develop an unassailable technical advantage, optimizing its product or process in terms of kilojoules, units per second, and dollars. It either finds a market fit or sells its technology to the Stanford company.

The dichotomy is between a focus on technology development and a focus on market development.

Let me present an instance of this: two startups are selling environmentally-friendly ammonia (a real and big problem).

The Pitch
The MIT company: "Our unique chemical process allows us to produce ammonia with no environmental impact for 10 percent less cost than competitors. We can modify our catalytic nanoparticle process for the production of perchlorates and sulfates, and dominate the industrial chemical supply industry."

The Stanford company: "We sell premium household cleaning supplies and fertilizer that are produced sustainably and good for the environment. We sell in stores and offer a monthly subscription model; receiving a package will remind you to clean up your house and water your flowers."

Sales and Marketing
The MIT startup has no sales to customers, but possibly a DARPA grant to develop their technology. The team has 9 PhDs and just hired an MBA to start finding customers. They believe their technical advantage using solar-powered nano-crystalline catalysts will enable them to lower the cost of production of commodity chemicals and therefore dominate the market. Their customers will be the major fertilizer, pharmaceutical and consumer product companies. Google for the company name and you will find a landing page. They are still "in stealth mode" while they finish up some R&D and production optimizations for their nano-particle production. Team MIT needs funding to develop a manufacturing facility (and to survive as a company). Their vision for sales and distribution involves hundreds of payments for tens of millions of dollars each year for shipment sizes that look like something out of The Wire or Breaking Bad.

The Stanford startup has developed no new technology but has already validated its customer model selling sustainable branded cleaners and fertilizers at a local Whole Foods and Home Depot. Costs for sustainably produced chemicals are higher, but the founders maxed out their credit card buying a wholesale shipment and were able to sell a premium retail product at a small profit. They set up stands at farmers' markets to sign people up for monthly packages of cleaning supplies and plant food. After testing their market hypothesis, they decided to focus on cleaning products and limit marketing for the fertilizer product because that strategy generated more recurring revenue for less cost. The attrition rate for the cleaning product shipments is lower than the growth rate, and there are customers posting on the internet about how much they "appreciate the hand-signed note thanking them for supporting their mission to spread sustainable production." They have thousands of monthly customers, they know their cost per customer acquisition and they know their average revenue per customer. Team Stanford thinks they could get millions of customers to pay them $9 a month for their product, which includes rags in addition to ammonia and bleach. They are still tracking a growing market niche for sustainable home food growing systems including plant food and seeds. They think viral marketing strategies will help them reduce their customer acquisition costs, so they want funding to expand logistics and distribution in other regions and to try other growth strategies like advertisements and letting people choose scents in stores before placing an order.
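To make the "they know their cost per customer acquisition and their average revenue per customer" point concrete, here is a minimal sketch of the arithmetic Team Stanford would be doing. Only the $9 a month price comes from the story above; the gross margin, churn rate and acquisition cost are hypothetical numbers, and the constant-churn lifetime model is a simplifying assumption.

```python
def expected_lifetime_months(monthly_churn: float) -> float:
    """Expected subscription length, assuming a constant monthly churn rate."""
    if not 0.0 < monthly_churn <= 1.0:
        raise ValueError("monthly_churn must be in (0, 1]")
    return 1.0 / monthly_churn


def lifetime_value(monthly_price: float, gross_margin: float, monthly_churn: float) -> float:
    """Gross profit expected from one subscriber over their whole lifetime."""
    return monthly_price * gross_margin * expected_lifetime_months(monthly_churn)


def payback_months(acquisition_cost: float, monthly_price: float, gross_margin: float) -> float:
    """Months of gross profit needed to earn back the cost of acquiring a customer."""
    return acquisition_cost / (monthly_price * gross_margin)


if __name__ == "__main__":
    price = 9.0    # dollars per month, from the pitch above
    margin = 0.5   # hypothetical: half of each payment is gross profit
    churn = 0.05   # hypothetical: 5 percent of subscribers cancel each month
    cac = 27.0     # hypothetical: dollars spent to sign up one customer

    print("lifetime value:", lifetime_value(price, margin, churn))
    print("payback months:", payback_months(cac, price, margin))
```

If lifetime value comfortably exceeds acquisition cost, spending more on marketing is a reasonable use of new funding, which is the case Team Stanford is making to investors.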

Investor Response
The outcome for either of these companies is non-obvious. The MIT company claims to have the successor to the Haber-Bosch process, a chemical process technology that won its inventors Nobel prizes and was the foundation for what was once the world's largest chemical supplier. However, they want to enter an established commodity market and need to prove that they can scale sales from zero. Investors will need to vet the technology before they can fund the company. Investors look for "order of magnitude better" when vetting technology companies to determine if the technology is defensible. Very few investors will have an understanding of the chemical supply market and fewer still will understand the founders' PhD work optimizing production of ammonia using nano-particle colloids. They will also need a lot of funding before they can serve this market.

On the other hand, the Stanford startup has traction in a market and will likely have a much easier time raising funding. Investors will understand their consumer market and they won't require technical vetting. It is unclear if their market position is defensible.  Someone else can replicate what they do especially since there are no network effects where the total value of the product increases with more users thus creating market lock-in. However, they don't need much funding to grow their sales and they are looking to scale from a solid profitable foundation, which decreases the perceived risk to investors.

The MIT startup could potentially be a $100B company in the chemical supply market. However, the Stanford startup can be reasonably valued at $10M today based on traction and will get term sheets from many investors. The MIT startup is much more speculative today and needs to find a wealthy individual to bankroll their first factory or take strategic investment from large potential customers.

There are many companies that fit both of these patterns and end up successful. The successful technology startups eventually develop a market approach. A lot of founders pivot from developing hi-tech to do entirely different market-focused ventures. Some founders have taken both approaches in separate companies and been successful at both. Conventional wisdom suggests the best startups develop technology and a market simultaneously. Many startups can operate with just a telephone and a spreadsheet on day one and then use technology to automate their operations. Technology is not a prerequisite for business success, but marketing is.

Sunday, October 28, 2012

Fear and Loathing at Zigfu: My YCombinator Experience

Applications for YCombinator's winter 2013 cycle are due October 30. Let me tell you about my experience in YCombinator with Zigfu.

Zigfu is a platform for making motion controlled applications using the Kinect in Unity3D and HTML. We incorporated as Motion Arcade Inc. in May 2011 and went through the YCombinator program in Summer 2011. During my interview with YCombinator, I pitched a company that would develop an eco-system of motion controlled applications, and monetize by selling applications, dev-kits and a motion OS with an app store for smart TVs. I showed off one of my Kinect hacks and had PG dance the hokey-pokey. We got in.

Today, I actually make reasonable revenue from selling the Zigfu Dev Kit. We are a little better than ramen profitable with over one-hundred customers using our Unity3D software for interactive installations all around the world. Over 100,000 Kinects have been hacked with our developer package, and thousands of developers are using our platform actively in commercial and university projects. But, we are not one of YC's high-flying success stories. I'm the only person still employed full-time at Zigfu supporting developers and working on customer projects. The platform is still gaining new features every time I do another project, but this was not the business I intended to start. We never raised money after demo day, not for lack of trying. It is safe to say that the most awesome thing that I got out of demo day was a cigarette bummed off Ashton Kutcher. Sweet.

Here's our story.

Fear and Loathing at Zigfu:

I want to start by sharing some wisdom gained from this whole experience. If you're working in a rapidly evolving technology field, you might be too early to scale your solution and your solution might be wrong, and you'll always have some uncertainty and doubts about the potential for success and thus question the value of any particular schlep. We called this "Fear and Loathing" at Zigfu.

A lot of investors asked me "why this?" like why pursue this company or business over any other potential way to spend my time. This is a common interviewing technique akin to "why do you want this job?" or "what in your experience makes you qualified to do this job?" This means investors want to hear why you are working on the problem you are solving. A lot of startups can canonically answer it with a rehearsed anecdotal response like:

"We experienced problem X, and think that there's room in the market for the AirBNB/Dropbox of X" for a B2C company.

Or "While getting paid by company Y, we ran into problem Z, and think that there's room in the market to be the Salesforce of Z for all people in companies like Y." for a B2B company.

For Zigfu: "I was developing motion controlled apps for interactive engagements and needed a set of UI components for motion control, and I think there's room in the market to be the iOS/Android of Kinect." This often led to "but why not Apple or Google or Microsoft?"

If there's a flicker of doubt, investors can smell the lack of confidence. You are not a true believer and so why should they be? But this question brings up the basic existential crisis that faces most geniuses: how do you address the opportunity cost of focusing on one thing over any other you could be pursuing? It is important and difficult to remain in control of your doubt despite that nagging voice that tells you that you could be doing something totally better, like building an FPGA OS, or creating radios using light-wave vortices, or solving the Riemann Hypothesis.

Focus is hard, and requires discipline and feedback mechanisms. Establish good practices for setting and meeting milestones, and manage uncertainty and doubt early in your startup. When you have a large set of tasks to accomplish and fear and loathing dominates your actions, you end up choosing what not to do instead of what to do. Our friend Cody says "there's no room for post-modernism in startups." Get over your existential crisis and get to work.

The Problem:

Our startup was founded on the premise that human motion sensing is going to be a commodity. Today, to perform motion tracking you need to buy sensors for a hundred dollars or so, paired with a processor for another few hundred dollars. The future has high-pixel-count sensors integrated (probably vertically) with a computer vision processor providing tracking information with high accuracy, low power, and at the same cost as existing camera sensors. Zigfu aimed to solve the big challenge for bringing such integrated sensors and embedded computer vision systems to market: natural user interface (NUI) design and the creation of a motion controlled application eco-system.

If you are looking for an idea for a startup, you can append "...that doesn't suck" to a lot of existing product descriptions to define good problems and it helps to have a product description you can summarize in a few words. Zigfu is making a voice and gesture controlled user experience that doesn't suck. Even with arbitrarily accurate tracking information about the positions of every joint in the human body, there is still a major challenge of making a comfortable and intuitive gesture-controlled list, for example a list of 1000 videos, where you can quickly select exactly the one you want to watch, without false positives and without misses. The human factors engineering problem in natural user interface design is like building the Cocoa framework that powers the iOS UI, except instead of touch screens, we're working with Kinect and other hand tracking sensors, and instead of phones, we were thinking about 10-foot-experiences like for televisions. We imagined advertisements for our product would show the slow-motion destruction of remote controls using implements like a chainsaw, sledge-hammer, or nail gun, with pieces of circuit boards and shards of plastic flying everywhere and Ode to Joy playing in the background.

Good ideas are non-obvious and misunderstood, otherwise you might be too late or under-capitalized to win the market. The NUI problem is easily misunderstood because it might seem like part of the evolution of the sensors. We should not be deriving the desired user experience from the limitations of current sensor technology or the availability of software algorithms. Instead, we should work on the user experience to derive the necessary characteristics of the sensors and algorithms. It is easy to differentiate motion control companies based on measurable quantities of a sensor or computer vision stack (more joints tracked, higher accuracy tracking, higher frame rate, longer range, lower cost, etc.), but these systems are judged by the user experience they provide. Now that people have experienced Kinect, they might say: why not just make a cursor with hover-to-click? Clever engineers will suggest virtual touch-screens, virtual mouse cursor systems with push-to-click, and augmented reality controls (overlaying the UI on the camera image). We've tried all of them too, but slapping motion input onto existing user interfaces designed for different controllers is a recipe for a crappy user experience.

As our starting point, Zigfu helps to simplify gesture controlled application development with Kinect. We do this by making it easy for developers to install and use Kinect with HTML and Unity3D, and by providing high level motion-controlled user-interface components like buttons, menus and lists. Our stack is built on top of existing sensors and computer vision middleware and has been easy to port across multiple languages, sensors and tracking algorithms. Investors who understood this message would say we are rate-limiting our growth by our dependency on the sensor market. Indeed, the diversity of available sensor and computer vision products on the market has been growing slower than we expected.

Surely Apple should have a voice and motion-controlled TV set on the market by now.

Starting Up:

I started Motion Arcade Inc. because I was supporting Kinect hackers all over the world with a package for Kinect in Unity3D. I had just wound down my first startup "Tinker Heavy Industries" making iPad apps for kids (try our ragdoll elephant) and was finding contracting gigs to make money hacking the Kinect, and I wanted to start hiring a team to support the motion gaming eco-system with tools and an app-store. I had been working with Shlomo Zippel, who was an applications engineer at PrimeSense at the time, on the Unity3D package for Kinect hackers. For background, PrimeSense is an Israeli company that makes the Kinect's sensor and they are also responsible for fueling the hacker community with the release of OpenNI/NITE in December 2010. OpenNI provides a framework for natural interaction software development and NITE is a set of free commercial algorithms for skeleton tracking.

In order to do a startup, you will need co-founders. It's important to find complementary co-founders if you're the lead instigator and solo founder seeking co-conspirators. You will understand this better when your company is making consumer-facing products without a designer or trying to sell OEM licenses without someone with this sort of sales experience. I had spent about 6 months seeking co-founders for Motion Arcade and getting into YCombinator definitely helped me pull a team together. I had met my friend Ted Blackman through mutual friends from MIT, where we both studied, and we applied to YCombinator together. Ted and I are kindred spirits: we're both the kind who can rapidly become an expert in any science, technology, engineering or mathematical field. I spent only one day hacking with Ted, but it was clear that he was an exceptional genius. YCombinator was concerned that I was essentially a single founder pulling in Ted without much experience working together. One indicator for YCombinator about the likelihood of success of a team is that the co-founders have known each other and worked together for a while and would work together on anything: basically, if the team's loyalty is to the group and not the particular idea they are pursuing, they will work together through multiple pivots. Of course, this isn't an absolute indicator and Dropbox is a notable exception of a solo founder recruiting a great team.

After getting accepted into YC we started out with the goal of making an Xbox game. We used our YCombinator-ness to get Microsoft to send us an Xbox Development Kit; this was no simple feat. Our experience working with Microsoft probably served as a prototypical example for how they would later interface with the startups in the Techstars/Microsoft Kinect accelerator. We figured Microsoft controlled the path to market for any motion controlled consumer product and that if we wanted to spawn our own Kinect app-store we would have to play inside the Xbox eco-system first. We needed quality content and we may as well eat our own dog-food if we're making a platform: make a game and sell it before trying to make our own distribution channel. Valve didn't just start with Steam, they had Half Life first. I hired some of the artists that I had worked with on Tinker Heavy Industries to produce 3-D content and we set out to make a game called Sushi Warrior.

A month after starting YCombinator with an extra $150,000 from SV Angel and Start Fund and $50K more from Romulus Capital, we recruited Shlomo Zippel out of PrimeSense to join us. Shlomo convinced his friend and coworker from PrimeSense, Roee Shenberg, to move here from Israel to join the founding team at Zigfu. Shlomo and Roee are both amazing hackers and work together extremely well. Israeli hackers are trained better than most MIT students, and it was a privilege to hack with them.

With this team and their experience, we shifted focus away from full-body games to the more difficult problems of making hand-gesture controlled UI and apps. We had a ton of users downloading our software for hacking Kinect, and our community group and subscriber list were growing rapidly, so we wanted to make some applications that people would use regularly. In about 3 months we cranked out demos of a gesture-controlled YouTube and Facebook and an app loader/portal. This was what I was showing investors at the Microsoft VC summit in October 2011.

After Demo Day and Burning Man, in September 2011, I switched into pitch-mode telling investors how we had a growing community of Kinect hackers using our software and we were building the Zig TV motion OS, looking for funding to make an OEM-license-able product for smart TVs.


In retrospect, I wasted an incredible amount of valuable time showing these demos to investors trying to raise a seed round. I met with Paul Graham shortly after demo day in September 2011 to discuss the process of raising money and we talked about investor leads and interest. I told him my goal was to raise $2.5M in a month or two and he said pretty directly: "no way that's happening." I thought we had a strong demo day launch, with a lot of leads to follow and many articles labeling us one of the startups to watch in the YC summer 2011 class. I guess it's easy to stand out in a YC class when what you are showing is not a web or mobile app.

Unfortunately, because our demo is compelling, and our technology is cool, I wasted an incredible amount of time piquing interest without closing. I started meeting with VC investors and was enthusiastic early on when Andreessen Horowitz committed to join our seed round contingent on us finding a lead investor. Other investors suggested they would also like to join a round and wanted to track us for a month or so to see our traction. This herd mentality is common among investors. The best investors will reject you quickly if they are not interested. The last investor I pursued was Brad Feld from Foundry Group early in 2012. Brad is a great guy, and Foundry is a great team: they won't say things like "we'll invest only if someone else will." Brad is kind to entrepreneurs, notably so for saying no in 60 seconds and not wasting your time. They also invest in natural-user-interface companies, notably Oblong, the team responsible for making the "Minority Report" UI. I show off Zigfu a lot and I demand a dollar every time someone says "Minority Report" to me.

Shlomo and I flew down to Vegas for CES mainly to meet Brad. Because he didn't say no in 60 seconds, we were pretty excited; it would take about a month before we got to a no. The TV OEM market we were aiming for wasn't attractive to Foundry, and ultimately Brad just got the feeling that while he was excited by the team and tech, he was trending negative on Zigfu's business prospects. Brad said something like "This sounds like something that will sell to Google or Microsoft for $20-50M and I'm just not interested in that." I'm really grateful for having had the opportunity to get his feedback.

So after several months of chasing down VCs and getting weak commitments to join a round only once someone else would lead, I'd had enough. I realized that I should have been working on getting influential private angel investors on board first before talking to venture capital firms, but mostly I was over the whole fundraising thing entirely. I met with Paul Graham sometime in the middle of this failed fundraising process and he said to me "well maybe you just suck, Amir" and encouraged me to "just make money." Shortly after that interaction he published this essay about patterns in the least successful companies he's funded. I'm sure he was writing this to me, at least partially.

Just Make Money:

Actually, I should not have spent any time seeking out funding at all. We still had 6 months of runway after demo-day when I set out to fund-raise and there was no shortage of Kinect hacking jobs available to make quick revenue while building our platform. We were turning away revenue because it was distracting us from building our platform and fundraising, but we could have better spent our time building products and getting a stronger revenue strategy together before raising capital. Revenue is just like funding but it costs you less to get it, and building products isn't a total waste of time even if they don't succeed at massive growth. Investors care about your vision for the company and product, but for crazy ideas that don't fit in the common startup bins, you need big potential revenue numbers and evidence of traction.

It wasn't until March 2012 that we started selling a Kinect software-development-kit product for Unity3D with a buy button up on the internet. In retrospect, we could have been selling this for nearly a year by the time we finally got around to it. We were basically out of funding, scraping by each month doing some amount of contract Kinect hacking. By April it was clear that the dev kit revenue and contracting revenue were not going to sustain us as a team, and we didn't have any runway left to build new products. Ted was the first to leave at the end of April. Then in July, Roee and Shlomo took on contracting gigs separate from Zigfu, and built TheFundersClub, which recently announced $6M in funding. Shlomo is developing TheFundersClub now, and Roee has moved back to Israel to pursue natural language processing.

I continue to operate Zigfu, updating the platform to support new features and supporting customers. Licensing revenues are reasonable: I cover all of my personal expenses from our mostly passive online sales revenue. I'm supplementing that income with contracting revenue doing Kinect hacking for interactive installations, kiosks, and digital signage engagements. These developments have led to OEM licensing agreements with a few large companies who use our browser plugin for Kinect controlled digital signage. I plan to release additional products from some of our half-completed projects and to support new sensors and tracking algorithms as they are released. Eventually, I plan to hire someone in a marketing and business development role as soon as I've saved up enough of our revenues; I am looking for the right person.

Gained Wisdom:

Shlomo and I recently reflected on things we would have done differently and think this is worth sharing as advice. First, don't try to raise money if you don't have revenue or at least a credible story about where your paycheck will come from after the funding runs out. OEM licensing is not a good story for a seed-stage startup without OEM sales experience, though I'm getting better at it. We developed a reasonable digital signage and interactive installation niche for Zigfu, and it's totally awesome supporting a platform used by tens of thousands of developers. But the promise of the gesture control market is still out of reach for many consumer applications. The bill of materials needs to get to $1 for a commodity CMOS array with skeleton-tracking logic integrated in the sensor.

We should have been focused on the developer tools market earlier since it was accessible to us, and then we could show some kind of traction to investors. Don't over-think it. As technologists, we wanted to build an important and innovative platform like a motion controlled TV operating system, but you can make money quicker and start growing revenue by building something simple and targeted, like a gesture controlled slideshow or gesture interactive digital signage.

We should have been more frugal with the funding we raised. I told PG and the YC partners that I thought that the $150K convertible debt that comes when you enter YC might have broken their scrappy startup model. Believing that the most leverage I could gain from that funding by demo day was to ramp up and hire people to make stuff, I spent it on getting more people on my team without proper focus on a product. But during the phase where you are dependent on funding and have no revenue, if you have more people on your payroll, you will run out of money sooner.
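The payroll point is simple arithmetic, and it is worth seeing how fast it bites. Here is a minimal sketch: the $200K of cash is the $150K plus $50K mentioned earlier in this post, while the per-person monthly cost and the revenue figure are hypothetical numbers chosen only for illustration.

```python
def runway_months(cash: float, monthly_burn: float, monthly_revenue: float = 0.0) -> float:
    """Months until the bank account hits zero at a constant net burn."""
    net_burn = monthly_burn - monthly_revenue
    if net_burn <= 0:
        # Revenue covers expenses, so the company never runs out of cash.
        return float("inf")
    return cash / net_burn


if __name__ == "__main__":
    cash = 200_000.0           # $150K + $50K, from the story above
    cost_per_person = 5_000.0  # hypothetical monthly cost per team member

    for team_size in (2, 4, 8):
        months = runway_months(cash, team_size * cost_per_person)
        print(team_size, "people:", months, "months of runway")

    # Hypothetical: a little revenue stretches the same cash noticeably.
    print(runway_months(cash, 4 * cost_per_person, monthly_revenue=10_000.0))
```

Doubling the team halves the runway, and even modest revenue pushes the zero-cash date out, which is the whole argument for selling something early.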

An early stage startup really shouldn't spend money unless it's clear how that investment will produce more money. It helps if you and your co-founders have saved up enough money before you started so that you don't need a salary, and we all agreed that we would have significantly less anxiety about funding if we had longer "personal runways." If you can sustain on savings, then you set timelines and milestones without fear of running out of money to pay rent, so you can use the funding you raise strategically instead of needing to spend your funding on personal expenses.

I really wish we had thrown more parties. The Kinect hacking community is building cool toys: one of our first customers turns DJs into Robots on massive displays and our friends at Ethno Tekh make awesome body-controlled musical instruments. We should invite our friends to play with these toys. It gives more social context to our products and it is very motivating to see people appreciating our work. Brad Feld told me that a startup isn't a waste of time if you learn something and make friends and I certainly learned a lot from doing Zigfu and made a lot of brilliant friends through YCombinator. If you are irrationally confident in your ability to achieve greatness, I encourage you to apply.

Friday, October 05, 2012

Zigfu HCI, and Terminator: Ad-hoc Skeleton Tracking

I haven't had a brain dump in a while, and I should probably write more since this blog gets a ton of visitors for some reason. I'm excited about a lot of technology that I've read up on lately and will touch on some before diving into object tracking.

I've read a lot of really interesting papers about transmitting information using orbital angular momentum (OAM) of light and a 2.4 Tbps wireless transmission that recently set the record. OAM is a separate dimension onto which we can encode information, providing a multiplicative effect in the amount of available bandwidth for transmitting information. This will be a massively disruptive technology and you should leave this site right now to learn about how to produce curl in your Poynting vectors using helical antennae.

I've also heard about using field effects to produce gate-controllable junctions in arbitrary singly doped semiconductors. This can lead to new types of solar devices using cheaper materials like copper oxide. One could also imagine a photodetector array which uses the effect to sweep through color bands. This method is similar physically to the method of producing a band gap in bilayer graphene. Controllable electron transport and volume manufacturing of graphene devices are both very active areas of research and any electrical engineer who wants to be ahead of the curve ought to study up on relativistic quantum electrodynamics in monolayers.

On the FPGA side of things, I'm very excited by the Xilinx Virtex-7, which went with 3-D integration to provide what they are calling "Stacked Silicon Interconnect" on a separate layer. This shows that FPGAs continue to be a proving ground for leading semiconductor process technology.

I expect that we will see optical sensors 3-D integrated on top of a processing layer so that we will have computer vision baked onto optical sensors. This will allow us to process visual data with much higher framerates, lower-latency and lower-power. I predict that image filters, motion detection, optical flow, edge detection, generation of pyramid representations and integral images will all be implemented on silicon that is 3-D integrated with a sensor. This sort of sensor will enable low-power object tracking and gesture control in basically any embedded application.
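To give a feel for why these primitives are good candidates for silicon, here is a minimal pure-Python sketch of one of them, the integral image (summed-area table). The function names are my own for illustration; a sensor-integrated version would compute the same running sums in hardware as pixels stream off the array.

```python
from typing import List, Sequence


def integral_image(img: Sequence[Sequence[int]]) -> List[List[int]]:
    """Build a summed-area table with one extra row and column of zeros.

    table[y][x] holds the sum of all pixels in rows 0..y-1 and columns 0..x-1.
    """
    height = len(img)
    width = len(img[0]) if height else 0
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0  # running sum of the current row
        for x in range(width):
            row_sum += img[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def box_sum(table: List[List[int]], top: int, left: int, bottom: int, right: int) -> int:
    """Sum of pixels in rows top..bottom-1 and columns left..right-1.

    Takes four table lookups regardless of how large the box is.
    """
    return table[bottom][right] - table[top][right] - table[bottom][left] + table[top][left]


if __name__ == "__main__":
    image = [
        [1, 2, 3],
        [4, 5, 6],
        [7, 8, 9],
    ]
    table = integral_image(image)
    print(box_sum(table, 0, 0, 3, 3))  # whole image
    print(box_sum(table, 1, 1, 3, 3))  # bottom-right 2x2 block
```

Once the table exists, any box filter costs the same four lookups, which is why detectors built on box features run fast and why the table itself is a natural thing to compute next to the sensor.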

Object tracking and recognition are topics I have been following for a long time. I wrote about my ideas for synthesis-based recognition techniques six years ago when I was still an undergrad. PrimeSense's launch of OpenNI inspired me to pursue the human-computer interaction side of this work at Zigfu, where we recently hit the milestone of one hundred commercial customers in 29 countries using our Unity3D SDK to produce interactive experiences. Zigfu is primarily focused on human-computer interaction (HCI) and not computer vision (CV); even with perfectly accurate sensors and computer vision to track human movement, there is a wide open problem of designing a motion-controlled UI that enables a human to comfortably and accurately select one element out of a list of 1000 elements. I like to compare this to touch-screens, where the measurable qualities of capacitive touch screens are like sensors / computer-vision while the magical qualities of the iOS/Cocoa UI and Android UI are how people experience this technology.

Still, I've been keeping an eye on visual object tracking for a long time and also want to do a brain dump of my thoughts. Visual object tracking such as in Zdenek Kalal's Predator algorithm is very compelling.
Zdenek has also supported an open source community called OpenTLD (Tracking Learning Detection) which has produced C++ ports of his algorithm from the MATLAB original (see me playing with it in the video below).

Another great reference is this paper on skeleton tracking with surface analysis, which has a really cool video to go along with it. Some time back, on April 15th, I wrote a proposal for ad-hoc skeleton analysis to the TLD group that went something like this:

Ever since watching the Predator video, I've been thinking about how to extend the algorithm to use 3-D voxel and point cloud structures and not just track, but determine orientation and perform ad-hoc discovery of skeleton joint structures. I call this algorithm "Terminator." Terminator tracks and recognizes objects by generating a rigged 3-D model of the object. Instead of producing 2-D wavelets from the tracked object as in Predator, Terminator generates a 3-D model of the object complete with inferred skeleton structures. A voxel model can be created by compositing the 2-D images generated by a tracker system (as in Predator). Voxel acquisition can also be assisted with a depth sensor. Recognition and determination of the orientation of rigid bodies can be performed using a sparse voxel representation. One way to accelerate recognition may be to use principal component analysis to align the input data with the models being recognized. Another way to perform recognition may be to project the voxel representation onto 2-D planes to create sets of 2-D wavelets that serve as 2-D image recognizers. Brute force 3-D convolution of sparse voxels may also work, but makes more sense for detecting features in non-sparse voxels like MRI data. A skeleton model with joint weights for each voxel can be inferred by using optical flow data to detect state transitions in the voxel model such as when a mouth opens or a hand gesture changes. Historic data can be used to verify and improve the skeleton system and to infer range of motion constraints. Just had to share this idea in case I never really get down to it.
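To illustrate the projection step from the proposal, here is a rough sketch, assuming the sparse voxel model is just a set of occupied integer (x, y, z) cells. It only produces binary silhouettes along the three coordinate axes; the wavelet features and the recognizers built on them are not shown, and every name here is made up for the example.

```python
def project_voxels(voxels, drop_axis):
    """Orthographically project sparse voxels onto an axis-aligned plane.

    voxels: iterable of (x, y, z) integer coordinates of occupied cells.
    drop_axis: 0, 1 or 2, the axis the virtual camera looks along.
    Returns the set of occupied 2-D cells (a binary silhouette).
    """
    keep = [axis for axis in range(3) if axis != drop_axis]
    return {(v[keep[0]], v[keep[1]]) for v in voxels}


def silhouettes(voxels):
    """Silhouettes of the model seen along the x, y and z axes."""
    voxels = list(voxels)
    return {name: project_voxels(voxels, axis)
            for axis, name in enumerate("xyz")}


# Hypothetical toy model: an L-shaped object made of four voxels.
model = {(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0)}
views = silhouettes(model)
```

A fuller version would rotate the model before projecting (for example after aligning it to its principal axes) so that views from arbitrary orientations can be generated, but the axis-aligned case shows the idea.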

Since the Kinect provides superior tracking on the 3-D depth data, we are able to train multiple predator-like algorithms on the RGB data with feedback from a depth-data tracker. We can also use the depth data for user segmentation and extract only the interesting RGB pixels from the registered depth data.
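As a rough sketch of that segmentation step, assume the depth map is already registered to the RGB image (same resolution, pixel for pixel) and stored in millimetres. The depth band, the background colour and the function name are assumptions for illustration; real user segmentation in the middleware is more sophisticated than a fixed depth window.

```python
def segment_user(rgb, depth_mm, near_mm, far_mm, background=(0, 0, 0)):
    """Keep RGB pixels whose registered depth lies in [near_mm, far_mm].

    rgb: 2-D list of (r, g, b) tuples.
    depth_mm: 2-D list of depths in millimetres, same shape as rgb.
    Pixels outside the band are replaced with the background colour.
    A depth of 0 (assumed here to mean "no reading") is dropped as long
    as near_mm is positive.
    """
    return [
        [pixel if near_mm <= d <= far_mm else background
         for pixel, d in zip(rgb_row, depth_row)]
        for rgb_row, depth_row in zip(rgb, depth_mm)
    ]


# Hypothetical 2x2 frame: the user stands roughly one metre away.
rgb = [
    [(200, 150, 120), (10, 10, 10)],
    [(30, 30, 30), (190, 140, 110)],
]
depth_mm = [
    [900, 3000],
    [0, 1100],
]
user_only = segment_user(rgb, depth_mm, near_mm=500, far_mm=1500)
```

The surviving pixels are the "interesting" ones that an RGB tracker would then be trained on.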

Anyway, this is long enough already, so I'll leave you with a video I made running the predator algorithm's C++ port in multiple processes:

Tuesday, March 20, 2012

Google's Weakness: User Experience

Paul Graham pointed out that someone could take advantage of Google's UX weakness in his article identifying a new search engine as a potentially disruptable industry for a highly ambitious founder. Google might not realize this, but they have a huge gaping weakness in User Experience. I have one Facebook account. I have 6 or 7 Google accounts. And they don't work well together. When I am trying to access my AdWords, I need to sign in with my amirmotion Gmail account using my zigfu.com email address as my login, and then I get this error because I'm still signed in to my catalystac.com account:

So in order to use AdWords, I need to log out of my catalystac.com Google account and sign in with the appropriate Gmail account, which for AdWords is zigfu.com. Due to my multiple-account setup with Google, I have the same awful user experience with just about every Google app. For Blogger and YouTube I need to use my amircatalystac Gmail account (which is different from my catalystac.com account). I also have no idea how to access the admin panel for my catalystac.com domain, so I periodically can't access the documents that people send me in Google Docs. Here's what I get when I try to click "Yes" to a Google Calendar request for RSVP in anything other than my tinkerheavy.com Google account:
My iPhone provides the only (or at least the easiest) way for me to aggregate all these obnoxious multiple Google calendars and multiple Google email accounts into a single amir-centric user experience. Instead of this sort of awesome and consistent user experience, Google offers me MULTIPLE USER ACCOUNT HELL. The other issue with Google is that there is no button I can push to send an email to complain to Google about my terrible user experience or to get support with improving it. Larry Page should be reading such an inbox, because this is a huge weakness for Google and they probably don't even realize how bad it stinks.

Thursday, January 26, 2012


I noticed Altera announcing OpenCL support for FPGAs. Here's a paper and some slides on it. I also noticed a blog pop up last week about doing OpenCL on Xilinx. There's also been work on a CUDA-to-FPGA system called FCUDA by a group from UIUC and UCLA (here's the longer paper on FCUDA). So hardening GPU-designed algorithms is now at least an idea, and possibly a good one. This will enable CMOS array sensors with integrated ASICs performing computer vision.

Monday, October 24, 2011

Making a TV for Steve

The TV space is ripe for disruption, and I keep seeing speculation about an Apple Television. In his biography interviews, Steve Jobs claimed to have "finally cracked" the TV user experience.

Last week I presented my start-up ZigFu at the Microsoft Venture Capital Summit and said that motion-control technology like Kinect will be as disruptive to the TV as the remote control or possibly even color (video at the end of this post). We want to make a TV with a motion-controlled user experience that would have made Steve Jobs proud to demo it. A TV designed for Steve would take you no more than a few gestures to watch a movie, or search for a video or song. You'd never be that far away from Netflix, Amazon, eBay, or even ordering a pizza with gestures.

In addition to a world-class TV user experience using gestures, the other major differentiating factor for success in the gesture-controlled smart TV race is an app eco-system. It's forming now. You might have noticed the storm of Kinect hacks coming out of academia or indie studios. You might also have read in my other posts that I am running a fresh-out-of-YCombinator startup supporting developers with UI libraries, tools and an app-store for motion-controlled apps.

I may be stirring my own kool-aid, but I feel like I ought to put a stake in the ground now, especially while I'm busy hustling to find investors (contact amir at zigfu.com). At any given time, there are only a few next-big-things in technology and right now motion-control is one of them. Kinect was huge, and from the insider's perspective it looks like there's a tsunami coming that will transform the way we consume and interact with media.

The next wave of computer-user interaction to come after touch will be motion. We will take for granted that computers can see us and react to our movements. Motion will be a defining feature of the smart TV category in much the same way touch was to smart phones. Motion, combined with voice and smartphone control, will be as ubiquitous as the remote control.

The evolution of TV will have apps like Facebook, Skype, and YouTube and of course features like video sharing. These applications will make TV a more social experience (this post is fully buzz-word compliant). Where these apps would kinda suck with a remote, these new apps plus all the existing cool features we already have on TV can be made much better with motion-control. Football fans know what I mean when I say I wish I could play replays with something better than the remote. We want to reimagine the DVR experience with motion-control. Many totally new applications will emerge in this eco-system as well. We are already seeing several groups making applications for trying on clothes in a virtual fitting room and playing motion games that will be bought through an app-store.

The network is mature enough for these applications to emerge. The technology exists so that we can build it all today, but it is not yet accessible to developers. We're still at the beginning of the development of this control paradigm and we haven't settled on best practices for interacting with devices with gestures. Kinect needs something like Cocoa to provide lists and menus. That's what ZigFu is building.

Tracking the nerd news on the inter-tubes, I've noticed an onslaught of press releases about Xbox Live with TV channels and Kinect control, and several other companies with controllers using Wii-like controls. Speculation abounds that an Apple Television product may eventually be announced with voice-control features like Siri. Maybe Google will play in this space and provide a motion layer to their Google TV offering: with Motorola they acquired a major player in the set-top-box space. We think this eco-system is new and different enough from anything that's come before it. Perhaps our start-up can run with the flag and beat the big guys to the market.

Once motion control is obviously a differentiating factor in TV sales, all the vendors will embrace the category of motion-controlled smart TVs with apps and games. Some combination of a camera and microphone array will be integrated into every television, and your smartphone or tablet touchscreen will also be able to talk to the TV to transmit control signals.

After I presented at YCombinator's demo day, someone approached me and said "your demo has Apple written all over it." And that's pretty much what we set out to do: we're making a TV for Steve.

Here's a video of my presentation about the work we're doing at ZigFu:

My demo isn't totally polished yet, but you can see where we're going.

Wednesday, August 24, 2011

ZigFu - Motion Apps

Well, the cat's out of the bag. We've been doing YCombinator this summer developing a motion apps company we're calling ZigFu. Something's gotta pay for all the lasers...

Today was the YC demo day and we got some nice press in GigaOM and Forbes, which put us on their short-lists.

And that's one demo day down... still another one tomorrow, but now they'll already be anticipating something awesome so I'm probably going to have to turn it up to 11. Maybe I'll do the Hokey Pokey for them (we had Paul Graham do the Hokey Pokey during our YC interview ;)

So what's ZigFu (for great justice)? This blog has a bunch of posts showing the Unity bindings for OpenNI. Over the summer we've built a set of UI Components for developing apps using motion control, and the plan is to launch a portal for motion apps next.

The goal here is to integrate a ton of existing computer vision stuff into the OpenNI framework so we can track hands and skeletons and faces. We think the motion sensing technology will make a whole new market for smart TV apps, and we're forming the platform-independent vendor required to make the application layer that sits above the hardware and computer vision middleware.

We want to make the remote control obsolete.