Any reasonable (web) application that deals with some subset of web data (RSS feeds, web pages, product pricing data, etc.) has to use distributed data processing to get anywhere. Unless you have very deep pockets and/or strong VC funding (which is rarer than a bottle of Mouton Rothschild Pauillac Premier Cru First [...]