Written by Oliver Ratzesberger
Monday, 21 April 2008 00:00
Turning utility computing into a service model for analytics. With the needs of enterprise analytics growing ever faster, it is becoming clear that traditional hub-and-spoke architectures cannot sustain the demands of increasingly complex business analytics. As with any proliferation of systems, the overhead of managing, maintaining and developing trees of increasingly complex dependencies quickly outpaces an organization's ability to cope. What may work well at first turns into an evolution nightmare: it rapidly becomes more and more difficult to react to ongoing changes in business demands and growth. For many years, some of the largest corporations in the world have recognized this and have focused on re-integrating islands and stovepipes of information into much more centralized analytical infrastructures. Quite often, however, this is also seen as a step towards reducing flexibility, in terms of time to market, for individual groups that need to deliver quickly against rapidly changing business demands. It is a typical love-hate relationship with these so-called departmental systems or data marts: they are great for a localized team to 'bang' out new capabilities, but they become a data integration nightmare with huge TCO (Total Cost of Ownership) implications that are quite often not visible to the overall organization.
Last Updated ( Monday, 21 April 2008 22:40 )
To give you a little background on the types of systems we work on, we felt it would be beneficial to share some high-level stats about our infrastructure. Incoming data volumes exceed 50TB per day, with more than 10^11 new items/lines/records added per day. Our analytical processing infrastructure exceeds 12PB of physical storage, with over 4.5PB in our largest cluster. We leverage compression technologies wherever possible and achieve compression ratios as high as 96% on our highest-volume data feeds.
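To put these figures in perspective, here is a rough back-of-the-envelope sketch in Python. It rests on assumptions that are not stated in the post: decimal units (1 TB = 10^12 bytes), and "96% compression" read as a 96% reduction in stored size. Since the quoted volumes are lower bounds and the 96% figure applies only to the highest-volume feeds, the results are order-of-magnitude estimates, not measurements.

```python
# Back-of-the-envelope figures derived from the stats quoted above.
# Assumptions (not from the post): decimal units (1 TB = 10**12 bytes),
# and "96% compression" interpreted as a 96% reduction in stored size.
TB = 10**12
SECONDS_PER_DAY = 24 * 60 * 60

daily_bytes = 50 * TB      # "exceed 50TB per day" -> treated as a lower bound
daily_records = 10**11     # "more than 10^11 ... per day" -> lower bound
space_saving_pct = 96      # best case, quoted for the highest-volume feeds only

# Rough average record size (a ratio of two lower bounds, so only indicative).
avg_record_bytes = daily_bytes // daily_records

# Sustained ingest rates if the load were spread evenly over 24 hours.
ingest_mb_per_s = daily_bytes / SECONDS_PER_DAY / 10**6
records_per_s = daily_records / SECONDS_PER_DAY

# Hypothetical best case: every feed compresses as well as the best one.
compressed_daily_bytes = daily_bytes * (100 - space_saving_pct) // 100
compression_ratio = 100 / (100 - space_saving_pct)

print(f"average record size : ~{avg_record_bytes} bytes")
print(f"sustained ingest    : ~{ingest_mb_per_s:.0f} MB/s")
print(f"sustained records   : ~{records_per_s:,.0f} per second")
print(f"best-case stored    : ~{compressed_daily_bytes / TB:.0f} TB/day")
print(f"best-case ratio     : {compression_ratio:.0f}:1")
```

Under these assumptions, a 96% reduction corresponds to a 25:1 ratio, and the daily feed averages out to well over a million records per second.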
xlmpp is a multi-author blog about the latest trends in extreme large-scale massively parallel processing (MPP). This site is not about products or vendors but about approaches, architecture, algorithms, the how, the what and, most importantly, what to avoid and what not to do. Extremely large data volumes present unique challenges. Processing hundreds to thousands of billions of records, rows or lines of text, whether inside a database or not, requires not only massively parallel systems but also a great amount of attention to detail.