Introducing xlmpp

xlmpp is a multi-author blog about the latest trends in extreme large-scale massively parallel processing (MPP).

This site is not about products or vendors but about approaches, architecture and algorithms: the how, the what and, most importantly, what to avoid. Extremely large data volumes present unique challenges.

Processing hundreds to thousands of billions of records, rows or lines of text, whether inside a database or not, requires not only massively parallel systems but also a great deal of attention to detail.

Whether you opt to build, buy or combine both, there are many fundamentals that make or break an implementation. It is not only about the number of servers you combine in clusters or clouds; it's primarily about the effectiveness of the algorithms and the system architecture.

Massive Parallel Processing - Taking Super Computing to the Extreme
After many years of working in this area behind closed doors, we finally have the opportunity to share some of our work with the general public. As you will see over the coming months, we are a group of opinionated, passionate folks who love what we do.

No topic is off limits (except for confidential information), and you will see us critique our own work as well as that of others. When we do so, we base it on our experience and beliefs. We will engage in discussions, and we understand there are always multiple sides to the same story. We will run into approaches that go against our fundamental beliefs, yet we will attempt to understand why others have chosen them.

Our individual beliefs won't always be the same, and that's what makes us such a great team. We don't always agree on an approach, but we have learned to "disagree and commit" to a single goal - even if we are not all convinced. We don't argue for the sake of arguing, and we have learned that too much philosophy will kill or at least delay any project or implementation.

We will implement the most appropriate solution for the need, be it off-the-shelf software, open source, self-written code, or some integration of all of them. Most of what we do is based on Linux or readily available UNIX derivatives. We have no particular preference other than wanting to stick to standards as much as possible - as long as it makes sense to do so. Many of us have worked with a wide range of systems, including mainframes and various mid- and large-scale computing environments.

Large-scale computer systems are growing rapidly, and so are the data centers that host them

This blog will focus primarily on large-scale, data-intensive applications such as traditional data warehousing, decision support, analytics, data mining, data transformation and historical deep storage, and less on online transaction processing (OLTP) or related application technologies. While OLTP is an interesting topic in itself, it is very different from our core processing requirements. We might - on occasion - compare our approaches to traditional transaction processing, but we will do so primarily to point out differences and explain why we had to choose a different approach.

As mentioned above, we will not be talking about vendors or products. There are plenty of places where you can find product-specific information - no need to create another platform for that. In addition, our relationships with our employer and various vendors do not allow us to talk about product specifics.

We hope you will find useful information on www.xlmpp.com and come back often. We encourage you to bookmark our blog and subscribe to our news feed.

After all, when was the last time you processed more than 25PB in less than 24h?
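
To put that number in perspective, here is a rough back-of-the-envelope sketch of the aggregate throughput it implies. It assumes decimal petabytes (10^15 bytes) and a purely hypothetical sustained rate of 1 GB/s per node; neither assumption describes any real system, and the function names are ours, chosen only for illustration.

```python
import math

# Assumption: decimal units (1 PB = 10**15 bytes, 1 GB = 10**9 bytes).
PETABYTE = 10**15
GIGABYTE = 10**9


def required_throughput_gb_per_s(total_bytes, seconds):
    """Aggregate rate (GB/s) needed to process total_bytes within seconds."""
    return total_bytes / seconds / GIGABYTE


def nodes_needed(total_bytes, seconds, per_node_gb_per_s):
    """Minimum node count, assuming perfectly linear scaling (optimistic)."""
    rate = required_throughput_gb_per_s(total_bytes, seconds)
    return math.ceil(rate / per_node_gb_per_s)


total = 25 * PETABYTE
day = 24 * 60 * 60  # seconds in 24 hours

print(f"{required_throughput_gb_per_s(total, day):.1f} GB/s aggregate")
# Hypothetical per-node rate of 1 GB/s, for illustration only.
print(nodes_needed(total, day, per_node_gb_per_s=1.0), "nodes at 1 GB/s each")
```

Linear scaling is the best case; in practice, skew, coordination and I/O contention push the real node count higher, which is why algorithms and architecture matter more than raw server counts.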

See you again soon!

 