In 2005, Stonebraker et al. published a paper outlining eight key requirements
for stream processing architectures [http://dl.acm.org/citation.cfm?id=1107504].
These requirements translate readily into the building blocks of a stream
processing architecture. Although this article predates systems...
Apache Mesos [http://mesos.apache.org/] is a popular open-source cluster manager
for building resource-efficient distributed systems. Mesos provides
efficient, dynamic resource isolation and sharing across multiple distributed
applications such as Hadoop, Spark, Memcache, and MySQL on a dynamic shared pool
of resources...
My notes and thoughts on the Hadoop ecosystem from the book Hadoop Operations [1].
One of the key takeaways is the emergence of Hadoop cluster deployment
and management tools such as hstack and Apache Ambari. In our own setup we
managed to deploy and scale...
Notes and thoughts from my recent read, Cassandra: The Definitive Guide. Common
ways to address scalability bottlenecks in relational databases:
Throw More/Better Hardware (Memory and CPU)
* Vertical scaling
* Faster disks (SSD vs RAID)
Move to a Database Cluster
* With a master-slave configuration:
* Master is now...
It is good to know your data, but there is a clear distinction between being
data-driven and being data-informed. No matter which area you work in, there is
always an opportunity to make additional gains by closely observing the
characteristics and quality of your data. By...