Parallel access to external data sources from Greenplum DB using PXF

Greenplum 5.0.0 brought us a lot of new features. Most of them were planned a long time ago but couldn't be implemented without breaking backward binary compatibility, which was not an option within the 4.X major branch. One such feature is the new PXF framework. It allows you to integrate a Greenplum cluster with other systems: databases, in-memory grids, Hadoop components, etc. Moreover, it can do so in parallel: every Greenplum segment can retrieve its own shard of the data.

Monitoring MPP systems

There are a lot of monitoring systems nowadays, but working with Massively Parallel Processing (MPP) databases showed me that individually they are not enough to monitor complex data processing systems from both sides, data and hardware. For that purpose I found a solution in combining multiple metric collection, visualization and alerting systems.