The deeper you go into data analysis, the more you understand that the most suitable tool for coding and visualizing is not pure code, an SQL IDE, or even simplified data manipulation diagrams (aka workflows or jobs). At some point you realize that you need a mix of all of these, and that is what “notebook” platforms are. I have tried the two most powerful of them in production with more than 20 analytic users. This article describes my experience.
There are a lot of monitoring systems nowadays, but working with Massively Parallel Processing (MPP) databases showed me that, on their own, they are not enough to monitor complex data processing systems from both sides: data and hardware. For that purpose, I found a solution in combining multiple metric collection, visualization, and alerting systems.
Most Unix administrators sooner or later run into the problem of using multiple Python versions on one system. Usually the reason is the users: they want to use different versions of Python and switch between them easily. How can we help them with that?
Let's start with something simple. These are my favourite bash one-liners and scripts that have saved me a lot of time.