Hadoop Streaming

Tue 15 September 2015
Hadoop Streaming Made Simple using Joins and Keys with Python

BigData Data Visualization links

Mon 14 September 2015
Arbor: graph visualization library using web workers and jQuery.
Bokeh is a Python interactive visualization library for large datasets that natively uses the latest web technologies. Its goal is to provide elegant, concise construction of novel graphics in the style of Protovis/D3, while delivering high-performance interactivity over large data ...

R Programming links

Mon 14 September 2015
Spatial Microsimulation with R

Spark Links

Mon 14 September 2015
Articles
A Docker Image for Graph Analytics on Neo4j with Apache Spark GraphX

Go programming Links

Mon 14 September 2015
Articles
An incomplete list of Go tools
Probabilistic Data Structures for Go

Useful Hive Links

Mon 14 September 2015
Articles
Using GenericUDFs to return multiple values in Apache Hive

SonarQube in Docker environment

Mon 14 September 2015
Install
Pull Docker images for PostgreSQL and SonarQube:
$ docker pull postgres:9.4
$ docker pull sonarqube:5.1.2
Configuration
Database configuration
By default, the image will use an embedded H2 database that is not suited for production. The production database is configured with these variables:
SONARQUBE_JDBC_USERNAME
SONARQUBE_JDBC_PASSWORD
SONARQUBE_JDBC_URL
$ docker ...
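The excerpt above is cut off before it shows how the three database variables reach the container. As a rough sketch only, and not the post's actual command, they could be passed with `docker run -e`. The container name, the published port, and every credential and URL value below are placeholder assumptions. The script only prints the command (a dry run) so it can be inspected before anything is started.

```shell
#!/bin/sh
# Dry-run sketch: print a `docker run` command that hands the three
# SONARQUBE_JDBC_* variables to the sonarqube:5.1.2 image.
# Every value below is a placeholder assumption, not taken from the post.

JDBC_USER="sonar"                              # hypothetical database user
JDBC_PASS="sonar"                              # hypothetical database password
JDBC_URL="jdbc:postgresql://localhost/sonar"   # hypothetical JDBC URL

# Build the command as a single line of text; nothing is executed.
sonarqube_run_cmd() {
    printf '%s\n' "docker run -d --name sonarqube -p 9000:9000 -e SONARQUBE_JDBC_USERNAME=$JDBC_USER -e SONARQUBE_JDBC_PASSWORD=$JDBC_PASS -e SONARQUBE_JDBC_URL=$JDBC_URL sonarqube:5.1.2"
}

sonarqube_run_cmd
```

Printing rather than running keeps the sketch safe to try on a machine without Docker. Once the values are replaced with real ones, the printed line can be pasted into a terminal.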

Docker Links

Mon 14 September 2015
Core
Dockerfile Project: Trusted Automated Docker Builds
Docker GitHub repositories
Libraries
docker/libcompose: A Go library for Docker Compose. It does everything the command-line tool does, but from within Go: read Compose files, start them, scale them, etc.
Management
Panamax: Docker Management for Humans. An open-source project that makes deploying ...

Python Links

Mon 14 September 2015
Parallel execution
ipyparallel: Interactive Parallel Computing in Python http://ipyparallel.readthedocs.org/
Interactive Python
IPython provides a rich architecture for interactive computing with:
  A kernel for Jupyter.
  A powerful interactive shell.
  Support for interactive data visualization and use of GUI toolkits.
  Flexible, embeddable interpreters to load into your own projects ...

BigData Testing

Mon 14 September 2015
Slides
Testing Big Data: Automated Testing of Hadoop with QuerySurge