BigData Benchmarking links
Tue 28 July 2015
Apache Hadoop Benchmarking: micro-benchmarks for testing Hadoop performance
Berkeley SWIM Benchmark: real-world big data workload benchmark
Big-Bench: Big Bench Workload Development
Hive-benchmarks: some benchmarking queries for Apache Hive
Hive-testbench: testbench for experimenting with Apache Hive at any data scale
Intel HiBench: a Hadoop benchmark suite
Netflix Inviso: performance focused Big ...
BigData Columnar Databases links
Tue 28 July 2015
Amazon RedShift: data warehouse service, based on PostgreSQL
C-Store: column oriented DBMS
Google BigQuery: framework for interactive analysis, implementation of Dremel
Google Dremel: framework for interactive analysis, implementation of Dremel
MonetDB: column store database
Parquet: columnar storage format for Hadoop
Pivotal Greenplum: purpose-built, dedicated analytic data warehouse
Vertica: is designed ...
BigData Data Warehouse links
Tue 28 July 2015
Google Mesa: highly scalable analytic data warehousing system
IBM BigInsights: data processing, warehousing and analytics
Microsoft Cosmos: Microsoft's internal BigData analysis platform
BigData Distributed Filesystem links
Tue 28 July 2015
Apache HDFS: a way to store large files across multiple machines
BeeGFS: formerly FhGFS, parallel distributed file system
Ceph Filesystem: software storage platform designed
Disco DDFS: distributed filesystem
Facebook Haystack: object storage system
Google Colossus: distributed filesystem (GFS2)
Google GFS: distributed filesystem
Google Megastore: scalable, highly available storage
GridGain: GGFS ...
BigData Distributed Programming links
Tue 28 July 2015
AddThis Hydra: distributed data processing and storage system originally developed at AddThis
Akela: Mozilla's utility library for Hadoop, HBase, Pig, etc.
AMPLab SIMR: run Spark on Hadoop MapReduce v1
AMPLab Succinct: enabling queries on compressed data
Apache Crunch: Java library that provides a framework for writing, testing, and running MapReduce ...
BigData Embedded Databases links
Tue 28 July 2015
Actian PSQL: ACID-compliant DBMS developed by Pervasive Software, optimized for embedding in applications
BerkeleyDB: a software library that provides a high-performance embedded database for key/value data
HamsterDB: transactional key-value database
HanoiDB: Erlang LSM BTree Storage
LevelDB: a fast key-value storage library written at Google that provides an ordered mapping ...
BigData Frameworks links
Tue 28 July 2015
Apache Hadoop: framework for distributed processing. Integrates MapReduce (parallel processing), YARN (job scheduling) and HDFS (distributed file system)
BigData Graph Data Model links
Tue 28 July 2015
Apache Giraph: implementation of Pregel, based on Hadoop
Apache Spark Bagel: implementation of Pregel, part of Spark
ArangoDB: multi-model distributed database
Facebook TAO: the distributed data store that is widely used at Facebook to store and serve the social graph
Faunus: Hadoop-based graph analytics engine for analyzing ...
BigData Key-Map Data Model links
Tue 28 July 2015
Actian Vector: column-oriented analytic database
Apache Accumulo: distributed key/value store, built on Hadoop
Apache Cassandra: column-oriented distributed datastore, inspired by BigTable
Apache HBase: column-oriented distributed datastore, inspired by BigTable
Facebook HydraBase: evolution of HBase made by Facebook
Google BigTable: column-oriented distributed datastore
Google Cloud Datastore: is a fully managed ...