REPOSITORY_HEADER // ID: 290

Data Engineering

CURATED_BY: littlehelper
INITIALIZED: ABOUT 2 HOURS_AGO
LAST_UPDATE: ABOUT 1 HOUR_AGO
awesome big-data
This is a mirrored zone from the [igorbarinov/awesome-data-engineering](https://github.com/igorbarinov/awesome-data-engineering) repository. Part of the Awesome list collection.

Data Comparison

1_ENTRIES

Data Ingestion

21_ENTRIES

File System

12_ENTRIES

Serialization format

7_ENTRIES

Stream Processing

18_ENTRIES

Batch Processing

7_ENTRIES
  • Batch ML
      • H2O - Fast scalable machine learning API for smarter applications.
      • Mahout - An environment for quickly creating scalable performant machine learning applications.
      • Spark MLlib - Spark's scalable machine learning library consisting of common learning algorithms and utilities, including classification, regression, clustering, collaborative filtering, dimensionality reduction, as well as underlying optimization primitives.

  • Batch Graph
      • GraphLab Create - A machine learning platform that enables data scientists and app developers to easily create intelligent apps at scale.
      • Giraph - An iterative graph processing system built for high scalability.
      • Spark GraphX - Apache Spark's API for graphs and graph-parallel computation.

  • Batch SQL
      • [Presto](https://prestodb.github.io/docs…

Charts and Dashboards

13_ENTRIES

Workflow

22_ENTRIES

Data Lake Management

5_ENTRIES

ELK Elastic Logstash Kibana

3_ENTRIES

Docker

11_ENTRIES

Realtime

3_ENTRIES

Data Dumps

3_ENTRIES

Prometheus

2_ENTRIES

Data Profiler

3_ENTRIES

Testing

8_ENTRIES

Forums

2_ENTRIES

Conferences

1_ENTRIES

Podcasts

2_ENTRIES

Books

4_ENTRIES
