microsoft / SynapseML
Simple and Distributed Machine Learning
See what the GitHub community is most excited about today.
Apache OpenWhisk is an open source serverless cloud platform
Apache Kyuubi is a distributed multi-tenant JDBC server for large-scale data processing and analytics, built on top of Apache Spark
State of the Art Natural Language Processing
Scientific workflow engine designed for simplicity & scalability. Trivially transition from one-off use cases to massive-scale production environments
Modern Load Testing as Code
Apache Spark - A unified analytics engine for large-scale data processing
The Streaming-first HTTP server/module of Akka
FireSim: Easy-to-use, Scalable, FPGA-accelerated Cycle-accurate Hardware Simulation in the Cloud
ZIO: A type-safe, composable library for async and concurrent programming in Scala
Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Play Framework
CMAK is a tool for managing Apache Kafka clusters
Build highly concurrent, distributed, and resilient message-driven applications on the JVM
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
DataStax Spark Cassandra Connector
A fault tolerant, protocol-agnostic RPC system
The Daml smart contract language
A Scala API for Apache Beam and Google Cloud Dataflow.
In-memory message queue with an Amazon SQS-compatible interface. Runs stand-alone or embedded.
Kaitai Struct: compiler to translate .ksy => .cpp / .cs / .dot / .go / .java / .js / .lua / .nim / .php / .pm / .py / .rb
RISC-V Torture Test
SonicBOOM: The Berkeley Out-of-Order Machine
Macro description format
Flexible Intermediate Representation for RTL