Training

Kafka & Distributed Streaming

Apache Kafka is a multi-purpose distributed streaming platform. It can be used not only to build streaming data pipelines that reliably move data between systems or applications, but also to build streaming applications that transform, analyze, or react to streams of data. Kafka supports the use of publish/subscribe, point-to-point or custom …

Future-Proof, Centralized Cybersecurity

Metron is an amalgamation and augmentation of several open-source ASF projects that provides centralized management of security monitoring and analysis, for identifying and disposing of cyberthreats at any level. Metron provides log aggregation, full packet capture indexing, storage, advanced behavioral analytics, and data enrichment. Metron is a single …

Build Your Own Stack

ASF Service Stack: Hadoop would not have become synonymous with “Big Data” had it not been for the pioneering work and marketing efforts of companies such as MapR, Cloudera, and Hortonworks. Each of these organizations made the concurrent use of a number of ASF distributed, component-based services accessible by bundling these services in a deployable …

Comprehensive Flink

Apache Flink is an open-source platform for distributed stream and batch data processing. Flink’s core is a streaming dataflow engine that provides data distribution, communication, and fault tolerance for distributed computations over data streams. Flink builds batch processing on top of the streaming engine, overlaying native iteration support, managed memory, and program optimization. This …

Train on Advanced Hadoop (Hadoop 3)

The Apache Hadoop software library is a framework that allows for the distributed storage and analysis of large data sets across clusters of computers, supporting simple-to-complex programming models. Designed to scale up from single servers to thousands of machines, Advanced Hadoop offers a robust local computation and storage …

Software Data Flow Components for Network Engineers

Network engineers working with hardware components often must work closely with software engineers concerned with “Big Data”, “IoT”, “IoAT” or the “Cloud”. This course relies upon the detailed hardware-based network component knowledge of network engineers and augments that knowledge with the introduction of several open-source Apache Software Foundation (ASF) components that, when accessing the data …

Blockchain for Enterprise

While cryptocurrencies in general and Bitcoin in particular are prominent applications of blockchain, this course explains blockchain as a general technology. This approach highlights generic concepts and technical patterns of blockchain instead of focusing on a specific application case. The course presents a non-technical introduction to blockchain fundamentals. This course fills the gap between …