Channel: Apache Spark & Scala – Edureka Blog
Browsing all 45 articles

Drilling Down On Apache Drill, The New-Age Query Engine (Part 2)

In this second Apache Drill blog post, we will learn how to integrate Hive and HBase with Apache Drill. Apache Drill provides built-in storage plugins for Hive and HBase integration. We just need to edit...

Spark Accumulators Explained: Apache Spark

Contributed by Prithviraj Bose Here’s a blog on the stuff that you need to know about Spark accumulators. What are accumulators? Accumulators are variables that are used for aggregating information...

Distributed Caching With Broadcast Variables: Apache Spark

Contributed by Prithviraj Bose Broadcast variables are useful when large datasets need to be cached in executors. This blog explains how to get started. What are Broadcast Variables? Broadcast...

Stateful Transformations with Windowing in Spark Streaming

Contributed by Prithviraj Bose In this blog we will discuss the windowing concept of Apache Spark’s stateful transformations. What is stateful transformation? Spark streaming uses a micro batch...

Spark Functional Features

The extraordinary functional capabilities of Apache Spark make it a standalone project from the Apache Software Foundation, which comes with high processing speed and efficiency like never before. Let’s...

5 Reasons to Learn Apache Spark

Those who have been into Big Data probably know about Spark, popularly known as the Swiss Army knife of Big Data analytics. We have talked about the different features of Spark in our previous posts....

Why Scala is getting Popular?

The market for Scala is growing at a very fast pace. There are several reasons why Scala is the sought-after choice of programmers: Developers want more flexible languages to improve their productivity....

Hive & Yarn Get Electrified By Spark

In this blog, let us see how to build Spark for a specific Hadoop version. We will also learn how to build Spark with HIVE and YARN. Considering that you have Hadoop, jdk, mvn and git pre-installed...

Hive and Yarn Examples on Spark

We have learnt how to Build Hive and Yarn on Spark. Now let us try out Hive and Yarn examples on Spark. Hive Example on Spark We will run an example of Hive on Spark. We will create a table, load data...

Big Data Processing with Spark and Scala

Understanding Spark & Scala In this era of ever-growing data, the need to analyze it for meaningful business insights becomes more and more significant. There are different Big Data processing...

Apache Spark vs Hadoop MapReduce

Anybody working in the area of Big Data will know what MapReduce does and what its shortcomings are. It is not completely fair to say that there are shortcomings because MapReduce along with HDFS was...

Cumulative Stateful Transformation In Apache Spark Streaming

Contributed by Prithviraj Bose In my previous blog I discussed stateful transformations using the windowing concept of Apache Spark Streaming. You can read it here. In this post I am going to...

Spark SQL Tutorial – Understanding Spark SQL With Examples

Apache Spark is a lightning-fast cluster computing framework designed for fast computation. It is one of the most successful projects in the Apache Software Foundation. Spark SQL is a new module in Spark...

Spark Tutorial: Real Time Cluster Computing Framework

Apache Spark is an open-source cluster computing framework for real-time processing. It is one of the most successful projects in the Apache Software Foundation. Spark has clearly evolved as the market...

Spark Streaming Tutorial – Sentiment Analysis Using Apache Spark

Spark Streaming is an extension of the core Spark API that enables scalable, high-throughput, fault-tolerant stream processing of live data streams. Spark Streaming can be used to stream live data and...

Spark MLlib – Machine Learning Library Of Apache Spark

Spark MLlib is Apache Spark’s Machine Learning component. One of the major attractions of Spark is the ability to scale computation massively, and that is exactly what you need for machine learning...

Spark GraphX Tutorial – Graph Analytics In Apache Spark

GraphX is Apache Spark’s API for graphs and graph-parallel computation. GraphX unifies the ETL (Extract, Transform & Load) process, exploratory analysis and iterative graph computation within a single...

Data Scientist Skills – What Does It Take To Become A Data Scientist?

Data Scientist Skills: Data science is an umbrella term that encompasses data analytics, data mining, Artificial Intelligence, machine learning, Deep Learning and several other related disciplines. In...

Introduction to Spark with Python – PySpark for Beginners

Apache Spark is one of the most widely used frameworks when it comes to handling and working with Big Data, and Python is one of the most widely used programming languages for Data Analysis, Machine...
