Apache Spark & Scala – Edureka Blog

Channel: Apache Spark & Scala – Edureka Blog

Drilling Down On Apache Drill, The New-Age Query Engine (Part 2)

April 5, 2016, 6:59 am

In this second Apache Drill blog post, we will learn how to integrate Hive and HBase with Apache Drill. Apache Drill provides built-in storage plugins for Hive and HBase integration. We just need to edit...

View Article

Spark Accumulators Explained: Apache Spark

April 13, 2016, 5:11 am

Contributed by Prithviraj Bose. Here’s a blog on what you need to know about Spark accumulators. What are accumulators? Accumulators are variables that are used for aggregating information...

View Article

Distributed Caching With Broadcast Variables: Apache Spark

May 19, 2016, 8:09 am

Contributed by Prithviraj Bose. Broadcast variables are useful when large datasets need to be cached in executors. This blog explains how to get started. What are Broadcast Variables? Broadcast...

View Article

Stateful Transformations with Windowing in Spark Streaming

June 16, 2016, 7:51 am

Contributed by Prithviraj Bose. In this blog we will discuss the windowing concept of Apache Spark’s stateful transformations. What is a stateful transformation? Spark Streaming uses a micro batch...

View Article

Spark Functional Features

September 22, 2014, 4:24 am

The extraordinary functional capabilities of Apache Spark make it a standalone project of the Apache Software Foundation, one that offers high processing speed and efficiency like never before. Let’s...

View Article

5 Reasons to Learn Apache Spark

September 30, 2014, 5:22 am

Those who have been into Big Data probably know about Spark, popularly known as the Swiss Army knife of Big Data analytics. We have talked about the different features of Spark in our previous posts....

View Article

Why Scala is getting Popular?

October 7, 2014, 1:25 am

The market for Scala is growing at a very fast pace. There are several reasons why Scala is the sought-after choice of programmers: Developers want more flexible languages to improve their productivity....

View Article

Hive & Yarn Get Electrified By Spark

December 29, 2014, 12:54 am

In this blog, let us see how to build Spark for a specific Hadoop version. We will also learn how to build Spark with Hive and YARN. Considering that you have Hadoop, JDK, Maven (mvn) and Git pre-installed...

View Article

Hive and Yarn Examples on Spark

February 16, 2015, 10:29 pm

We have learnt how to build Hive and Yarn on Spark. Now let us try out Hive and Yarn examples on Spark. Hive Example on Spark: We will run an example of Hive on Spark. We will create a table, load data...

View Article

Big Data Processing with Spark and Scala

May 26, 2015, 6:14 am

Understanding Spark & Scala: In this era of ever-growing data, the need to analyze it for meaningful business insights becomes more and more significant. There are different Big Data processing...

View Article

Apache Spark vs Hadoop MapReduce

December 18, 2015, 5:10 am

Anybody working in the area of Big Data will know what MapReduce does and what its shortcomings are. It is not completely fair to say that there are shortcomings because MapReduce along with HDFS was...

View Article

Cumulative Stateful Transformation In Apache Spark Streaming

June 24, 2016, 6:58 am

Contributed by Prithviraj Bose. In my previous blog I discussed stateful transformations using the windowing concept of Apache Spark Streaming. You can read it here. In this post I am going to...

View Article

Spark SQL Tutorial – Understanding Spark SQL With Examples

January 1, 2017, 10:06 pm

Apache Spark is a lightning-fast cluster computing framework designed for fast computation. It is one of the most successful projects in the Apache Software Foundation. Spark SQL is a new module in Spark...

View Article

Spark Tutorial: Real Time Cluster Computing Framework

May 4, 2017, 8:21 am

Apache Spark is an open-source cluster computing framework for real-time processing. It is one of the most successful projects in the Apache Software Foundation. Spark has clearly evolved as the market...

View Article

Spark Streaming Tutorial – Sentiment Analysis Using Apache Spark

May 8, 2017, 7:23 am

Spark Streaming is an extension of the core Spark API that enables scalable, high-throughput, fault-tolerant stream processing of live data streams. Spark Streaming can be used to stream live data and...

View Article

Spark MLlib – Machine Learning Library Of Apache Spark

May 10, 2017, 4:03 am

Spark MLlib is Apache Spark’s Machine Learning component. One of the major attractions of Spark is the ability to scale computation massively, and that is exactly what you need for machine learning...

View Article

Spark GraphX Tutorial – Graph Analytics In Apache Spark

May 12, 2017, 5:12 am

GraphX is Apache Spark’s API for graphs and graph-parallel computation. GraphX unifies the ETL (Extract, Transform & Load) process, exploratory analysis and iterative graph computation within a single...

View Article

Data Scientist Skills – What Does It Take To Become A Data Scientist?

June 11, 2018, 7:31 am

Data Scientist Skills: Data science is an umbrella term that encompasses data analytics, data mining, Artificial Intelligence, machine learning, Deep Learning and several other related disciplines. In...

View Article

Introduction to Spark with Python – PySpark for Beginners

June 13, 2018, 3:05 am

Apache Spark is one of the most widely used frameworks when it comes to handling and working with Big Data, and Python is one of the most widely used programming languages for Data Analysis, Machine...

View Article
