Contributor: Nicolas A Perez

MapR Converge Blog author, Nicolas A Perez

Nicolas is very interested in Apache Spark, Hadoop, distributed systems, algorithms, and functional programming, especially in the Scala programming language. In the past, Nicolas has done a lot of programming and engineering in C# and the .NET Framework. Previous work includes payment processing systems, POS systems, and mobile systems. He has also worked on implementing Royal Caribbean Cruises’ microservices vision using Lightbend Reactive Platform (Lagom, Cassandra, Kafka, Solr, etc.) with Java, Docker, and cloud technologies. Nowadays, Nicolas works as a Sr. Data Engineer at MapR, where he uses Scala and Java extensively and collaborates with multiple clients on designing and building custom solutions to solve real-life problems at scale. He is also the Miami Scala Meetup Organizer.

Blog Posts by Nicolas A Perez

May 01, 2019 | By Nicolas A Perez

Embedded In-Memory Implementation of OJAI Driver for Testing

Writing the appropriate tests should always be a priority during the development cycle of any team. We strongly encourage testing first (TDD) and testing everything, or at least as much as we can. When writing applications that rely on MapR Database, we might...

April 17, 2019 | By Nicolas A Perez

Extending MapR Database Queries Using Scala Polymorphic Types

When working with MapR Database, there are limitless ways to interact with it, as we have explored in previous posts such as Interacting with MapR Database, MapR Database Spark Connector with Secondary Indexes Support, and MapR Database Atomic Document...

April 03, 2019 | By Nicolas A Perez

Transaction Support and Smart Joins for MapR Database

Sometimes there are limits to how far we can stretch a given technology. In the case of MapR Database, these limits never seem to get closer, even as we add more and more capabilities on top of it. Previously, we have talked about many things we can...

March 14, 2019 | By Nicolas A Perez

Spark Custom Streaming Sources

Originally posted January 14, 2019. Editor's Note: Download this free eBook: Getting Started with Apache Spark 2.x – From Inception to Production. Apache Spark is one of the most versatile big data frameworks out there. The ability to mix different...

March 08, 2019 | By Nicolas A Perez

MapR Database Spark Connector with Secondary Indexes Support

This blog post was originally published on Medium. MapR Data Platform offers significant advantages over any other tool in the big data space. MapR Database is one of the core components of the platform, and it offers state-of-the-art capabilities that...

March 07, 2019 | By Nicolas A Perez

MPI Workloads Performance on the MapR Data Platform, Part 2 – Matrix Multiplication

Originally published 2/12/19 on Medium. In part one of this blog series, we showed how we can use MPI on top of the MapR Data Platform to successfully find prime numbers within a rather large range. Also, we compared our sieve of Eratosthenes implementation...

February 28, 2019 | By Nicolas A Perez

MPI Workloads Performance on the MapR Data Platform, Part 1

Originally posted 2/11/19 on Medium. In the big data world, the MapR Data Platform occupies, without question, an important place given the technology advantages it offers. The ability to run mixed workloads that include continuous data processing with...

February 21, 2019 | By Nicolas A Perez

MapR Database Atomic Document Updates

Originally published 12/15/18 on Medium. In a previous post, we discussed some of the features of MapR Database that make this distributed database especially interesting. In this blog post, we intend to continue that effort by presenting a specific use...

August 23, 2016 | By Nicolas A Perez

Apache Spark Packages, from XML to JSON

The Apache Spark community has put a lot of effort into extending Spark. Recently, we wanted to transform an XML dataset into something that was easier to query. We were mainly interested in doing data exploration on top of the billions of transactions...

July 28, 2016 | By Nicolas A Perez

A Functional Approach to Logging in Apache Spark

Logging in Apache Spark is very easy to do, since Spark offers access to a log object out of the box; only some configuration setup needs to be done. In a previous post, we looked at how to do this while...

May 10, 2016 | By Nicolas A Perez

How to Integrate Custom Data Sources Into Apache Spark

Streaming data is a hot topic these days, and Apache Spark is an excellent framework for streaming. In this blog post, I'll show you how to integrate custom data sources into Spark. Spark Streaming gives us the ability to stream from a variety of...

April 19, 2016 | By Nicolas A Perez

Spark Streaming and Twitter Sentiment Analysis

This blog post is the result of my efforts to show a coworker how to get the insights he needed by using the streaming capabilities and concise API of Apache Spark. In this blog post, you'll learn how to do some simple, yet very interesting analytics...

March 23, 2016 | By Nicolas A Perez

Spark Data Source API: Extending Our Spark SQL Query Engine

In my last post, Apache Spark as a Distributed SQL Engine, we explained how we could use SQL to query our data stored within Hadoop. Our engine is capable of reading CSV files from a distributed file system, auto-discovering the schema from the files...

March 17, 2016 | By Nicolas A Perez

Apache Spark as a Distributed SQL Engine

SQL has been around for a while, and people like it. However, the engines that power SQL have changed with time in order to solve new problems and keep up with demands from consumers. Traditional engines such as Microsoft SQL Server had some problems with...

March 01, 2016 | By Nicolas A Perez

How to Log in Apache Spark

An important part of any application is the underlying log system we incorporate into it. Logs are not only for debugging and traceability, but also for business intelligence. Building a robust logging system within our apps could be used as a great insights...
