Contributor: Aaron Eng


Aaron is a Principal Support Engineer at MapR. He is an expert on Big Data systems; distributed storage, computing, and applications; Hadoop and MapReduce; HBase; and the management, monitoring, and administration of large computing clusters. Prior to MapR, Aaron was an Escalation Engineer for both Riverbed Technology and EMC, working directly with engineering to resolve complex issues.

Blog Posts by Aaron Eng

June 20, 2019 | By Aaron Eng

Anatomy of a Memory Cell: Understanding Performance in a Distributed Computing Environment

Note: This article references commands, behaviors, and outputs generated by Linux-based operating systems, such as CentOS or Ubuntu. Some information will not be relevant to other operating systems, such as Windows. When describing computer instruction...

March 27, 2019 | By Aaron Eng

An Introduction to Computer Processors, Part 4

This is a group of four blog posts...

March 20, 2019 | By Aaron Eng

An Introduction to Computer Processors, Part 3

This is a group of four blog posts...

March 13, 2019 | By Aaron Eng

An Introduction to Computer Processors, Part 2

This is a group of four blog posts...

March 06, 2019 | By Aaron Eng

An Introduction to Computer Processors, Part 1

Editor's Note: This blog post...

March 01, 2019 | By Aaron Eng

An Introduction to Disk Storage

This blog post is the second in a...

February 13, 2019 | By Aaron Eng

A Study of Performance in Distributed Computing Environments

This is the first in a series of articles about observing, understanding, and troubleshooting the behaviors of large-scale distributed systems. The State of Modern Computing: At a high level, computing is simple. You have input; you apply some computations...

January 02, 2014 | By Aaron Eng

Five Steps to Avoiding Java Heap Space Errors

Keeping these five steps in mind can save you a lot of headaches and help you avoid Java heap space errors. Calculate memory needed. Check that the JVMs have enough memory for the TaskTracker tasks. Check that the JVM settings are suitable for your tasks. Limit...

December 30, 2013 | By Aaron Eng

How to Minimize the Performance Impact of Re-Replication

For various reasons, data may need to be re-replicated between nodes in a MapR cluster. For example, if a disk goes bad, the content it stored will need to be re-replicated to ensure that the data is fully protected. Decommissioning a node is another...

December 20, 2013 | By Aaron Eng

Understanding MapReduce Input Split Sizes and MapR-FS (Now called MapR XD) Chunk Sizes

Improving performance by letting MapR XD do the right thing: The performance of your MapReduce jobs depends on a lot of factors. In this post, we'll talk about the relationship between MapReduce input split sizes and MapR XD chunk sizes, and how they...

December 16, 2013 | By Aaron Eng

Understanding Memory Utilization on MapR Cluster Nodes

There is much more to memory calculations than just the “used” and “free” states. Here’s a quick primer on understanding memory utilization as reported by the MapR framework. The MapR framework shows memory utilization on nodes based on two values...
