The YARN Log Aggregation option aggregates log files for completed applications and moves them from the local file system to the MapR-FS. This allows users to view the entire set of logs for a particular application using the HistoryServer UI or by running the yarn logs command.
By default, YARN container logs are not aggregated on the MapR-FS. Instead, the logs are retained for 3 hours on the local file system before they are deleted. To enable YARN log aggregation or to edit the configuration of YARN log aggregation, you must edit the yarn-site.xml file in the following directory:
This section contains information about how to complete the following tasks:
Enabling YARN Log Aggregation
To enable YARN log aggregation, add or edit the following properties in yarn-site.xml:
- Set the value of
- Optional: Set the value of yarn.nodemanager.remote-app-log-dir to a location in the MapR-FS. By default, the location is
- Optional: Set the value of yarn.nodemanager.remote-app-log-dir-suffix to the name of the folder that should contain the logs for each user. By default, the folder name is
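As a rough sketch of how these settings might look in yarn-site.xml, the shell helper below prints one property element per setting. The helper name yarn_property is hypothetical. yarn.log-aggregation-enable is the standard Hadoop property that turns aggregation on; the /tmp/logs and logs values are the stock Apache Hadoop defaults and are assumed here for illustration, so verify them against your cluster before copying anything.

```shell
#!/bin/sh
# Hypothetical helper: print one yarn-site.xml <property> element.
yarn_property() {
  printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' "$1" "$2"
}

# Standard Hadoop switch that turns log aggregation on.
yarn_property yarn.log-aggregation-enable true

# Assumed values (stock Apache Hadoop defaults); adjust for your cluster.
yarn_property yarn.nodemanager.remote-app-log-dir /tmp/logs
yarn_property yarn.nodemanager.remote-app-log-dir-suffix logs
```

The printed elements would go inside the configuration element of yarn-site.xml; the script itself does not modify any file.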
On a non-secure cluster, you also need to add the following property to /opt/mapr/hadoop/hadoop-2.x/etc/hadoop/yarn-env.sh on the Node Manager nodes:
Then restart the Node Manager services. This setting enables impersonation for Node Manager processes so that log files can be created with the correct user ownership.
Aggregated logs are owned by the user who runs the job. For example, when user admin runs a job, the logs are stored in maprfs:///tmp/logs/admin. If user analyst runs a job, the logs are stored in maprfs:///tmp/logs/analyst. If these two users do not share the same UNIX group, they cannot see each other's logs.
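The per-user layout described above can be sketched with a small shell helper. The function name user_log_dir and the LOG_ROOT variable are hypothetical; the root path is simply taken from the two examples and may differ if the remote log directory has been changed.

```shell
#!/bin/sh
# Assumed root, copied from the admin/analyst examples above.
LOG_ROOT="maprfs:///tmp/logs"

# Hypothetical helper: build the aggregated log location for one user.
user_log_dir() {
  printf '%s/%s\n' "$LOG_ROOT" "$1"
}

user_log_dir admin
user_log_dir analyst
```

Listing the root directory with hadoop fs -ls should show the owner and group of each user's folder, which is one way to check whether two users can read each other's logs.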
Viewing Logs for Completed Applications
With YARN log aggregation, you can use yarn commands or the HistoryServer UI to access logs for completed applications.
Using the Command Line to View Logs for Completed Applications:
Determine the application ID for the application that you want to view the logs for.
For example, run the following command to list the applications:
Run the yarn logs command to view the logs for the application.
For example, run the following command to view the log files for application
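The two steps above might look like the following sketch. The application ID is a made-up placeholder, and the helper names is_app_id and show_app_logs are hypothetical; yarn application -list and yarn logs -applicationId are the standard YARN CLI commands. The yarn calls run only if the CLI is on the PATH.

```shell
#!/bin/sh
# Hypothetical application ID, used only for illustration.
APP_ID="application_1434010938204_0001"

# Succeed if the argument looks like a YARN application ID
# (application_<cluster timestamp>_<sequence number>).
is_app_id() {
  printf '%s\n' "$1" | grep -Eq '^application_[0-9]+_[0-9]+$'
}

# Print the aggregated logs for one application; requires the yarn CLI.
show_app_logs() {
  is_app_id "$1" || { echo "not an application ID: $1" >&2; return 1; }
  yarn logs -applicationId "$1"
}

# List finished applications, then fetch the logs for one of them.
if command -v yarn >/dev/null 2>&1; then
  yarn application -list -appStates FINISHED
  show_app_logs "$APP_ID"
fi
```

Checking the ID format first avoids passing a job ID (job_...) to yarn logs by mistake, since the command expects the application_ form.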
Using the HistoryServer UI to View Logs for Completed Applications:
- Log on to the MapR Control System.
- In the Navigation Pane, click JobHistoryServer.
- Click the Job ID link for the job that you want to view the logs for.
- In the Logs column of the Application Master section, click the logs link.
Editing the Retention Settings of Aggregated Logs
By default, aggregated logs are stored on the MapR-FS for 30 days. The retention time for aggregated logs also applies to centralized logs.
To edit the retention settings, add or edit the following properties in yarn-site.xml:
Set the value of yarn.log-aggregation.retain-seconds to the duration, in seconds, that the logs are maintained. If you set a negative value for yarn.log-aggregation.retain-seconds, logs will not be deleted.
The duration specified by yarn.log-aggregation.retain-seconds starts from the time that the application starts running. Therefore, when you configure the duration, add the time that you want the log to remain to the time that the application takes to run. For example, if you expect most applications to take 20 seconds to run, do not set the value of this property to 20 seconds, because the log might be deleted as soon as the application completes.
Optionally, set yarn.log-aggregation.retain-check-interval-seconds to specify how often the log retention check runs. By default, it is one-tenth of the log retention time.
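The arithmetic behind these two settings can be sketched in shell. The input values (30 days of retention and a one-hour allowance for the longest run) are illustrative assumptions, not recommendations; the script only prints candidate values and does not change any configuration.

```shell
#!/bin/sh
# Illustrative inputs (assumptions, not recommendations).
RETAIN_DAYS=30            # how long logs should remain after a job ends
MAX_RUNTIME_SECONDS=3600  # allowance for the longest expected run

# The retention clock starts when the application starts, so add the
# expected runtime to the desired retention period.
retain_seconds=$(( RETAIN_DAYS * 24 * 60 * 60 + MAX_RUNTIME_SECONDS ))

# Mirror the documented default: one-tenth of the retention time.
check_interval=$(( retain_seconds / 10 ))

echo "yarn.log-aggregation.retain-seconds=$retain_seconds"
echo "yarn.log-aggregation.retain-check-interval-seconds=$check_interval"
```

Setting the check interval explicitly is only needed when the one-tenth default is too coarse or too frequent for the cluster.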
For more details about the properties that impact the YARN container logs and the aggregation option, see yarn-site.xml.