Controlling User Logging in Hadoop

By Alex Holmes · Jan. 24, 2013

Imagine that you're a Hadoop administrator, and to make things interesting you're managing a multi-tenant Hadoop cluster where data scientists, developers, and QA are pounding your cluster. One day you notice that your disks are filling up fast, and after some investigating you realize that the root cause is your MapReduce task attempt logs.

How do you guard against this sort of thing happening? Before we get to that, we need to understand where these files exist and how they're written. The figure below shows the three log files that are created for each task attempt in MapReduce. Notice that the logs are written to the local disk of the node running the task attempt.

[Figure: the three log files created for each MapReduce task attempt, written to the task attempt's local disk]
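
To make that concrete: on a TaskTracker node these logs typically end up under the Hadoop log directory (HADOOP_LOG_DIR, which defaults to ${HADOOP_HOME}/logs). The job and attempt IDs below are made up, and the exact directory layout varies a little between Hadoop versions:

$ ls $HADOOP_LOG_DIR/userlogs/job_201301240001_0042/attempt_201301240001_0042_m_000007_0/
stderr  stdout  syslog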

OK, so how does Hadoop normally make sure that our disks don't fill up with these task attempt logs? I'll cover three approaches.

Approach 1: mapred.userlog.retain.hours

Hadoop has a mapred.userlog.retain.hours configuration property, which is defined in mapred-default.xml as:

The maximum time, in hours, for which the user-logs are to be retained after the job completion.
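
For reference, this property can be overridden in mapred-site.xml on your TaskTracker nodes; the value below (12 hours instead of the default 24) is purely illustrative:

<property>
  <name>mapred.userlog.retain.hours</name>
  <value>12</value>
</property>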

Great, but what if your disks are filling up before Hadoop has had a chance to automatically clean them up? It may be tempting to reduce mapred.userlog.retain.hours to a smaller value, but before you do that you should know that there's a bug in Hadoop versions 1.x and earlier (see MAPREDUCE-158), where the logs for jobs that run longer than mapred.userlog.retain.hours are accidentally deleted. So maybe we should look elsewhere to solve our overflowing logs problem.

Approach 2: mapred.userlog.limit.kb

Hadoop has another configuration property, mapred.userlog.limit.kb, which can be used to limit the file size of syslog, the log4j output file for each task attempt. Let's peek again at the documentation:

The maximum size of user-logs of each task in KB. 0 disables the cap.

The default value is 0, which means that log writes go straight to the log file. So all we need to do is set a non-zero value and we're done, right? Not so fast: it turns out that this approach has two disadvantages:

  1. The user logs are actually cached in memory, so you're taking away mapred.userlog.limit.kb kilobytes' worth of memory from your task attempt's process.
  2. Logs are only written out once the task attempt process has completed, and they only contain the last mapred.userlog.limit.kb worth of log entries, which can make it challenging to debug long-running tasks.
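
If you can live with those trade-offs, the cap itself is just another mapred-site.xml entry; the 1024 KB value below is illustrative:

<property>
  <name>mapred.userlog.limit.kb</name>
  <value>1024</value>
</property>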

OK, so what else can we try? We have one more solution: log levels.

Approach 3: Changing Log Levels

Ideally, all your Hadoop users got the memo about minimizing excessive logging. The reality, though, is that you have limited control over what users decide to log in their code; what you do have control over is the task attempt log levels.

If you had a MapReduce job that was logging aggressively in the package com.example.mr, you might be tempted to use the daemonlog CLI to connect to each TaskTracker daemon and change that package's logging to the ERROR level:

hadoop daemonlog -setlevel <host:port> com.example.mr error
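
The daemonlog CLI can also report the current level with -getlevel. The host below is hypothetical, and 50060 is the default TaskTracker HTTP port in Hadoop 1.x:

hadoop daemonlog -getlevel tasktracker01.example.com:50060 com.example.mr
hadoop daemonlog -setlevel tasktracker01.example.com:50060 com.example.mr error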

Yet again we hit a roadblock: this will only change the logging level for the TaskTracker process, not for the task attempt processes. Drat! That really leaves only one option, which is to update ${HADOOP_HOME}/conf/log4j.properties on all of your data nodes by adding the following line to the file:

log4j.logger.com.example.mr=error

The great thing about this change is that you don't need to restart MapReduce, since any new task attempt processes will pick up your changes to log4j.properties.
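
The change does have to be present on every node that runs task attempts, though. One simple way to push it out is to copy the edited file to each host listed in your slaves file; the sketch below assumes passwordless SSH and the same ${HADOOP_HOME} path on every node:

# push the edited log4j.properties to every slave node (illustrative)
for host in $(cat ${HADOOP_HOME}/conf/slaves); do
  scp ${HADOOP_HOME}/conf/log4j.properties ${host}:${HADOOP_HOME}/conf/
done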

Published at DZone with permission of Alex Holmes, DZone MVB. See the original article here.
