Hive中遇到过的问题

Posted by RussXia on April 27, 2019

一、MR任务在Shuffle过程时报错

Ended Job = job_1556157715460_0029 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1556157715460_0029_m_000000 (and more) from job job_1556157715460_0029

Task with the most failures(4):
-----
Task ID:
  task_1556157715460_0029_r_000000

URL:
  http://account.jetbrains.com:8088/taskdetails.jsp?jobid=job_1556157715460_0029&tipid=task_1556157715460_0029_r_000000
-----
Diagnostic Messages for this Task:
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#1
	at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:177)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:171)
Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
	at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.checkReducerHealth(ShuffleSchedulerImpl.java:395)
	at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:310)
	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.openShuffleUrl(Fetcher.java:291)
	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:330)
	at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:198)


FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 1   HDFS Read: 4489 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec

node节点无法从远端的其他节点获取shuffle的数据,可能的原因有很多,建议检查DNS、hosts配置等。