Issues
- Fail early if a pipeline (or MR job) is configured with more memory than allowed by YARN (CDAP-15477) [Bhooshan Mogal]
- Pipeline failure does not show cause of failure in logs (CDAP-15176, Resolved) [Ali Anwar]
- MapReduce jobs fail to launch on CDH 5.4 (CDAP-14623, Resolved) [Ali Anwar]
- ProgramLifecycleHttpHandlerTest can hang the build (CDAP-14150, Resolved) [Andreas Neumann]
- MapReduce can deadlock (CDAP-14076, Resolved) [Andreas Neumann]
- Pipelines with many sinks may exceed the variable substitution depth in the Hadoop conf (CDAP-13982, Resolved) [Albert Shau]
- User should be able to call MapReduceContext#addOutput without a transaction (CDAP-12634, Resolved) [Bhooshan Mogal]
- ETLMapReduce should add error datasets within a transaction (CDAP-12633, Resolved) [Ali Anwar]
- MapReduce programs emit too many metrics (CDAP-12570, Resolved) [Shankar Selvam]
- PartitionConsumer onFinish should be called within the MR's transaction (CDAP-12514, Resolved) [Bhooshan Mogal]
- Ability for MapReduce to append to an existing PFS partition (CDAP-12260, Resolved) [Bhooshan Mogal]
- If a MapReduce job using PartitionConsumer fails, the partitions remain in-progress (CDAP-12254, Resolved) [Bhooshan Mogal]
- NullPointerException when making a request to get /info when the MapReduce is starting (CDAP-12251, Resolved) [Sameet Sapra]
- Ability for DynamicPartitioner to append to an existing PFS partition (CDAP-12084, Resolved) [Ali Anwar]
- CombineFileInputFormat does not work with PFS as input (CDAP-12054, Resolved) [Ali Anwar]
- MapReduce jobs fail to be submitted on HDP 2.6.1 (CDAP-11970, Resolved) [Ali Anwar]
- Add a configuration for the frequency of getting MapReduce task reports (CDAP-11959, Resolved) [Andreas Neumann]
- MapReduce status is sometimes successful even though the job failed (CDAP-11937, Resolved) [Andreas Neumann]
- There should be a way to retrieve the job conf of a MapReduce run (CDAP-11883, Resolved) [Bhooshan Mogal]
- There should be a way to configure Spark/MapReduce job configuration through preferences (CDAP-11882, Resolved) [Bhooshan Mogal]
- MapReduce INFO logs are getting logged as WARN in CDAP (CDAP-11450) [Bhooshan Mogal]
- ReadlessIncrementTest sometimes fails due to memory limits (CDAP-11418, Resolved) [Andreas Neumann]
- Compatibility Modules for different Hadoop distros/versions (CDAP-8848, Resolved) [NitinM]
- Upgrade CDAP dependencies on Hadoop and HBase to a newer version (CDAP-8847, Resolved) [NitinM]
- PartitionedFileSet does not clean up files under certain MapReduce failure conditions (CDAP-8766, Resolved) [Andreas Neumann]
- Make file set permissions work with MultipleOutputs (CDAP-8262, Resolved) [Andreas Neumann]
- Mapper Bytes Out is wrong (CDAP-7635, Resolved) [NitinM]
- Readless Increments do not work from MapReduce (CDAP-7624, Resolved) [Andreas Neumann]
- Remove all deprecated methods from Wise app (CDAP-7563, Resolved) [Andreas Neumann]
- Better log message when a MapReduce fails (CDAP-7561, Resolved) [NitinM]
- DynamicPartitioner should have a way to close a writer when it is known to be done (CDAP-7557, Resolved) [Ali Anwar]
- MultipleOutputs#close() should pass the appropriate context rather than using the same context (CDAP-7535, Resolved) [Vinisha Shah]
- MapReduce classloader is closed prematurely in MRAppMaster (CDAP-7500, Resolved) [Terence Yim]
- MapReduce's beforeSubmit is committed, even in case of some failures (CDAP-7497, Resolved) [Bhooshan Mogal]
- DynamicPartitioner does not remove files upon failure (CDAP-7483, Resolved) [Andreas Neumann]
- MapReduce should run the DatasetOutputCommitters in a separate transaction (CDAP-7477, Resolved) [Andreas Neumann]
- Refactor MapReduceRuntimeService to configure inputs and outputs during initialize() (CDAP-7476, Resolved) [Andreas Neumann]
- Remove deprecated setInput and setOutput methods in MapReduceConfigurer (CDAP-7475, Resolved) [Andreas Neumann]
- If a program fails during startup, destroy() is never called (CDAP-7444, Resolved) [Andreas Neumann]
- MRAppMaster Java process running longer than expected (CDAP-7392, Resolved) [Sagar Kapare]
- If a MapReduce job is configured in a workflow, it should be discovered by the application during deploy (CDAP-7278, Resolved) [Terence Yim]
- PartitionedFileSetArguments.setOutputPartitionMetadata does not work when reducers are used (CDAP-7189, Resolved) [NitinM]
- Docs for CDAP-5740 (CDAP-7109, Resolved) [Ali Anwar]
- Concurrency protection for datasets and partitions (CDAP-7081, Resolved) [Andreas Neumann]
- MapReduce programs fail on clusters with Phoenix enabled (CDAP-7030, Resolved) [Poorna Chandra]
- Remove unsetting of 'mapreduce.jobhistory.address' from MapReduceRuntimeService (CDAP-6988, Resolved) [Rohit Sinha]
- Multiple instances of a Table in the same transaction do not see each other's writes (CDAP-6579, Resolved) [NitinM]
- CDAP should allow rules to react to programs that run longer than expected (CDAP-6369, Resolved) [Andreas Neumann]
- MapReduce jobs print a lot of DEBUG messages for every single record that is processed (CDAP-6282, Resolved) [Gokul Gunasekaran]
- CDAP settings used in the MapReduce framework should be settable in cdap-site.xml (CDAP-6266, Resolved) [NitinM]
Showing 50 of 98 issues.
Fail early if a pipeline (or MR job) is configured with more memory than allowed by YARN
Description
If the user requests more mapper/reducer memory than allowed per container, the job can only fail. In that case, CDAP should fail early, before even trying to start the MR job, or reject the mapper memory configuration even sooner. Otherwise it takes minutes until the user sees an error message such as:
2019-06-03 15:20:29,108 - INFO [MapReduceRunner-phase-1:i.c.c.e.b.m.ETLMapReduce@204] - Batch Run finished : status = ProgramState{status=FAILED, failureInfo='MAP capability required is more than the supported max container capability in the cluster. Killing the Job. mapResourceRequest: <memory:8192, vCores:2> maxContainerCapability:<memory:6144, vCores:32000>
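The check itself is cheap. Below is a minimal sketch of the kind of pre-submission validation the description asks for, assuming the job's Configuration reflects the cluster's scheduler settings. ContainerMemoryValidator and validate() are hypothetical names introduced here; yarn.scheduler.maximum-allocation-mb is the real YARN setting being compared against.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.yarn.conf.YarnConfiguration;

    // Hypothetical pre-submission check: fail in seconds, before the MR job
    // is handed to YARN, instead of letting YARN kill it minutes later.
    public final class ContainerMemoryValidator {
      private ContainerMemoryValidator() { }

      // requestedMb: mapper or reducer memory requested by the pipeline config.
      public static void validate(Configuration conf, int requestedMb) {
        int maxAllocationMb = conf.getInt(
            YarnConfiguration.RM_SCHEDULER_MAXIMUM_ALLOCATION_MB,
            YarnConfiguration.DEFAULT_RM_SCHEDULER_MAXIMUM_ALLOCATION_MB);
        if (requestedMb > maxAllocationMb) {
          throw new IllegalArgumentException(String.format(
              "Requested container memory %d MB exceeds the maximum container "
                  + "capability of %d MB (yarn.scheduler.maximum-allocation-mb)",
              requestedMb, maxAllocationMb));
        }
      }
    }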
Release Notes: None
Details
Assignee: Bhooshan Mogal
Reporter: Andreas Neumann
Created: June 3, 2019 at 10:51 PM
Updated: June 24, 2020 at 10:29 PM
Activity
Terence Yim June 3, 2019 at 11:25 PM
It's more than just the spec. We have logic to calculate the resources for each process type (driver, executor, mapper, reducer) based on the spec and runtime arguments.
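In other words, the effective memory is known only after per-process overrides are applied, so a fail-early check has to run against the resolved value, not the raw spec. A hypothetical sketch of that kind of resolution, where a scoped runtime argument overrides the memory declared in the program spec (the argument key and helper below are made up for illustration, not CDAP's actual code):

    import java.util.Map;

    // Illustrative only: resolve the effective memory for one process type
    // (driver, executor, mapper, reducer) from the spec plus runtime arguments.
    final class ResourceResolutionSketch {
      // Hypothetical scoped key, e.g. "task.mapper.system.resources.memory".
      static int resolveMemoryMb(String processType, int specMemoryMb,
                                 Map<String, String> runtimeArgs) {
        String override = runtimeArgs.get("task." + processType + ".system.resources.memory");
        return override == null ? specMemoryMb : Integer.parseInt(override);
      }
    }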
Albert Shau June 3, 2019 at 10:55 PM
One possible enhancement is to pass the program's resource spec to the provisioner during the createCluster() call. This could allow the provisioner to create an appropriately sized cluster with the max container size set to an acceptable number, or fail if it cannot do so (or if the required memory is above some max memory configuration). This would allow us to fail within seconds instead of minutes, and would also reduce the amount of configuration users have to do (only need to adjust pipeline memory and not cluster memory).
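A hypothetical sketch of the shape of that enhancement; the ResourceSpec type and the extra createCluster() parameter are the proposed additions, not CDAP's current provisioner SPI:

    // Illustrative types only, not CDAP's current provisioner SPI.
    final class ProvisionerSketch {

      // Resource requirements of the program, known before provisioning.
      static final class ResourceSpec {
        final int maxContainerMemoryMb;  // largest single container the run will request

        ResourceSpec(int maxContainerMemoryMb) {
          this.maxContainerMemoryMb = maxContainerMemoryMb;
        }
      }

      interface Provisioner {
        // The ResourceSpec parameter is the proposed addition: it lets the
        // provisioner size the cluster to fit, or fail within seconds.
        void createCluster(ResourceSpec resources);
      }

      static final class ValidatingProvisioner implements Provisioner {
        private final int maxAllowedContainerMb;  // provisioner/cluster-level limit

        ValidatingProvisioner(int maxAllowedContainerMb) {
          this.maxAllowedContainerMb = maxAllowedContainerMb;
        }

        @Override
        public void createCluster(ResourceSpec resources) {
          if (resources.maxContainerMemoryMb > maxAllowedContainerMb) {
            // Fail before any cluster is created or any job is submitted.
            throw new IllegalArgumentException(
                "Requested " + resources.maxContainerMemoryMb + " MB per container; "
                    + "this provisioner allows at most " + maxAllowedContainerMb + " MB");
          }
          // ...otherwise create a cluster whose max container size fits the request.
        }
      }
    }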