Using too many macros causes MapReduce to fail


If an application config contains too many macros, the MapReduce job fails with an error like:

This is because the CDAP macro syntax (`${...}`) is the same as the Hadoop Configuration variable substitution syntax. Unfortunately, the limit of 20 is a hardcoded private constant in Hadoop's Configuration class (`MAX_SUBST`), so there isn't any way to raise it. Also unfortunately, although the error message complains about substitution depth, the check actually has nothing to do with nesting depth at all: it simply gives up after substituting 20 variables in total.
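The substitution loop can be illustrated with a small self-contained sketch. This is not Hadoop's actual code: the constant name `MAX_SUBST`, the simplified regex, and the fixed lookup value `"v"` are stand-ins, and real Hadoop stops substituting when a variable can't be resolved (which is why arbitrary macros often don't trigger the error). But it shows how a cap on total substitutions, not nesting depth, makes 20+ flat, unrelated `${...}` tokens fail:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SubstDemo {
    // Hadoop hardcodes this limit in Configuration as a private constant.
    private static final int MAX_SUBST = 20;
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}$]+)\\}");

    // Sketch of Configuration.get()'s expansion: one substitution per pass,
    // up to MAX_SUBST passes in total.
    static String substitute(String expr) {
        for (int i = 0; i < MAX_SUBST; i++) {
            Matcher m = VAR.matcher(expr);
            if (!m.find()) {
                return expr; // nothing left to expand
            }
            // Replace the first ${var} with a looked-up value (here always "v").
            expr = expr.substring(0, m.start()) + "v" + expr.substring(m.end());
        }
        // Note: total substitutions exceeded, regardless of nesting depth.
        throw new IllegalStateException(
            "Variable substitution depth too large: " + MAX_SUBST);
    }

    public static void main(String[] args) {
        // 21 flat, independent macros in one value is enough to fail:
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 21; i++) {
            sb.append("${m").append(i).append("}");
        }
        try {
            substitute(sb.toString());
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // complains about "depth"
        }
    }
}
```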

All this to say that we should just use Configuration.getRaw() instead of Configuration.get() for the CDAP app spec, since getRaw() returns the stored value without performing any variable substitution.
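The difference is visible directly on the Hadoop Configuration API. In this sketch the key name `appSpec` and the stored value are made up for illustration; the point is that getRaw() bypasses the substitution pass entirely, so the 20-substitution cap can never trip:

```java
import org.apache.hadoop.conf.Configuration;

public class RawDemo {
    public static void main(String[] args) {
        // false: don't load default resources, keep the example minimal
        Configuration conf = new Configuration(false);
        conf.set("appSpec", "{\"prop\":\"${notAConfigKey}\"}");

        // get() runs ${...} variable substitution over the value.
        System.out.println(conf.get("appSpec"));

        // getRaw() returns the stored string verbatim: CDAP macros pass
        // through untouched and no substitution limit applies.
        System.out.println(conf.getRaw("appSpec"));
    }
}
```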

Release Notes

Fixed an issue that would cause MapReduce and Spark programs to fail if too many macros were being used.


Albert Shau
October 12, 2016, 6:33 PM

We could introduce an alternative syntax, though I'm not convinced that we need to change the design because of this. Users can do a lot of bad stuff if operating directly on the Hadoop Configuration, which is why most of our abstractions don't involve it.

Ali Anwar
November 16, 2016, 6:26 AM

It seems to me that this error would only occur if the macro key is also a key in the Hadoop configuration, which is unlikely. Do you recall just arbitrary macros causing an issue?

Albert Shau
November 16, 2016, 7:44 PM

Yeah that is what I remember. I was similarly confused, but didn't dig down to find exactly what it was matching.

Albert Shau
November 16, 2016, 10:31 PM

I believe I ran into it with the following properties for the db source:

Ali Anwar
November 16, 2016, 10:35 PM

I tried on a single-node cluster (3.5.0) as well as on Standalone (3.5.1 via the IDE) with the following pipeline, and it succeeded. The stream (aaaaaa) had a single event, but that shouldn't matter.
It uses ~18 macros.



