Support to accept TezConfiguration in ORCFile #22

Open
shalinyp opened this issue Feb 13, 2015 · 3 comments

@shalinyp

Hi,

We were testing PartitionTap on Tez (our inputs and outputs are ORC files) using the Cascading 3.0.0-wip-63 libraries, Tez 0.5.3, and the cascading.hive 0.0.4 snapshot jar, and encountered the following ClassCastException:

Caused by: java.lang.ClassCastException: org.apache.tez.dag.api.TezConfiguration cannot be cast to org.apache.hadoop.mapred.JobConf
at cascading.hive.ORCFile.sinkConfInit(ORCFile.java:72)
at cascading.tap.Tap.sinkConfInit(Tap.java:206)
at cascading.tap.hadoop.Hfs.sinkConfInit(Hfs.java:399)
at cascading.tap.hadoop.Hfs.sinkConfInit(Hfs.java:106)
at cascading.tap.hadoop.io.TapOutputCollector.initialize(TapOutputCollector.java:96)
at cascading.tap.hadoop.io.TapOutputCollector.<init>(TapOutputCollector.java:91)
at cascading.tap.hadoop.PartitionTap.createTupleEntrySchemeCollector(PartitionTap.java:159)
at cascading.tap.partition.BasePartitionTap$PartitionCollector.getCollector(BasePartitionTap.java:130)
at cascading.tap.partition.BasePartitionTap$PartitionCollector.collect(BasePartitionTap.java:228)
at cascading.tuple.TupleEntryCollector.safeCollect(TupleEntryCollector.java:145)
at cascading.tuple.TupleEntryCollector.add(TupleEntryCollector.java:95)
at cascading.flow.stream.element.SinkStage.receive(SinkStage.java:98)

The exception is thrown in the following method of ORCFile in cascading.hive:
public void sinkConfInit(FlowProcess flowProcess, Tap<JobConf, RecordReader, OutputCollector> tap, JobConf conf)

It seems that ORCFile doesn't support receiving a TezConfiguration. Could you please check this?

Thanks.

@branky
Owner

branky commented Feb 14, 2015

This library has only been tested with Cascading 2.x, so I expect there are issues running it on Tez right now. Would you be interested in adding support for Cascading 3/Tez? Your contribution would benefit the whole community. Thank you!

@fs111
Collaborator

fs111 commented Feb 16, 2015

It should be fairly straightforward to support Cascading 3.x. If you run into any trouble, please let me/us know.

@branky
Owner

branky commented Feb 17, 2015

I have changed all references to JobConf to org.apache.hadoop.conf.Configuration. The code compiles but hits https://issues.apache.org/jira/browse/HIVE-6163 again: OrcOutputFormat doesn't write files with the parent path, which causes the sink to fail. The original workaround no longer works, so we need to find another solution or push the Hive committers to fix HIVE-6163.
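
For readers following the thread, here is a minimal sketch of why the cast in the stack trace fails and why widening the parameter type to Configuration avoids it. The three nested classes are hypothetical stand-ins for org.apache.hadoop.conf.Configuration, org.apache.hadoop.mapred.JobConf and org.apache.tez.dag.api.TezConfiguration, reduced to a property map so the example runs without Hadoop or Tez on the classpath. The method names and the property key "sketch.output.format" are made up for illustration and are not the actual ORCFile code.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfCastSketch {

    // Hypothetical stand-in for org.apache.hadoop.conf.Configuration.
    static class Configuration {
        private final Map<String, String> props = new HashMap<>();

        Configuration() {}

        // Copy constructor: the new object starts with the other's properties.
        Configuration(Configuration other) {
            props.putAll(other.props);
        }

        void set(String key, String value) { props.put(key, value); }

        String get(String key) { return props.get(key); }
    }

    // Hypothetical stand-in for JobConf: a subclass of Configuration.
    static class JobConf extends Configuration {
        JobConf() {}

        JobConf(Configuration other) { super(other); }
    }

    // Hypothetical stand-in for TezConfiguration: a sibling of JobConf,
    // not a subclass of it.
    static class TezConfiguration extends Configuration {}

    // Mirrors the failing pattern: the method assumes the runtime type is JobConf.
    static String sinkConfInitWithCast(Configuration conf) {
        JobConf jobConf = (JobConf) conf; // ClassCastException for TezConfiguration
        jobConf.set("sketch.output.format", "orc");
        return jobConf.get("sketch.output.format");
    }

    // Mirrors the change described above: work against the common base type only.
    static String sinkConfInitGeneric(Configuration conf) {
        conf.set("sketch.output.format", "orc");
        return conf.get("sketch.output.format");
    }

    // Helper that reports whether the cast-based variant throws for a given conf.
    static boolean castFails(Configuration conf) {
        try {
            sinkConfInitWithCast(conf);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("cast fails for JobConf: " + castFails(new JobConf()));
        System.out.println("cast fails for TezConfiguration: " + castFails(new TezConfiguration()));
        System.out.println("generic variant on TezConfiguration: "
                + sinkConfInitGeneric(new TezConfiguration()));
    }
}
```

Where a downstream API strictly requires a JobConf, copying the properties into a new JobConf (as the stand-in copy constructor does here) is one possible alternative to casting, though settings written to the copy are not reflected in the original object.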
