Thursday, February 23, 2017

org.apache.avro.file.DataFileWriter$AppendWriteException: java.lang.RuntimeException: Datum 1490267964939 is not in union ["null","long"]

Problem in Pig when storing with AvroStorage():

org.apache.avro.file.DataFileWriter$AppendWriteException: java.lang.RuntimeException: Datum 1490267964939 is not in union ["null","long"]

Solution:
1. Check the datatypes carefully when storing with the final schema.
2. The most probable cause is a null value being cast to int or long. Note that with Avro the error is reported for the next line, which can itself be correct.

In our case it complained not about the actual null value but about a valid value. I think this kind of misleading error makes problems with Avro difficult to diagnose.
For example:
A = LOAD 'surjan/data1' using org.apache.pig.piggybank.storage.avro.AvroStorage();
B = foreach A generate date, empId;
C = DISTINCT B;
store C into 'surjan/data2' ;

If the dataset 'surjan/data1' is not present, Avro will complain that no date or no empId was found, instead of saying that the data does not exist or that the path matches 0 files.
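
Steps 1 and 2 can be checked directly in Pig before the store. The following is a minimal sketch, assuming the same 'surjan/data1' input as above and assuming that date is the field written to the ["null","long"] union. The relation names typed, nulls and clean are made up for illustration.

```pig
A = LOAD 'surjan/data1' USING org.apache.pig.piggybank.storage.avro.AvroStorage();

-- Print the schema Pig inferred, to compare each field type with the target Avro schema.
DESCRIBE A;

-- Cast explicitly so the Pig type matches the Avro type (assumption: date should be a long).
typed = FOREACH A GENERATE (long)date AS date, empId;

-- Inspect the rows whose date is null after the cast.
nulls = FILTER typed BY date IS NULL;
DUMP nulls;

-- Keep only rows with a non-null date, if nulls are not wanted in the output.
clean = FILTER typed BY date IS NOT NULL;
```

Because the error may point at a valid row, inspecting the null rows themselves is likely to be more useful than searching for the value printed in the exception.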


3. Using AvroStorage with the index option and a schema. The index option should be used when storing more than one dataset with an Avro schema:

Store finalData  into 'surjan/location' USING org.apache.pig.piggybank.storage.avro.AvroStorage('index', '0','schema','{"namespace":"com.surjan.schema.myapp.avro","type":"record","name":"Mydaily jon","doc":"Avro storing with schema using Pig.","fields" ...rest of schema


org.apache.avro.file.DataFileWriter$AppendWriteException: java.lang.RuntimeException: Unsupported type in record:class java.lang.Long
    at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:263)
    at org.apache.pig.piggybank.storage.avro.PigAvroRecordWriter.write(PigAvroRecordWriter.java:49)
    at org.apache.pig.piggybank.storage.avro.AvroStorage.putNext(AvroStorage.java:722)
    at

Solution: This is an issue with storing a single field in Avro. Store one more dummy field and the error will go away.
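
The dummy-field workaround might look like the sketch below. The relation finalData and the output path come from the store statement above. The field name empId, the alias dummy and the constant 0 are placeholders, and AvroStorage is called without arguments only to keep the sketch short. If an explicit schema is passed, it would presumably need a matching extra field as well.

```pig
-- Hypothetical: finalData has a single field, here called empId.
-- Add a constant second field so the stored record is no longer single-field.
padded = FOREACH finalData GENERATE empId, 0 AS dummy;

STORE padded INTO 'surjan/location' USING org.apache.pig.piggybank.storage.avro.AvroStorage();
```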

Also see: https://issues.apache.org/jira/browse/PIG-3358