Issue while converting CSV to Parquet file #11
I am trying to convert a CSV file to Parquet and get the error shown in the log below. I am not sure why it appears, since the schema file is in the correct format. The code used is:
import java.io.BufferedReader;
import org.apache.hadoop.conf.Configuration;
import parquet.Log;

public class ConvertUtils {
  private static final Log LOG = Log.getLog(ConvertUtils.class);
  public static final String CSV_DELIMITER = "|";

  public static void main(String[] args) throws IOException, InterruptedException {
  }
  }
  public static String getSchema(File csvFile) throws IOException {
  public static void convertCsvToParquet(File csvFile, File outputParquetFile) throws IOException {
  public static void convertCsvToParquet(File csvFile, File outputParquetFile, boolean enableDictionary) throws IOException {
  }
  public static void convertParquetToCSV(File parquetFile, File csvOutputFile) throws IOException {
  }
  private static void writeGroup(BufferedWriter w, Group g, MessageType schema) @deprecated
  }
}
Please consider changing the fileRead method so that it does not use the 'line.separator' property (on Windows it is '\r\n', while on Unix it is just '\n'). Your schema cannot be parsed on Windows because MessageTypeParser reads the '\r' as a type repetition.
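One way to follow this suggestion is to build the schema string with a fixed '\n' instead of the platform separator. The sketch below is only an illustration: the body of fileRead is not shown in this thread, so the class and method names here (SchemaFileReader, readNormalized, normalizeLineEndings) are hypothetical stand-ins and not part of the project.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical helper, not part of the project: reads a schema file so that
// the resulting string never contains '\r', whatever the platform.
final class SchemaFileReader {

    // Reads the file line by line and joins the lines with '\n'.
    // BufferedReader.readLine() strips "\n", "\r", and "\r\n" terminators,
    // so the platform's line.separator is never consulted.
    static String readNormalized(File file) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader =
                 Files.newBufferedReader(file.toPath(), StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    // Alternative for a string that was already loaded with "\r\n" endings.
    static String normalizeLineEndings(String text) {
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }
}
```

The string returned by either method could then be handed to the schema parser in place of the one assembled with 'line.separator'.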
Apr 25, 2017 2:01:43 PM parquet.Log info
INFO: Converting nation.csv to nation.parquet
Message m {
OPTIONAL int32 nation_key;
OPTIONAL binary name;
OPTIONAL int32 region_key;
OPTIONAL binary comment_col;
}
Exception in thread "main" java.lang.IllegalArgumentException: expected one of [REQUIRED, OPTIONAL, REPEATED] got
at line 0: Message m {
Caused by: java.lang.IllegalArgumentException: No enum constant parquet.schema.Type.Repetition.
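The empty value after "got" in the exception is consistent with a stray '\r' being treated as a token. The sketch below reproduces that effect with java.util.StringTokenizer. The delimiter set is an assumption chosen to include '\n' but not '\r'; it is not copied from the Parquet sources, and the class name CarriageReturnTokenDemo is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo, not Parquet code: shows how a tokenizer that treats
// '\n' but not '\r' as a delimiter yields "\r" as a token of its own.
final class CarriageReturnTokenDemo {

    // Assumed delimiter set: contains '\n' but not '\r'.
    static final String DELIMITERS = " ;{}()\n\t";

    static List<String> tokens(String schema) {
        List<String> out = new ArrayList<>();
        // 'true' makes the tokenizer return the delimiters as tokens too.
        StringTokenizer st = new StringTokenizer(schema, DELIMITERS, true);
        while (st.hasMoreTokens()) {
            String t = st.nextToken();
            // Skip whitespace delimiters; '\r' is not among them, so it is kept.
            if (!t.equals(" ") && !t.equals("\t") && !t.equals("\n")) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String windows = "message m {\r\n  optional int32 nation_key;\r\n}\r\n";
        String unix = "message m {\n  optional int32 nation_key;\n}\n";
        // Token after "{" is "\r" for Windows endings, "optional" for Unix endings.
        System.out.println("\r".equals(tokens(windows).get(3)));
        System.out.println("optional".equals(tokens(unix).get(3)));
    }
}
```

With Windows line endings the token that follows "{" is "\r", which is where a parser would expect REQUIRED, OPTIONAL, or REPEATED; with Unix line endings it is the repetition keyword itself.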