Download OmnitureDataFileInputFormat Free Java Code
Description
An Apache Hadoop input format for Omniture daily data files (hit_data.tsv). It works identically to TextInputFormat, except that it uses an EscapedLineReader, which gets around Omniture's pesky escaped tabs and newlines.
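To illustrate why a plain line reader is not enough for these files, here is a minimal, self-contained sketch of escape-aware record reading. It is not the project's EscapedLineReader. The class name EscapedLineSketch and its methods are hypothetical, and it assumes that a backslash placed before a tab or newline marks that character as data rather than as a delimiter.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the EscapedLineReader shipped in the zip.
class EscapedLineSketch {

    // Reads one logical record. An unescaped '\n' ends the record; a '\n'
    // preceded by a backslash is kept as data. Backslashes are preserved so
    // that splitFields can still tell escaped tabs from field separators.
    // Returns null once the input is exhausted.
    static String readRecord(Reader in) throws IOException {
        StringBuilder record = new StringBuilder();
        boolean escaped = false;
        boolean sawAny = false;
        int c;
        while ((c = in.read()) != -1) {
            sawAny = true;
            if (escaped) {
                record.append((char) c);
                escaped = false;
            } else if (c == '\\') {
                record.append('\\');
                escaped = true;
            } else if (c == '\n') {
                return record.toString();
            } else {
                record.append((char) c);
            }
        }
        return sawAny ? record.toString() : null;
    }

    // Splits a record on unescaped tabs and drops the escaping backslashes.
    static List<String> splitFields(String record) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean escaped = false;
        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (escaped) {
                field.append(c);
                escaped = false;
            } else if (c == '\\') {
                escaped = true;
            } else if (c == '\t') {
                fields.add(field.toString());
                field.setLength(0);
            } else {
                field.append(c);
            }
        }
        fields.add(field.toString());
        return fields;
    }

    // Convenience wrapper: parses a whole string into records of fields.
    static List<List<String>> parseAll(String text) {
        List<List<String>> records = new ArrayList<>();
        try (Reader in = new StringReader(text)) {
            String record;
            while ((record = readRecord(in)) != null) {
                records.add(splitFields(record));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        // The first record contains an escaped tab and an escaped newline.
        String sample = "a\\\tb\tc\\\nd\ne\tf\n";
        for (List<String> fields : parseAll(sample)) {
            System.out.println(fields.size() + " fields");
        }
    }
}
```

A reader that splits on every newline, as TextInputFormat's does, would cut the first sample record in two at the escaped newline. That is the failure this kind of reader is meant to avoid.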
Source Files
The download file OmnitureDataFileInputFormat-master.zip has the following entries.
.classpath
.gitignore
README.markdown
pom.xml
src/com/tgam/hadoop/mapred/OmnitureDataFileInputFormat.java
src/com/tgam/hadoop/mapred/OmnitureDataFileRecordReader.java
src/com/tgam/hadoop/mapreduce/OmnitureDataFileInputFormat.java
src/com/tgam/hadoop/mapreduce/OmnitureDataFileRecordReader.java
src/com/tgam/hadoop/util/EscapedLineReader.java
test/com/tgam/hadoop/test/TestEscapedLineReader.java
test/com/tgam/hadoop/test/TestPageNameCount.java
test/hit_data.tsv
Download
Click the following link to download OmnitureDataFileInputFormat-master.zip.
OmnitureDataFileInputFormat-master.zip