List of usage examples for org.apache.hadoop.conf.Configured subclasses
From source file com.github.karahiyo.hadoop.mapreduce.examples.DBCountPageView.java
/**
* This is a demonstrative program, which uses DBInputFormat for reading
* the input data from a database, and DBOutputFormat for writing the data
* to the database.
* <br>
* The Program first creates the necessary tables, populates the input table
From source file com.github.karahiyo.hadoop.mapreduce.examples.Grep.java
public class Grep extends Configured implements Tool {

    private Grep() { } // singleton

    public int run(String[] args) throws Exception {
        if (args.length < 3) {
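Every example in this list follows the same pattern: extend Configured to inherit configuration plumbing, implement Tool to get a `run(String[])` entry point, and let a ToolRunner-style `main` delegate to it. The sketch below shows that shape in isolation; `Tool` and `Configured` here are minimal stand-ins (the real types live in `org.apache.hadoop.util` and `org.apache.hadoop.conf`), and `GrepSketch` is a hypothetical class name, so the pattern can be compiled without a Hadoop classpath.

```java
// Stand-in for org.apache.hadoop.util.Tool.
interface Tool {
    int run(String[] args) throws Exception;
}

// Stand-in for org.apache.hadoop.conf.Configured; the real class holds
// an org.apache.hadoop.conf.Configuration rather than an Object.
class Configured {
    private Object conf;
    public void setConf(Object conf) { this.conf = conf; }
    public Object getConf() { return conf; }
}

// Hypothetical sketch of the structure shared by Grep, Sort, Join, etc.
class GrepSketch extends Configured implements Tool {

    public int run(String[] args) throws Exception {
        if (args.length < 3) {
            System.err.println("Usage: grep <inDir> <outDir> <regex>");
            return -1; // non-zero status signals a usage error
        }
        // ... job setup and submission would go here ...
        return 0;
    }

    public static void main(String[] args) throws Exception {
        // Real Hadoop code would call
        // ToolRunner.run(new Configuration(), new GrepSketch(), args);
        // which wires the Configuration in via setConf() before run().
        System.exit(new GrepSketch().run(args));
    }
}
```

The real Grep additionally makes its constructor private (a singleton), which is why `ToolRunner` is handed an instance rather than a class.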
From source file com.github.karahiyo.hadoop.mapreduce.examples.Join.java
/**
* This is the trivial map/reduce program that does absolutely nothing
* other than use the framework to fragment and sort the input values.
*
* To run: bin/hadoop jar build/hadoop-examples.jar join
* [-m <i>maps</i>] [-r <i>reduces</i>]
From source file com.github.karahiyo.hadoop.mapreduce.examples.MultiFileWordCount.java
/**
 * MultiFileWordCount is an example to demonstrate the usage of
 * MultiFileInputFormat. This example counts the occurrences of
 * words in the text files under the given input directory.
 */
public class MultiFileWordCount extends Configured implements Tool {
From source file com.github.karahiyo.hadoop.mapreduce.examples.PiEstimator.java
/**
* A Map-reduce program to estimate the value of Pi
 * using a quasi-Monte Carlo method.
*
* Mapper:
* Generate points in a unit square
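The estimate itself is simple: sample low-discrepancy points in the unit square, count the fraction that falls inside the inscribed quarter circle, and multiply by 4. Below is a self-contained sketch of that computation outside the MapReduce framework, using a 2-D Halton sequence with the conventional bases 2 and 3; `PiSketch` and its method names are illustrative, not taken from the Hadoop source.

```java
class PiSketch {
    // Radical-inverse (van der Corput) value of index i in the given base:
    // reflect the base-b digits of i about the radix point.
    static double radicalInverse(long i, int base) {
        double inv = 0.0, f = 1.0 / base;
        while (i > 0) {
            inv += f * (i % base);
            i /= base;
            f /= base;
        }
        return inv;
    }

    // Quasi-Monte Carlo pi estimate: Halton points (bases 2 and 3) in the
    // unit square; the quarter circle x^2 + y^2 <= 1 has area pi/4.
    static double estimatePi(long numPoints) {
        long inside = 0;
        for (long i = 1; i <= numPoints; i++) {
            double x = radicalInverse(i, 2);
            double y = radicalInverse(i, 3);
            if (x * x + y * y <= 1.0) inside++;
        }
        return 4.0 * inside / numPoints;
    }

    public static void main(String[] args) {
        System.out.println(estimatePi(100_000));
    }
}
```

In the MapReduce version, each mapper counts inside/outside hits for its slice of the point indices and the reducer sums the counts, so the final division is the only serial step.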
From source file com.github.karahiyo.hadoop.mapreduce.examples.RandomTextWriter.java
/**
* This program uses map/reduce to just run a distributed job where there is
* no interaction between the tasks and each task writes a large unsorted
* random sequence of words.
* In order for this program to generate data for terasort with a 5-10 words
* per key and 20-100 words per value, have the following config:
From source file com.github.karahiyo.hadoop.mapreduce.examples.RandomWriter.java
/**
* This program uses map/reduce to just run a distributed job where there is
 * no interaction between the tasks and each task writes a large unsorted
* random binary sequence file of BytesWritable.
* In order for this program to generate data for terasort with 10-byte keys
* and 90-byte values, have the following config:
From source file com.github.karahiyo.hadoop.mapreduce.examples.SleepJob.java
/**
 * Dummy class for testing the MR framework. Sleeps for a defined period
 * of time in mapper and reducer. Generates fake input for map/reduce
 * jobs. Note that the generated number of input pairs is on the order
* of <code>numMappers * mapSleepTime / 100</code>, so the job uses
* some disk space.
From source file com.github.karahiyo.hadoop.mapreduce.examples.Sort.java
/**
* This is the trivial map/reduce program that does absolutely nothing
* other than use the framework to fragment and sort the input values.
*
* To run: bin/hadoop jar build/hadoop-examples.jar sort
* [-m <i>maps</i>] [-r <i>reduces</i>]