1. Should map() and reduce() return key/value pairs of same type? (stackoverflow.com): When writing a MapReduce job (specifically Hadoop if relevant), one must define a ...
2. Hadoop one Map and multiple Reduce (stackoverflow.com): We have a large dataset to analyze with multiple reduce functions. All the reduce algorithms work on the same dataset generated by the same map function. Reading the large dataset costs too much ...
3. Hadoop Reduce Error (stackoverflow.com): I keep getting Exceeded MAX_FAILED_UNIQUE_FETCHES; on the reduce phase even though I tried all the solutions I could find online. Please help me, I have a project presentation in three ...
4. What's the easiest way to explain What is Hadoop and Map/Reduce? (stackoverflow.com): It's very easy to explain NoSQL from a high-level view: it is basically "key-value" storage. Of course there are a thousand minor and important details, but in general it's just key value ...
5. Hadoop: How to find out the partition_Id in reduce step using Context object (stackoverflow.com): In Hadoop API ver. 0.20 and above, the Context object was introduced instead of JobConf. I need to find out, using the Context object, 1) the partition_id for the current Reducer and 2) the output folder. Using ...
6. Hadoop Spill failure (stackoverflow.com): I am currently working on a project using Hadoop 0.21.0, 985326 and a cluster of 6 worker nodes and a head node. Submitting a regular mapreduce job fails, but I have no idea ...
7. Using Hadoop for the First Time, MapReduce Job does not run Reduce Phase (stackoverflow.com): I wrote a simple map reduce job that would read in data from the DFS and run a simple algorithm on it. When trying to debug it I decided to simply ...
8. Hadoop counters: how to access the Reporter object outside map() and reduce() (stackoverflow.com): To use counters I need to have access to the Reporter object. The Reporter object is passed as a parameter to map() and reduce(), hence I can do: reporter.incrCounter(NUM_RECORDS, 1); But I need ...
9. What is the maximum number of records that a hadoop reducer's reduce() call can take? (stackoverflow.com): I have a mapper whose output is mapped to multiple different reducer instances by using my own Partitioner. My partitioner makes sure that a given is sent always to a given ...
10. Hadoop Map Reduce Program (stackoverflow.com): When I was trying the Map Reduce programming example from the Hadoop in Action book, based on the Hadoop 0.20 API, I got the error java.io.IOException: Type mismatch in value from map: expected ...
11. Write arbitrary map and reduce function (stackoverflow.com): I want to write my own map and reduce functions in the MapReduce framework. How can I do that? (My programming language is Java.) Thanks.
12. Separating Hadoop Map and Reduce tasks (stackoverflow.com): In a 3-node Hadoop cluster, I would like the master to be one node, with map tasks taking place on one node and reduce tasks on one node. Map and reduce ...
13. merge output files after reduce phase (stackoverflow.com): In MapReduce, each reduce task writes its output to a file named part-nnnnn, where nnnnn is the partition ID associated with the reduce task. Does MapReduce merge these files? If yes, how? ...
14. Compare and join two datasets using CompositeInputFormat in hadoop map/reduce (stackoverflow.com): I have a question regarding joins in Map/Reduce. If I want to do an inner join in Hadoop Map/Reduce, how would I do it? I have heard of CompositeInputFormat but haven't found much ...
15. HADOOP: emitting a Matrix from a mapper (stackoverflow.com): Hi everyone, I am new to Hadoop MapReduce. I wanted to know whether there is some OutputFormat type which can allow me to emit a matrix (2D array) directly from the mapper ...
16. Hadoop, Map reduce chaining (stackoverflow.com): I have to implement the following: map-->Reduce1-->Reduce2, meaning that Reduce2 is a separate operation on the output of Reduce1. I want to get the values emitted by Reduce1 and ...
17. How to save only non empty reducers' output in HDFS (stackoverflow.com): In my application the reducer saves all the part files in HDFS, but I want the reducer to write only the part files whose sizes are not 0 bytes. Please let me know ...
18. Hadoop: Set slave as explicit reducer? (stackoverflow.com): We use a Hadoop multi-node setup on Debian + Ubuntu with the latest stable Hadoop release. Is it possible to set a specific slave to be the reducer? I just use ...
19. Implementing third phase called merge after Reduce phase (stackoverflow.com): I need to add a third phase, merge, which combines the outputs of separate, parallel Reduce tasks. This makes it possible to do things like joins and build cartesian products. Can ...
20. A hadoop job complete without map and reduce on a Hadoop Cluster (one namenode, 12 datanode) (stackoverflow.com): I wrote a Hadoop program and ran it on a single machine; it worked well. But it encountered the problems below (the job didn't start and finished immediately after map start) when I migrated it ...
21. From "reduce input records" to "reduce input groups" (stackoverflow.com): After running a MapRed job, we get a summary of the job, for example:
I know this is caused by combining repeated keys. My question ...
22. Hadoop reduce task gets hung (stackoverflow.com): I set up a Hadoop cluster with 4 nodes. When running a map-reduce task, the map task finishes quickly, while the reduce task hangs at 27%. I checked the log, ...
23. Setting the number of map tasks and reduce tasks (stackoverflow.com): I am currently running a job. I fixed the number of map tasks to 20 but am getting a higher number. I also set the number of reduce tasks to zero but I ...
24. Why all the reduce tasks are ending up in a single machine? (stackoverflow.com): I wrote a relatively simple map-reduce program on the Hadoop platform (Cloudera distribution). Each Map & Reduce writes some diagnostic information to standard output besides the regular map-reduce tasks. However when I'm ...
25. How to deal with unbalanced input of reduce task? (stackoverflow.com): Recently I was asked how to deal with unbalanced input of a reduce task. I thought for a while and tried to redistribute the data, but didn't come up with a good solution. ...
26. Hadoop: Why might a furiously writing reduce task be timed out? (stackoverflow.com): I have a Hadoop reduce task that reads its input records in batches and does a lot of processing and writes a lot of output for each input batch. I ...
27. How to pass objects from Client to Map and Reduce? (stackoverflow.com): Should the class extend the ObjectWritable class? Then how can I pass it from the client to the Map and Reduce? Thanks.
28. In Map/Reduce, could only reduce be restarted? (stackoverflow.com): Is it possible to restart only the reduce phase of a map/reduce job? My guess is 'No', but I just want to see if someone has other thoughts about it. Thanks.
29. Is there a way to access number of successful map tasks from a reduce task in an MR job? (stackoverflow.com): In my Hadoop reducers, I need to know how many successful map tasks were executed in the current job. I've come up with the following, which as far as I ...