1. How do you use MapReduce/Hadoop? stackoverflow.com I'm looking for some general information about how other people are using Hadoop or other MapReduce-like technologies. In general, I am curious whether you are writing MR applications ... |
2. Is there a .NET equivalent to Apache Hadoop? stackoverflow.com So, I've been looking at Hadoop with keen interest, and to be honest I'm fascinated; things don't get much cooler. My only minor issue is I'm a C# developer and ... |
3. Large data - storage and query stackoverflow.com We have a huge data set of about 300 million records, which will get updated every 3-6 months. We need to query this data (continuously, in real time) to get some information. What are the options ... |
4. Hadoop Distribution Differences stackoverflow.com Can somebody outline the various differences between the various Hadoop Distributions available: |
5. Error in Hadoop MapReduce stackoverflow.com When I run a mapreduce program using Hadoop, I get the following error. 10/01/18 10:52:48 INFO mapred.JobClient: Task Id : attempt_201001181020_0002_m_000014_0, Status : FAILED java.io.IOException: Task process exit with nonzero status of ... |
6. Run Hadoop job without using JobConf stackoverflow.com I can't find a single example of submitting a Hadoop job that does not use the deprecated JobConf class. JobClient, which hasn't been deprecated, still only supports methods that take ... |
7. Hadoop searching words from one file in another file stackoverflow.com I want to build a Hadoop application which can read words from one file and search in another file. If the word exists, it has to write to one output file. If ... |
8. New to Hadoop and dumbo, how to correctly sequence these operations? stackoverflow.com Consider the following log file format: |
9. Is there anything like Hadoop in C++? stackoverflow.com What is the closest thing like Hadoop, but in C++? In particular, I want to do distributed computing using MapReduce. Thanks! |
10. Finding matching lines with Hadoop/MapReduce stackoverflow.com I am playing around with Hadoop and have set up a two-node cluster on Ubuntu. The WordCount example runs just fine. Now I'd like to write my own MapReduce program to ... |
11. Computational Linguistics project idea using Hadoop MapReduce stackoverflow.com I need to do a project for a Computational Linguistics course. Is there any interesting "linguistic" problem which is data-intensive enough to work on using Hadoop MapReduce? Solution or algorithm ... |
12. Project Idea with Hadoop MapReduce stackoverflow.com I learnt Hadoop a few months back and managed to do a very introductory programming project on it. I want to do a small to medium-sized project or series of ... |
13. Hadoop 0.2: How to read outputs from TextOutputFormat? stackoverflow.com My reducer class produces outputs with TextOutputFormat (the default OutputFormat given by Job). I'd like to consume these outputs after the MapReduce job completes, to aggregate the outputs. In addition to ... |
14. Hadoop: Iterative MapReduce Performance stackoverflow.com Is it correct to say that the parallel computation with iterative MapReduce can be justified mainly when the training data size is too large for the non-parallel computation for the same ... |
15. Where do I start with distributed computing? stackoverflow.com I'm interested in learning techniques for distributed computing. As a Java developer, I'm probably willing to start with Hadoop. Could you please recommend some books/tutorials/articles to begin with? |
16. Hadoop: Code shipped from master to slave stackoverflow.com I launched a Hadoop cluster and submitted a job to the master. The jar file is only contained in the master. Does Hadoop ship the jar to all the slave machines ... |
17. How does Hadoop perform input splits? stackoverflow.com This is a conceptual question involving Hadoop/HDFS. Let's say you have a file containing 1 billion lines. And for the sake of simplicity, let's consider that each line is of the ... |
18. Debugging hadoop applications stackoverflow.com I tried printing out values using System.out.println(), but they won't appear on the console. How do I print out the values in a map/reduce application for debugging purposes using Hadoop? Thanks, Deepak. |
19. Hadoop/MapReduce: Reading and writing classes generated from DDL stackoverflow.com Can someone walk me through the basic work-flow of reading and writing data with classes generated from DDL? I have defined some struct-like records using DDL. For example: |
20. Global variables in hadoop stackoverflow.com My program follows an iterative map/reduce approach, and it needs to stop if certain conditions are met. Is there any way I can set a global variable that can be distributed across ... |
21. javax.security.auth.login.LoginException: Login failed stackoverflow.com I'm trying to run a hadoop job (version 18.3) on my windows machine but I get the following error: |
22. Is there a way to configure timeout for speculative execution in Hadoop? stackoverflow.com I have a Hadoop job with tasks that are expected to run for a significant length of time (a few minutes). However, Hadoop starts speculative execution too soon. I do not want to turn ... |
23. Map Reduce: ChainMapper and ChainReducer stackoverflow.com I need to split my Map Reduce jar file in two jobs in order to get two different output files, one from each reducer of the two jobs. I mean that the ... |
24. Tools for optimizing scalability of a Hadoop application? stackoverflow.com I'm working with a team of mine on a small application that takes a lot of input (logfiles of a day) and produces useful output after several (now 4, in the ... |
25. Where does hadoop mapreduce framework send my System.out.print() statements? (stdout) stackoverflow.com I want to debug a mapreduce script, and without going to much trouble tried to put some print statements in my program. But I can't seem to find them in any ... |
26. Hadoop in windows: file not found exception stackoverflow.com I'm using Hadoop on Windows and I've configured everything correctly (installing cygwin, passwordless ssh, etc.). I've compiled the wordcount program in WC.jar and tried to run it. It's running perfectly in standalone ... |
27. What is the computational complexity of the MapReduce overhead stackoverflow.com Given that the complexity of the map and reduce tasks are |
28. mapreduce distance calculation in hadoop stackoverflow.com Is there a distance calculation implementation using Hadoop map/reduce? I am trying to calculate a distance between a given set of points. Looking for any resources. [edited] This is a very intelligent ... |
29. custom word count using hadoop stackoverflow.com I'm a beginner in Hadoop. I've understood the WordCount program. Now I have a problem. I don't want the output of all the words.. |
30. Running a standalone Hadoop application on multiple CPU cores stackoverflow.com My team built a Java application using the Hadoop libraries to transform a bunch of input files into useful output. Given the current load a single multicore server will do fine for ... |
31. Why does DistributedCache mangle my file names stackoverflow.com I have a weird problem: DistributedCache appears to change the names of my files. It uses the original name as the parent folder and adds the file as a child, i.e. ... |
32. manipulating iterator in mapreduce stackoverflow.com I was trying to find the sum of any given points using Hadoop, but my problem is on getting all values from a given key in a single reducer. It is ... |
33. MultipleOutputFormat in hadoop stackoverflow.com I'm a newbie in Hadoop. I'm trying out the Wordcount program. Now to try out multiple output files, I use |
34. Getting started with MapReduce/Hadoop stackoverflow.com Lately, I have been reading a lot about MapReduce/Hadoop and think this is where the industry is currently moving. I want to start learning MapReduce/Hadoop and I thought the best way ... |
35. Hadoop Data Persistence in which format? stackoverflow.com |
36. Efficient set operations in mapreduce stackoverflow.com I have inherited a mapreduce codebase which mainly calculates the number of unique user IDs seen over time for different ads. To me it doesn't look like it is being done ... |
37. Is "Adopting MapReduce model" = Universal answer to scalability? stackoverflow.com I have been trying to understand the MapReduce concept and apply it to my current situation. What is my situation? Well, I have an ETL tool here, in which data transformation ... |
38. Hadoop map/reduce chaining stackoverflow.com I want to chain 2 Map/Reduce jobs. I am trying to use JobControl to achieve the same. My problem is - JobControl needs org.apache.hadoop.mapred.jobcontrol.Job which in turn needs org.apache.hadoop.mapred.JobConf which is deprecated. ... |
39. Encoding image into Jpeg2000 using Distributed Computing like Hadoop stackoverflow.com Just wondering if anybody has done, or is aware of, encoding/compressing a large image into JPEG2000 format using Hadoop? There is also this http://code.google.com/p/matsu-project/ which uses map reduce to process the image. Image size ... |
40. Expected consumption of open file descriptors in Hadoop 0.21.0 stackoverflow.com Given Hadoop 0.21.0, what assumptions does the framework make regarding the number of open file descriptors relative to each individual map and reduce operation? Specifically, what suboperations cause Hadoop ... |
41. Using Hadoop to "bucket" data out with a single run stackoverflow.com Is it possible to use one Hadoop job run to output data to different directories based on keys? My use case is server access logs. Say I have them all together, ... |
42. Hadoop MapReduce InputFormat Deprecated? stackoverflow.com I need to implement a custom (service) input source for a Hadoop MapReduce app. I google'd and SO'd and found that one way to proceed is to implement a custom InputFormat. ... |
43. Implementation of an ArrayWritable for a custom Hadoop type stackoverflow.com How do I define an ArrayWritable for a custom Hadoop type? I am trying to implement an inverted index in Hadoop, with custom Hadoop types to store the data I have ... |
44. Hidden features of Hadoop MapReduce stackoverflow.com What are the hidden features of Hadoop MapReduce that every developer should be aware of? One hidden feature per answer, please. |
45. how to perform ETL in map/reduce stackoverflow.com How do we design the mapper/reducer if I have to transform a text file line-by-line into another text file? I wrote a simple map/reduce program which did a small transformation but the requirement ... |
46. How can you create a file inside a hadoop map-reduce job? stackoverflow.com I searched the web, but all I found was a site that claimed that it could be done. It didn't say how. |
47. Make use of the relation name/table name/file name in Hadoop's MapReduce stackoverflow.com Is there a way to use the relation name in MapReduce's Map and Reduce? I am trying to do Set difference using Hadoop's MapReduce. Input: 2 files R and S containing list ... |
48. Mapfile as an input to a MapReduce job stackoverflow.com I recently started to use Hadoop and I have a problem while using a Mapfile as an input to a MapReduce job. The following working code, writes a simple MapFile called "TestMap" ... |
49. How to handle a datanode that dies during map/reduce stackoverflow.com What happens when the datanode the map/reduce is using goes down? Shouldn't the job be redirected to another datanode? How should my code handle this exceptional condition? |
50. Hadoop MapReduce throughput question stackoverflow.com I am curious: what can be considered a good throughput for Hadoop lightweight text data processing per node? |
51. How to start learning Hadoop and Mapreduce? stackoverflow.com How to start learning Hadoop and Mapreduce? Is there any tutorial on hardware requirement and development requirement setting? I am planning to use C++ and Java. Many thanks. |
52. Objects from memory as input for Hadoop/MapReduce? stackoverflow.com I am working on the parallelization of an algorithm, which roughly does the following: |
53. Map reduce to compute SVD (Singular value decomposition) stackoverflow.com Is it possible to parallelize SVD computing, using for example Hadoop's MapReduce? Could you provide a simple example of it? |
54. MapReduce recommendation stackoverflow.com I've heard of Hadoop, but what else can I use to start in this topic... |
55. Hadoop Map-Reduce Code fails to pick driver files libcuddpp.so stackoverflow.com Greetings to all, Today I came across a strange problem about non-root users in Linux (CentOS). I am able to compile & run a Java program through the commands below properly: |
56. Distributed Profiler for hadoop / mapreduce stackoverflow.com I am looking to work on the Hadoop open source implementation and I was wondering if there is a distributed profiler for Hadoop? If there is, could someone point me to any links ... |
57. Implementing a Tree Writable class stackoverflow.com I would like to implement a TreeWritable class to represent a Tree structure. I have tried the following implementation but I'm getting a mapred.MapTask: Record too large for in-memory buffer error. How should ... |
58. How to pass args to mapreduce program stackoverflow.com I have to pass a 3rd arg to the mapreduce program. I have to read a file given by the user in the mapreduce program. |
59. Static object in map/reduce stackoverflow.com I was trying to use a static object in Hadoop. This object is used in both map and reduce. My program is: |
60. All three constructors of org.apache.hadoop.mapreduce.Job are deprecated, what is the best way to construct a Job class? stackoverflow.com All three constructors of org.apache.hadoop.mapreduce.Job are deprecated, is there a way to construct a Job class the non-deprecated way? Thanks. |
61. hadoop chain map/reduce stackoverflow.com I have chained 2 mappers followed by 1 reducer. Is it possible to write the intermediate outputs (o/p of each mapper in the chain) to HDFS? I tried setting the OutputPath ... |
62. how to import the package org.apache.hadoop.mapreduce.lib.chain in a hadoop 0.20.2 project? stackoverflow.com I'm trying to chain map and reduce phases in one job. The problem is that I'm running under hadoop 0.20.2 and the package |
63. Linear filter (FIR) in Hadoop (Hadoop in Action exercise) stackoverflow.com Exercise 4, Chapter 4 in Hadoop in Action is about implementing a linear filter computing the moving average of a time series. That is, given N and a series of timestamped ... |
64. How can I write my own Hadoop scheduler? stackoverflow.com I've been studying Hadoop's scheduler mechanism recently. Using 0.20.2 (fair & capacity included). Have read some papers, LATE/Deadline Scheduler... Has anyone tried? Or is there a guide? Thanks anyway. |
65. project idea for hadoop stackoverflow.com Hi, I'm a 3rd-year college student majoring in software engineering and have had a little experience with Hadoop. I'm looking for an idea for a small to medium-sized project with Hadoop. I want to do ... |
66. Hadoop spilled records stackoverflow.com I couldn't find any documentation on how Hadoop handles spilled records. Is there a link that can be found online? Thanks for your time. |
67. Error in using one MapReduce's output as another MapReduce's input stackoverflow.com |
68. Hadoop and MapReduce stackoverflow.com I am new to HDFS and MapReduce and trying to calculate survey statistics. Input file is in this format: Age Points Sex Category - all 4 of them are numbers. Is ... |
69. How to re-run whole map/reduce in hadoop before job completion? stackoverflow.com I am using Hadoop Map/Reduce with Java. Suppose I have completed a whole map/reduce job. Is there any way I could repeat the whole map/reduce part only, without ending the job? I mean, ... |
70. Hadoop pipes and new mapred package stackoverflow.com Is there any work going on to port Hadoop pipes from mapred to mapreduce package? Thanks, Meg |
71. How scalable is MapReduce in the original functional languages? stackoverflow.com The Map-Reduce programming model stems from the map and reduce functions which are present in functional languages like Lisp and Scheme dating back many many years. I remember from university (early 90's) ... |
72. How do I write a Hadoop map reduce job without using deprecated classes? stackoverflow.com I know it's my OCD, but I can't stand to have a deprecated reference in my code. That said, the Hadoop tutorials, including "The Definitive Guide" book, use only deprecated classes ... |
73. mapreduce count example stackoverflow.com My question is about |
74. Common examples/code of Hadoop in practice stackoverflow.com Everywhere I go to learn about Hadoop I see the |
75. running multiple map reduce in hadoop pipes stackoverflow.com I'm new to hadoop pipes. Can anyone tell me how to run two map reduce together in a single job (program) in hadoop pipes? My problem is that I want to ... |
76. Oozie running own MapReduce workflow issue stackoverflow.com Not sure if anyone has run into this issue. I am trying to use oozie for running a simple MapReduce job that searches for a string value in an HDFS location and ... |
77. Hadoop 'grep' example stackoverflow.com In the Hadoop 'grep' example (that comes with the Hadoop package), what is the group parameter? Can you give me an example of that? |
78. How do I use the MultipleTextOutputFormat using the new Hadoop API? stackoverflow.com I would like to write multiple output files. How do I do this using Job instead of JobConf? |
79. Problem starting tasktracker in hadoop under windows stackoverflow.com I am trying to use Hadoop under Windows and I am running into a problem when I want to start tasktracker. For example: Then the log writes: |
80. JoGL in Hadoop? Hadoop for graphics? stackoverflow.com After reading this and this paper, I decided I want to implement a distributed volume rendering setup for large datasets on MapReduce as my undergraduate thesis work. ... |
81. how to get files of fixed size in map-reduce job output stackoverflow.com I have a use case where I want to process data and generate output of fixed size, say 1 GB, i.e. each map-reduce job output should be 1 GB. Does anybody ... |
82. Permutations with MapReduce stackoverflow.com Is there a way to generate permutations with MapReduce? Input file: My goal: |
83. How could I programmatically get all the job tracker and tasktracker information that is displayed by Hadoop in the web interface? stackoverflow.com I'm using Cloudera's Hadoop distribution CDH-0.20.2CDH3u0. Is there any way I could get the information such as jobtracker status, tasktracker status, counters using a Java program running outside of the Hadoop framework? I tried ... |
84. MapReduce Job not showing my print statements on the terminal stackoverflow.com I am currently trying to figure out what happens when you run a MapReduce job by placing some System.out.println() calls at certain places in the code, but none of those print statement ... |
85. Doubling each number a number of times as specified by the user stackoverflow.com I am new to Hadoop and I am learning by using a few examples. I am currently trying to pass a file with random integers in it. For each and every number ... |
86. Converting this code into Hadoop stackoverflow.com I want to convert the code below to run in Hadoop. Basically what I want to achieve is to run a mapper a number of times. Assuming the array is my ... |
87. Recursive calculations using Mapreduce stackoverflow.com I am working on a map reduce program and was thinking about designing computations of the form where a1, b1 are the values associated with a key. So at ... |
88. A good example in hadoop that needs iteration stackoverflow.com I am currently implementing a parallel-for on Hadoop to iterate the mapper a number of times as specified by the user. Can someone help me with a useful example that I ... |
89. Reading large input files (10 GB) through java program stackoverflow.com I am working with 2 large input files of the order of 5 GB each. It is the output of Hadoop map reduce, but as I am not able to do dependency ... |
90. Maximum file size that can be processed using Hadoop in 'pseudo distributed' mode stackoverflow.com I am processing a file with 7+ million lines (~59 MB) on an Ubuntu 11.04 machine with this configuration: Intel(R) Core(TM)2 Duo CPU E8135 @ 2.66GHz, 2280 MHz Memory: ... |
91. How do I make an external reference table or database available to a Hadoop MapReduce job? stackoverflow.com I am analyzing a large amount of files in a Hadoop MapReduce job, with the input files being in .txt format. Both my mapper and my reducer are written in Python. However, ... |
92. how to set the mapreduce location in hadoop? stackoverflow.com I'm new to Apache Hadoop. I installed the prerequisite software and configured everything, and the Eclipse plugins are also done, but when I click the new Hadoop location it's not ... |
93. Parallel reducing with Hadoop mapreduce stackoverflow.com I'm using Hadoop's MapReduce. I have a file as an input to the map function, the map function does something (not relevant for the question). I'd like my ... |
94. How efficient are open source computation platforms like Hadoop etc.? stackoverflow.com How efficient are open source distributed computation frameworks like Hadoop? By efficiency, I mean CPU cycles that can be used for the "actual job" in tasks that are mostly pure computation. In ... |
95. Hadoop mapreduce : Driver for chaining mappers within a MapReduce job stackoverflow.com I have mapreduce job: my code Mapp class: public static class MapClass extends Mapper { |
96. Accessing .dat file from within a Jar file stackoverflow.com I am trying to access a data file from a public class, both of which are located within a JAR file. However, when I execute the jar on a Hadoop cluster, ... |
97. Problem with -libjars in hadoop stackoverflow.com I am trying to run a MapReduce job on Hadoop but I am facing an error and I am not sure what is going wrong. I have to pass library jars which ... |
98. How to create and read directories in Hadoop - Mapreduce Job working directory stackoverflow.com I want to create a directory inside the working directory of a MapReduce job in Hadoop. For example by using: ... |
99. Is Hadoop going to give me more benefits in my case? stackoverflow.com I'm using Clojure to pull ten XML files hourly, each file is about 10 MB. This script is running on a server machine. |
100. Architecture and Design Document for the next generation MapReduce stackoverflow.com I would like to know the details (architecture and design documents) about the next generation Apache MapReduce. Where are the sources to get more information about it? |