1. Hadoop job fails with ClassCastException stackoverflow.com I am running a job which is failing with ClassCastException at Mapper. I have tried setting the Mappers and JobConf correctly but I continue to get the error. Here is my code: [1] ... |
2. Is there a good online tutorial for Hadoop development on a Windows 7 machine? stackoverflow.com I've been following the awesome Yahoo! Hadoop tutorial, which worked great for getting a virtual machine environment set up (Module 3 of the tutorial). But now I'm getting ... |
3. Hadoop Code - Git and SVN stackoverflow.com All the Apache Hadoop code is hosted in SVN. How does Git help in the Hadoop development process? It's not clear from the below article. http://wiki.apache.org/hadoop/GitAndHadoop |
4. How to dynamic change existing files' block size in Hadoop? stackoverflow.com I have a Hadoop cluster running. I use the Hadoop API to create files in Hadoop. For example using: create(Path f, boolean overwrite, int bufferSize, short replication, long blockSize, Progressable progress). I ... |
5. How to exclude duplicate records from a large data feed? stackoverflow.com I have started working with a large dataset that is arriving in JSON format. Unfortunately, the service providing the data feed delivers a non-trivial number of duplicate records. On ... |
6. Sequential Files in Hadoop stackoverflow.com How do I read/parse a Sequential File written by a previous MapReduce job? The keyOut and valueOut of the previous MR job were Text and ByteWritable. What should be the keyin and valuein ... |
7. Initialize public static variable in Hadoop through arguments stackoverflow.com I have a problem with changing public static variables in Hadoop. I am trying to pass some values as arguments to the jar file from the command line. Here is my code: |
8. Why do we need to set the output key/value class explicitly in the Hadoop program? stackoverflow.com In the "Hadoop : The Definitive Guide" book, there is a sample program with the below code. |
9. Hadoop job asks to disable safe node stackoverflow.com Hadoop job is asking to disable safe mode manually. It says the resources are not available. How to disable safe mode? |
10. How to control file assignation in different slave in hadoop distributed system? stackoverflow.com |
11. Is there a way to "set" Hadoop Counter instead of incrementing it? stackoverflow.com The API only provides methods to increase a counter in Mapper or Reducer. Is there a way to just set it? Or increment its value only once irrespective of the number of ... |
12. Change default configuration on Hadoop slave nodes? stackoverflow.com Currently I am trying to pass some values through command line arguments and then parse them using GenericOptionsParser with tool implemented. From the master node I run something like this: |
13. java.lang.NoClassDefFoundError when reading hadoop SequenceFile stackoverflow.com I am trying to read a |
14. Running JNI code calling cuda code on hadoop stackoverflow.com I'm trying to use a native method to call CUDA code on Hadoop. It loads the .so file effectively. But then in the main function, when I call the CUDA code, the following error occurs. |
15. How can I inspect a Hadoop SequenceFile for which I lack full schema information? stackoverflow.com I have a compressed Hadoop SequenceFile from a customer which I'd like to inspect. I do not have full schema information at this time (which I'm working on separately). But in the ... |
16. Convert DataInput to DataInputStream? stackoverflow.com How can I convert DataInput to DataInputStream in Java? I need to know the size of the DataInput. |
17. Controlling number of lines to be written to the output file stackoverflow.com I am new to Hadoop programming. I have a situation in which I want to stop writing |
18. Hadoop MAC OS installation woes stackoverflow.com So I'm trying to install Hadoop on Mac OS X Leopard following the steps in this note: Running Hadoop on a OS X Single Node Cluster. I reached Step 4: ... |
19. Brisk for small files stackoverflow.com I am a newbie to Cassandra and Hadoop. While looking into integrating the two products I came across Brisk. From the description I understand that Brisk replaces HDFS with CassandraFS. ... |
20. Fastest access of a file using Hadoop stackoverflow.com I need the fastest access to a single file, several copies of which are stored in many systems using Hadoop. I also need to find the ping time for each file in ... |
21. Building hadoop using ant stackoverflow.com I tried to build hadoop-mapreduce-project using Ant. I tried with Maven and it succeeded, but I need to build it with Ant. Or is there any alternative to "ant compile-mapred-test" in the Maven build? ... |
22. Does java api for hadoop writing require SSH? stackoverflow.com Hi guys: I'm trying to set up writes to a remote, single-node Hadoop instance (remote in that it's running on my box in a VM).... However I'm getting ... |
23. How to config Solr with hadoop? stackoverflow.com How can I configure Solr with Hadoop? Do I only need to put the data folder inside Hadoop? |
24. How to uninstall Hadoop? stackoverflow.com I am using Mac OS X and want to uninstall/re-install (clean) Hadoop. Please let me know how I can do that. Thank you. |
25. hadoop split file in equally size stackoverflow.com I'm trying to learn dividing a file stored in HDFS into splits and reading it to different processes (on different machines). What I expect is if I have a |
26. How to use Hadoop API copyMerge function? What is the addString parameter? stackoverflow.com Does anyone know or has anyone used the copyMerge function in Hadoop API - FileUtil? In the function, what is the ... |
27. Variants of Hadoop stackoverflow.com A project of mine is to compare different variants of Hadoop. It is said that there are many of them out there, but googling didn't work well for me :( Does anyone ... |
28. How to make Hadoop use all the cores on my system? stackoverflow.com I have a 32 core system. When I run a MapReduce job using Hadoop I never see the java process use more than 150% CPU (according to top) and it usually ... |
29. How to overwrite/reuse the exisitng output path for hadoop Job's again and agin overwrite stackoverflow.com I want to overwrite/reuse the existing output directory when I run my Hadoop job daily. The output directory will store the summarized output of each day's job run. If I specify ... |
30. Hadoop outputCollector stackoverflow.com I have a MapReduce program that is working fine; the following are the signatures of the map and reduce functions. The OutputCollector presently is
I need to ... |
31. Hadoop & Bash: delete filenames matching range stackoverflow.com Say you have a list of files in HDFS with a common prefix and an incrementing suffix. For example,
I only want to leave a few files in ... |
32. Using GCJ to compile Hadoop RandomWriter stackoverflow.com I'm trying to compile a gcj version of Hadoop's RandomWriter. It successfully compiles, but when I try to run the resulting executable I get the following output: |
33. SortByTemperatureUsingHashPartitioner NullPointerException stackoverflow.com Has anybody successfully run the SortByTemperatureUsingHashPartitioner from the "Hadoop: The Definitive Guide" book? Mine crashed. Does anyone know why? |
34. Struggling with scripting stackoverflow.com I don't have much experience writing shell scripts, but I have to write a script to run a Java program on a cloud using Hadoop. I have 2 scripts called ... |
35. Kerberos with Hadoop, error: avax.security.sasl.SaslException: GSS initiate failed stackoverflow.com I configured Kerberos to work with Hadoop. Since I use Cloudera CDH3, I configured it according to Cloudera's guidelines. (Kerberos version is 1.8.4) All nodes can start up normally, but ... |
36. Apache Hadoop - Excluding files when corrupt stackoverflow.com I process several server logfiles (around 40) and collect a bunch of metrics using Apache Hadoop. If one or more of those files are inconsistent or corrupted, I would like to ... |
37. hadoop master node slave node datanode stackoverflow.com I am Riyas and new to Hadoop. If a master node goes down, what happens to the cluster? Can any slave node act as a master? Does it need any additional ... |
38. What is an RPC port and how is it relevant to connecting to Hadoop? stackoverflow.com I'm not much of a networking type. I'm trying to understand how to debug a Hadoop connection - and the connection relies on an RPC port. Any insights into ... |
39. Hadoop Pipes cannot find shared libraries stackoverflow.com I am getting this error while running a Hadoop Pipes program. The program compiles successfully but fails on Hadoop Pipes. |
40. not able to communicate with the client using ssh stackoverflow.com I am trying to set up a Hadoop cluster but I am unable to access the slave machine using ssh, though I am able to ssh to the localhost. I have tried the ... |
41. Hadoop Hello World Example And Introduction stackoverflow.com I've been hearing a lot about Apache Hadoop as an awesome way to do processing-intensive tasks. Looking for a really basic introduction to Hadoop. Like the |
42. Specifying memory limits with hadoop stackoverflow.com I am trying to run a high-memory job on a Hadoop cluster (0.20.203). I modified the mapred-site.xml to enforce some memory limits. |
43. Will hadoop support multiple threads in local mode? stackoverflow.com When running multiple threads in Hadoop in parallel, some jobs fail randomly. Also there are exceptions like ChecksumException and SaxParserException (Premature end of file). Tried many ways to fix these but couldn't ... |
44. Turn off replication only for Hadoop job output stackoverflow.com Is there a way to set the replication factor for the output of a specific MapReduce job to be different than the rest of the cluster (say 1)? I'd like my ... |
45. Error while svn checkout http://svn.apache.org/repos/asf/hadoop/common/trunk stackoverflow.com I am trying to install Hadoop on my Ubuntu box, but encounter the below error while checking out: svn[options] could not connect to server http://svn.apache.org Any idea why ... |
46. Hadoop: How to compile libhdfs.so? stackoverflow.com We are using Hadoop through the Hadoop C/C++ API (libhdfs.so). We use the latest stable Hadoop version which is 0.20.203. Unfortunately, there are no clear (and up to date) instructions to ... |
47. Deploying custom MBeans to Hadoop stackoverflow.com I'm starting development of a Hadoop application and I'd like to manage it via a couple of |
48. Localhost-only pseudo-distributed hadoop installation stackoverflow.com I am trying to make a pseudo-distributed Hadoop installation on my Gentoo machine. I want nothing to be visible from the outside network - e.g. jobtracker and namenode web interfaces - ... |
49. Hadoop read multiple lines at a time stackoverflow.com I have a file in which a set of every four lines represents a record. E.g., the first four lines represent record 1, the next four represent record 2 and so on. How can I ensure ... |
50. NLinesInputFormat Alternative in Hadoop 0.20? stackoverflow.com I am working with Hadoop 0.20, and wish to use the NLinesInputFormat, but this functionality isn't present? Is there an alternative? Here's what I'm trying to do: Records in the data span multiple lines, ... |
51. Restrict number of concurrent reducers per user stackoverflow.com Is there a way to restrict the number of concurrent reduce slots per user in Hadoop? We want to ensure no single user is using up all available reduce slots at ... |
52. How to get file size stackoverflow.com I am running a Hadoop job. I have a FileSystem object and a Path object, and I want to know the size of the file (Path). Any idea? |
53. Understanding Hadoop Simulator Mumak stackoverflow.com Recently I was trying to understand the working of Mumak (see, e.g., MAPREDUCE-728). It basically takes a job trace and topology trace and simulates Hadoop. I couldn't understand how it assigns ... |
54. Documentation Generator for Big Data Analytics stackoverflow.com I am wondering what tools people use for generating documentation for Big Data analytics. By that I mean aggregating, ranking, clustering, etc. multi-terabyte data sets using things such as Hadoop, ... |
55. Hadoop and analytics? stackoverflow.com I'm in the process of building a complete 'scale-out'able solution to provide in-depth realtime analytics to our customers. The customers mainly have up to 200 servers, each having at most 400 sessions ... |
56. Hadoop word count example fails with 'not a SequentialFile'. How set file format? stackoverflow.com I'm trying to run |
57. How to use toArray() method in ArrayWritable - Hadoop stackoverflow.com There is a
So how should we ... |
58. Neural Network training in parallel, better to use Hadoop or a gpu? stackoverflow.com I need to train a neural network with 2-4 hidden layers, not sure yet on the structure of the actual net. I was thinking of training it using Hadoop map reduce ... |
59. .Net and Hadoop - What to know / learn and what is available? stackoverflow.com Information: My question is regarding BigData in .NET. BigData is used to store and query huge amounts of data (Facebook, Google, Twitter, ...). Examples of BigData are MapReduce, Hadoop, Dryad, ... Microsoft dropped ... |
60. Why do Column oriented databases such as Vertica/InfoBright/GreenPlum make a fuss of Hadoop? stackoverflow.com What is the point in feeding a Hadoop cluster and using that cluster to feed data into a Vertica/InfoBright data warehouse? All these vendors keep saying "we can connect with Hadoop", but ... |
61. Error running Hadoop pipes Program: "Server failed to authenticate" stackoverflow.com While trying to run a C++ program, referring to this (link), on my Hadoop cluster, I got the error mentioned below. I referred to related posts (this) regarding this ... |
62. Hadoop Global Property Conf.Set / Conf.Get in Cleanup()? stackoverflow.com I am trying to use global variables in Hadoop via the Conf.set() and Context.getConfiguration().get() methods. However, these don't seem to be working inside a Cleanup method I'm using - though I am ... |
63. Configuring a slave's hostname using internal IP - Multiple NICs stackoverflow.com In my Hadoop environment, I need to configure my slave nodes so that when they communicate in the middle of a map/reduce job they use the internal IP instead of the ... |
64. Hadoop: How to unit test FileSystem stackoverflow.com I want to run a unit test but I need to have an org.apache.hadoop.fs.FileSystem instance. Are there any mocks or any other solution for creating a FileSystem? |
65. Hadoop and compression coderanch.com Hi all, I am pretty new to HDFS and was looking for some opinions on some conflicting answers I have recently gotten. 1. Is it a good idea to compress the stream to write the file out to Hadoop? One person told me they had got a 10x benefit from doing this. Another told me that it was bad to compress ... |
66. Hadoop in the cloud coderanch.com I only checked the possibility of using Hadoop on the cloud and I found some EC2 scripts which handle instance startups. I'm not sure if it is possible to increase the size of a cluster dynamically. Currently I see some static configuration files which control the number of nodes in the cluster. Since the pricing model of EC2 instances is hourly ... |
67. Hadoop Rocks coderanch.com Hi, We have been using Hadoop for the past 6 months. It has changed the way we think about programming, not to forget the immense performance improvements. A few queries for Chuck: Which Hadoop distribution would you be targeting, 0.20.2? Do you also cover unit testing for MapReduce programs? This is one area where not much information and guidelines ... |
68. Data stores used in Hadoop in Action coderanch.com |
69. Why Hadoop needs its own file system? coderanch.com Hadoop provides many interfaces to its filesystems, and it generally uses the URI scheme to pick the correct filesystem instance to communicate with. Although it is possible (and sometimes very convenient) to run MapReduce programs that access any of these filesystems, when you are processing large volumes of data, you should choose a distributed filesystem that has the data locality optimization, ... |
70. Hadoop in Mac coderanch.com It's definitely possible to install Hadoop on a Mac. In fact, almost every developer you see in a Hadoop conference is carrying a Mac :P To be more specific, Hadoop is targeted for running on Unix and has several modes of operation. In production ("fully distributed mode"), it runs on a cluster of Unix machines, which are usually cheap Linux boxes. ... |
71. Hadoop usage examples coderanch.com Hadoop is targeted for developing programs to process large data sets. It's useful whenever you have a lot of data to process or analyze. The first Hadoop application for many web companies is to analyze log data. For example, you can look at log data to see how many unique viewers you have and where they tend to come from. ... |
72. What do you think will be the main research focus for hadoop in future? coderanch.com |
73. new in hadoop coderanch.com Hi, I have only just heard about Hadoop. Reading the sample chapter from Hadoop in Action made me interested. I have some questions: 1. Is it an extensible framework? 2. Are there any other similar frameworks? If yes, how does their performance compare? 3. Can it run programs written in languages other than Java? |
74. Hadoop with Drools coderanch.com |
75. Is HADOOP complicated? coderanch.com Yes. I wrote the book because I heard the same frustrations from many people. Hadoop has a steep learning curve not because it's complicated, but because it's novel. Also, like many open source projects, a lot of the documentation is organized for reference rather than for learning. I intend my book for the general Java programmer with no background in distributed ... |
76. Hadoop Architecture coderanch.com |
77. Hadoop in enterprise coderanch.com Search engines are about retrieval. Hadoop with its MapReduce algorithm framework is about data processing. Every search engine has a data processing requirement until the data is indexed etc. Really big search engines need really big data processing frameworks. Hadoop is the one. But the category of data processing does not reduce to search index processing; there are plenty of ... |
78. Hadoop - mean time to productivity coderanch.com I've seen a number of courses in universities where students are expected to get up to speed on Hadoop in about 2-4 weeks. My memory is a bit vague on this one, but I do remember somewhere that a mid-term homework assignment was to implement PageRank over Wikipedia articles using Hadoop. I would certainly consider that a "comfortable" level. Of course, ... |
79. Usage of Hadoop coderanch.com |
80. Hadoop testing/deployment/learning on your own coderanch.com As someone who doesn't use Hadoop, at least not yet, it seems to me that to really get a feel for setting up, managing, and testing an implementation of Hadoop you need to have a multiple machine setup. You can't mimic real world use cases if you're running it on one machine. Arguably it's not even helpful to set it up ... |
81. what's Hadoop ? coderanch.com |
82. * Winners: Hadoop in Action coderanch.com |