1. Experience with Hadoop? stackoverflow.com Have any of you tried Hadoop? Can it be used without the distributed filesystem that goes with it, in a Share-nothing architecture? Would that make sense? I'm also interested in any performance ... |
2. Hadoop examples? stackoverflow.com I'm examining Hadoop as a possible tool with which to do some log analysis. I want to analyze several kinds of statistics in one run. Each line of my ... |
3. java.io.IOException: Job failed! when running a sample app on my osx with hadoop-0.19.1 stackoverflow.com bash-3.2$ echo $JAVA_HOME /System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home bash-3.2$ bin/hadoop dfs -copyFromLocal conf /user/yokkom/input2 bash-3.2$ bin/hadoop jar hadoop-*-examples.jar grep input2 output 'dfs[a-z.]+' 09/04/17 10:09:32 INFO mapred.FileInputFormat: Total input paths to process : 10 09/04/17 10:09:33 INFO mapred.JobClient: Running job: job_200904171309_0001 java.io.IOException: ... |
4. hadoop behind the scenes stackoverflow.com Can someone explain what Hadoop is in terms of the ideas behind the software? What makes it so popular and/or powerful? |
5. getting data in and out of hadoop stackoverflow.com I need a system to analyze large log files. A friend directed me to hadoop the other day and it seems perfect for my needs. My question revolves around getting ... |
6. Java Generics & Hadoop: how to get a class variable stackoverflow.com I'm a .NET programmer doing some Hadoop work in Java and I'm kind of lost here. In Hadoop I am trying to setup a Map-Reduce job where the output key of ... |
7. Dealing with Gigabytes of Data stackoverflow.com I am going to start on with a new project. I need to deal with hundred gigs of data in a .NET application. It is very early stage now to give ... |
8. Distributing Video on a LAN to alternate Locations - Can the browser detect this? stackoverflow.com I'm the administrator for a company intranet and I'd like to start producing videos. However, we have a very small bandwidth tunnel between our locations, and I'd like to avoid hogging ... |
9. Look up values in a BDB for several files in parallel stackoverflow.com What is the most efficient way to look up values in a BDB for several files in parallel? If I had a Perl script which did this for one file at ... |
10. What is Hadoop? stackoverflow.com I want to know what Hadoop is. I have gone through Google and Wikipedia, but I am not clear on what Hadoop actually is and what is the goal of ... |
11. Is Hadoop right for running my simulations? stackoverflow.com I have written a stochastic simulation in Java, which loads data from a few CSV files on disk (totaling about 100MB) and writes results to another output file (not much data, just ... |
12. Can Hadoop be restricted to spare CPU cycles? stackoverflow.com Is it possible to run Hadoop so that it only uses spare CPU cycles? I.e. would it be feasible to install Hadoop on people's work machines so that number crunching ... |
13. How to parallelize execution on remote systems stackoverflow.com What's a good method for assigning work to a set of remote machines? Consider an example where the task is very CPU and RAM intensive, but doesn't actually process a ... |
14. hadoop- determine if a file is being written to stackoverflow.com Is there a way to determine if a file in hadoop is being written to? eg- I have a process that puts logs into hdfs. I have another process ... |
15. Which Hadoop product is more appropriate for a quick query on a large data set? stackoverflow.com I am researching Hadoop to see which of its products suits our need for quick queries against large data sets (billions of records per set) The queries will be performed against chip ... |
16. Converting word docs to pdf using Hadoop stackoverflow.com Say if I want to convert 1000s of word files to pdf then would using Hadoop to approach this problem make sense? Would using Hadoop have any advantage over simply using ... |
17. Question on hadoop "java.lang.RuntimeException: java.lang.ClassNotFoundException: " stackoverflow.com Here's my source code |
18. Question about using C# to talk to Hadoop FileSystem stackoverflow.com Currently my application uses C# with MONO on Linux to communicate to local file systems (e.g. ext2, ext3). The basic operations are open a file, write/read from file and close/delete the ... |
19. Remote java program execution using ftp, very large dataset on remote machine - program to data vs data to program stackoverflow.com I am developing a java based application; its pertinent requirements are listed below |
20. Very basic question about Hadoop and compressed input files stackoverflow.com I have started to look into Hadoop. If my understanding is right, I could process a very big file and it would get split over different nodes, however if the file ... |
21. Dynamic Nodes in Hadoop stackoverflow.com Is it possible to add new nodes to Hadoop after it is started? I know that you can remove nodes (as that the master tends to keep tabs on the node ... |
22. Generating Multiple Output files with Hadoop 0.20+ stackoverflow.com I am trying to output the results of my reducer to multiple files. The data results are all contained in one file, and the rest of the results are split based ... |
23. Any tested Frameworks/Solutions similar to Apache Hadoop? stackoverflow.com I am interested in the Apache Hadoop project, but I would like to know if any other tested (please mind the 'tested') projects/frameworks are out there. Appreciate any information/links to projects similar ... |
24. Hadoop: Disadvantages of using just 2 machines? stackoverflow.com I want to do log parsing of huge amounts of data and gather analytic information. However all the data comes from external sources and I have only 2 machines to store ... |
25. Running multiple hadoop instances on same machine stackoverflow.com I wish to run a second instance of Hadoop on a machine which already has an instance of Hadoop running. After untar'ing hadoop distribution, some config files need to changed from ... |
26. What should be hadoop.tmp.dir? stackoverflow.com Hadoop has configuration parameter |
27. Matching large datasets using Hadoop? stackoverflow.com I would love to get a sense if Hadoop is the right tool for the problem I have. I'm building an offline process (once a month or once a quarter) that matches 2 ... |
28. Splitting large XML files into manageble sections for Hadoop stackoverflow.com Is there an input class to deal with [multiple] large XML files based on their tree structure in Hadoop? I have a set of XML files that are of the same ... |
29. Hadoop - job statistics stackoverflow.com I used hadoop to run map-reduce applications on our cluster. The jobs take around 10 hours to complete daily. I want to know the time taken for each job, and the ... |
30. Free data warehouse - Infobright, Hadoop/Hive or what? stackoverflow.com I need to store a large amount of small data objects (millions of rows per month). Once they're saved they won't change. I need to: |
31. what is a data serialization system? stackoverflow.com According to the Apache Avro project, "Avro is a serialization system". By saying data serialization system, does it mean that Avro is a product or API? Also, I am not quite sure about ... |
32. Better to build or buy a compute grid platform? stackoverflow.com I am looking to do some quite processor-intensive brute force processing for string matching. I have run my prototype in a multi-threaded environment and compared the performance to an implementation ... |
33. Tracking Hadoop job status via web interface? (Exposing Hadoop to internal clients in the company) stackoverflow.com I want to develop a website that will allow analysts within the company to run Hadoop jobs (choose from a set of defined jobs) and see their job's status\progress. Is there an ... |
34. How to learn using Hadoop stackoverflow.com I want to learn hadoop. However, I don't have access to a cluster now. Is it possible for me to learn it and use it for writing programs and learn it ... |
35. Running Hadoop example in psuedo-distributed mode on vm stackoverflow.com I have set up Hadoop on an openSUSE 11.2 VM using VirtualBox. I have made the prerequisite configs. I ran this example in the Standalone mode successfully. But in pseudo-distributed mode I get ... |
36. Free Large datasets to experiment with Hadoop stackoverflow.com Do you know any large datasets to experiment with Hadoop which are free/low cost? Any related pointers/links are appreciated. Preference: |
37. Classnotfound exception while running hadoop stackoverflow.com I am new to hadoop. I have a file Wordcount.java which refers to hadoop.jar and stanford-parser.jar. I am running the following command |
38. Efficient way to store a graph for calculation in Hadoop stackoverflow.com I am currently trying to perform calculations like clustering coefficient on huge graphs with the help of Hadoop. Therefore I need an efficient way to store the graph in a way ... |
39. Which Hadoop API version should I use? stackoverflow.com In the latest Hadoop Studio the 0.18 API of Hadoop is called "Stable" and the 0.20 API of Hadoop is called "Unstable". The distribution that comes from Yahoo is a ... |
40. getting close to real-time with hadoop stackoverflow.com I need some good references for using Hadoop for real-time systems like searching with little response time. I know hadoop has its overhead of hdfs, but what's the best way of ... |
41. Repository organization for Hadoop project stackoverflow.com I am starting on a new Hadoop project that will have multiple hadoop jobs (and hence multiple jar files). Using mercurial for source control, I was wondering what would be the optimal way ... |
42. Trying to find org.apache.hadoop.io.LongWritable stackoverflow.com I'm trying to create a simple project with hadoop. I am new to IntelliJ and am trying to set the classpath to org.apache.hadoop.io. But what jar has this class? |
43. Hadoop development environment, what yours looks like? stackoverflow.com |
44. How to merge 2 bzip2'ed files? stackoverflow.com I want to merge 2 bzip2'ed files. I tried appending one to another: |
45. Making graphs of hadoop runs stackoverflow.com On some websites (like in this PDF: http://sortbenchmark.org/Yahoo2009.pdf) I see very nice graphs that visualize what a Hadoop cluster is doing at what moment. Were these made "manually" (i.e. ... |
46. What do you recommend for a Hadoop book? stackoverflow.com I've started getting into technology books to read. I want to learn Hadoop, and I find that I enjoy just reading books rather than staring at a computer screen ... |
47. hadoop null pointer exception stackoverflow.com |
48. urgent Attention Required-hadoop: BufferedImage and ConvolveFilter-->JHLabs: stackoverflow.com Sorry to disturb again, but I like learning here. I am using the JHLabs library on filters for buffered images. On running my code I am getting this exception: |
49. HadoopDb Java Program stackoverflow.com First of all, thanks for showing interest. I'm Adarsh Sharma, presently working on Hadoop technologies such as Hive, Hadoop, HadoopDB, HBase etc. I have configured HadoopDB on the Hadoop Cluster of 3 ... |
50. How to use custom pool assignment for FairScheduler in Hadoop? stackoverflow.com I am trying to take advantage of multiple pools in FairScheduler. But all my jobs are submitted by a single agent process and therefore all belong to same user. I have set ... |
51. Hadoop... Text.toString() conversion problems stackoverflow.com I'm writing a simple program for enumerating triangles in directed graphs for my project. First, for each input arc (e.g. a b, b c, c a, note: a tab symbol serves ... |
52. Hadoop begineers stackoverflow.com I'm trying to practice some data mining algorithms over hadoop. Can I do it with HDFS alone or do I need to use the sub-projects like hive/hbase/pig? Thanks, ram. |
53. Hadoop job fails when invoked by cron stackoverflow.com I have created the following shell script for invoking a hadoop job: |
54. How to avoid OutOfMemoryException when running Hadoop? stackoverflow.com I'm running a Hadoop job over 1.5 TB of data, doing a lot of pattern matching. I have several machines with 16GB RAM each, and I always get |
55. Hadoop block size issues stackoverflow.com I've been tasked with processing multiple terabytes worth of SCM data for my company. I set up a hadoop cluster and have a script to pull data from our SCM servers. ... |
56. Why does the Hadoop incompatible namespaceIDs issue happen? stackoverflow.com This is a fairly well-documented error and the fix is easy, but does anyone know why Hadoop datanode NamespaceIDs can get screwed up so easily or how Hadoop assigns the NamespaceIDs ... |
57. How to convert a Hadoop Path object into a Java File object stackoverflow.com Is there a way to change a valid and existing Hadoop Path object into a useful Java File object. Is there a nice way of doing this or do I need ... |
58. How does Hadoop's RunJar method distribute class/jar files across nodes? stackoverflow.com I'm trying to use JIT compilation in clojure to generate mapper and reducer classes on the fly. However, these classes aren't being recognized by the JobClient (it's the usual ClassNotFoundException.) If I ... |
59. What does this Java Syntax mean? stackoverflow.com In the code below, what does Iterator |
60. hadoop inputFile as a BufferedImage stackoverflow.com Sorry for my poor English. I hope you'll understand my problem. I have a question about Hadoop development. I have to train myself on a simple image processing project using Hadoop. All I want ... |
61. Hadoop ToolRunner fails with NoClassDefFoundError stackoverflow.com I am brand new to Linux, Java, and Hadoop. I have created a simple MapReduce Driver that implements the Tool interface. But when I try to run the ... |
62. How can I run Hadoop run with a Java class? stackoverflow.com I am following the book Hadoop: The Definitive Guide. I am confused on example 3-1. There is a Java source file, URLCat.java. I use |
63. Libraries/Tools for Website Parsing stackoverflow.com I would like to start working with parsing large numbers of raw HTML pages into semantic data structures. Just interested in the community opinion on various available tools for such a task, ... |
64. Hadoop and 3d Rendering of images stackoverflow.com I have to make a project: distributed rendering of a 3d image. I can use standard algorithms. The aim is to learn hadoop and not image processing. So can anyone ... |
65. Idle hadoop master - how to make it do some work? stackoverflow.com I have launched a small cluster of two nodes and noticed that the master stays completely idle while the slave does all the work. I was wondering what is the way ... |
66. Create a hadoop jar with external dependencies using Gradle stackoverflow.com How do I create a hadoop jar that includes all dependencies in the lib folder using Gradle? Basically, similar to what fatjar does. |
67. When is it an overkill to use Hadoop? stackoverflow.com I have an Oracle database (roughly 1.2 billion records) of data with a web application sitting on top of it that generates queries (generates SQL code and returns counts). Basically you ... |
68. How to run a Hadoop program? stackoverflow.com I have set up Hadoop on my laptop and ran the example program given in the installation guide successfully. But, I am not able to run a program. |
69. Problem while executing hadoop code stackoverflow.com I just started with Hadoop. I wrote a sample hadoop code as was written in the book. But still, during the time of execution exceptions arise. The snippet of what I ... |
70. Read a long string into memory stackoverflow.com I have a very large string, and when I read it in Java, I get an out of memory error. Actually, I need to read all this string into memory ... |
71. Distributed, error-handling, copying of TB's of data stackoverflow.com We have a box that has terabytes of data (10-20TB) each day, where each file on the drive is anywhere from megabytes to gigabytes. We want to send all these files to ... |
72. Hadoop query regarding setJarByClass method of Job class stackoverflow.com In the Hadoop API documentation it's given that setJarByClass public void setJarByClass(Class cls) Set the Jar by finding where a given class came from. What exactly does this explanation ... |
73. Ad Hoc Reports Hadoop stackoverflow.com Hey guys, I want to allow people to put in simple text search terms, run a pig job (if that's best? it's what I know best) and output the results (the tsv file ... |
74. Running Hadoop examples halt in Pseudo-Distributed mode stackoverflow.com Everything runs well in Standalone mode, and when going to the pseudo-distributed mode, the HDFS works well, I can put files to HDFS and browse it. And I also checked ... |
75. How to compile and set up Sizzle, an open source Sawzall implementation for Hadoop, on Mac OS X? stackoverflow.com 'Sizzle is an open source implementation of the Sawzall programming language designed for interoperation with the Hadoop MapReduce and DFS stack.' https://github.com/anthonyu/Sizzle |
76. Read and Write a file in hadoop in pseudo distributed mode stackoverflow.com I want to open/create a file and write some data in it in hadoop environment. The distributed file system I am using is hdfs. I want to do it in pseudo ... |
77. 1 million sentences to save in DB - removing non-relevant English words stackoverflow.com I am trying to train a Naive Bayes classifier with positive/negative words extracting from a sentiment. example: I love this movie :)) I hate when it rains :( ... |
78. Apache Hadoop : Can it do "time-varying" input? stackoverflow.com I haven't found an answer to this even after a bit of googling. My input files are generated by a process which chunks them out at say, when the file touches ... |
79. How to create the hadoop-0.21.0-core.jar using the source code? stackoverflow.com How to create the hadoop-0.21.0-core.jar using the source code? I have checked out the source code from svn. Now I have three dirs: common, hdfs, mapred. I want to build the hadoop-0.21.0-core.jar to run a ... |
80. EOFException thrown by a Hadoop pipes program stackoverflow.com First of all, I am a newbie to Hadoop. I have a small Hadoop pipes program that throws java.io.EOFException. The program takes as input a small text file and uses hadoop.pipes.java.recordreader ... |
81. How can I use multiple input files as a input file? stackoverflow.com I want to use multiple files (actually 2 files) as input files. They have the same patterns of data. Finally, I want to get the differing data from the two input files. For example, in a ... |
82. how does netezza work? how does it compare to Hadoop? stackoverflow.com I want to understand if Netezza/Hadoop is the right choice for the below purposes: pull feed files from several online sources of considerable size, at times more than a GB. clean, filter, transform and ... |
83. Why isn't Hadoop implemented using MPI? stackoverflow.com Correct me if I'm wrong, but my understanding is that Hadoop does not use MPI for communication between different nodes. What are the technical reasons for this? I could hazard a few guesses, ... |
84. Hadoop job taking input files from multiple directories stackoverflow.com |
85. Knowledge mining using Hadoop stackoverflow.com I want to do a project on Hadoop and MapReduce and present it as my graduation project. To this, I've given some thought, searched over the internet and came up with the ... |
86. Column store on top of hadoop? stackoverflow.com Is there a column store similar to Vertica that is built on top of Hadoop? I am not talking about HBase, as it is a sparse matrix store and can not get ... |
87. Hadoop certification stackoverflow.com Has anyone here attended the Cloudera training and certification? How was the certification exam? Anything that would make the exam easy? |
88. Does hadoop eclipse-plugin support argument stackoverflow.com I downloaded the hadoop Eclipse plug-in from this website: https://issues.apache.org/jira/browse/MAPREDUCE-1262 Thus, I can run hadoop program inside Eclipse, but I don't know how to use argument in this plugin. For example jar ... |
89. Can Hadoop run on Nginx? stackoverflow.com Is it possible to run Hadoop on Nginx? If so, is there any reference? |
90. root of java installation stackoverflow.com I am trying to set up apache hadoop in my system. In the procedure page it says "edit the file conf/hadoop-env.sh to define at least JAVA_HOME to be the root of your ... |
91. How do I view standard out in hadoop? stackoverflow.com I'm new to hadoop and trying to get my first non-trivial program working, and want to view standard out for debugging purposes. It's my understanding that standard out is directed into ... |
92. Why does creating a Path in hadoop cause a NullPointerException? stackoverflow.com I'm new to hadoop and trying to create a file in HDFS from within the mapper of a map-reduce job. The following code produces a NullPointerException in the last line: |
93. Why does checking whether a file exists in hadoop cause a NullPointerException? stackoverflow.com I'm trying to create or open a file to store some output in HDFS, but I'm getting a NullPointerException when I call the |
94. In hadoop, how do I initialize the a DistributedFileSystem object via the initialize method? stackoverflow.com There are two arguments, a URI and a Configuration. I assume that the JobConf object that the client is set to should work for Configuration, but what about the URI? Here is ... |
95. How do I append to a file in hadoop? stackoverflow.com I want to create a file in HDFS that has a bunch of lines, each generated by a different call to map. I don't care about the order of the lines, ... |
96. Are these Hadoop setup/cleanup/run times reasonable? stackoverflow.com I've set up and am testing out a pseudo-distributed Hadoop cluster (with namenode, job tracker, and task tracker/data node all on the same machine). The box I'm running on has about ... |
97. Hadoop Input files Order stackoverflow.com I have data files arranged in folders named as dates. Directory structure |
98. Hadoop 0.21.0 java.lang.NoSuchMethodError: ProgramDriver stackoverflow.com I have a simple Hadoop Job that I successfully compiled and ran on Hadoop 0.20.2. Now I am compiling against Hadoop 0.21.0 which works fine but trying to run it yields ... |
99. asking about apache zookeeper stackoverflow.com Hello, I am Mohamad, a master's degree student. I want to ask a question about ZooKeeper. I read that the write operation in zookeeper to be done first the server connected ... |
100. How to contribute to apache? stackoverflow.com I am an intermediate Java learner. I want to contribute to Apache development. I saw there is a list of Apache projects (like Hadoop, Derby etc.). I have developed certain queries which I would like ... |