1. Hadoop: map/reduce from HDFS (stackoverflow.com) I may be wrong, but all(?) examples I've seen with Apache Hadoop takes as input a file stored on the local file system (e.g. org.apache.hadoop.examples.Grep) Is there a way to load and ... |
2. CloudStore vs. HDFS (stackoverflow.com) Does anyone have any familiarity with working with both CloudStore and HDFS. I am interested to see how far CloudStore has been scaled and how heavily it has been ... |
3. Writing data to Hadoop (stackoverflow.com) I need to write data in to Hadoop (HDFS) from external sources like a windows box. Right now I have been copying the data onto the namenode and using HDFS's put ... |
4. Where HDFS stores files locally by default? (stackoverflow.com) I am running hadoop with default configuration with one-node cluster, and would like to find where HDFS stores files locally. Any ideas? Thanks. |
5. Is it possible to use Avro with Hadoop 0.20? (stackoverflow.com) I'm interested in using Avro to save and read files from Hadoop HDFS and I saw some Jira's in Hadoop issue tracker regarding implementing support for Avro but there were no ... |
6. Is it possible to run Hadoop in Pseudo-Distributed operation without HDFS? (stackoverflow.com) I'm exploring the options for running a hadoop application on a local system. As with many applications the first few releases should be able to run on a single node, as long ... |
7. Which is the easiest way to combine small HDFS blocks? (stackoverflow.com) I'm collecting logs with Flume to the HDFS. For the test case I have small files (~300kB) because the log collecting process was scaled for the real usage. Is there any easy ... |
8. Difference between 'distcp' and 'distcp -update'? (stackoverflow.com) What is the difference between
and
Both of them would do the same work with only slight difference in how we call them. None of them overwrites an already ... |
9. hadoop copy directory (stackoverflow.com) Is there an hdfs API that can copy an entire local directory to the HDFS ? I found an API for copying files but is there one for directories ? |
10. File blocks on HDFS (stackoverflow.com) Does Hadoop guarantee that different blocks from same file will be stored on different machines in the cluster? Obviously replicated blocks will be on different machines. |
11. Managing hdfs in psuedo-distributed hadoop mode (stackoverflow.com) I want to do some computation with hadoop and mahout on my quad core machine, so I am using hadoop in pseudo-distributed mode. The problem is that the space ... |
12. Hadoop, hardware and bioinformatics (stackoverflow.com) We're about to buy new hardware to run our analyses and are wondering if we're making the right decisions.
The setting: |
13. How to read a file from HDFS in a non-Java client (stackoverflow.com) So my MR Job generates a report file, and that file needs to be able to be downloaded by an end-user who needs to click a button on a normal web ... |
14. How can I be sure that data is distributed evenly across the hadoop nodes? (stackoverflow.com) If I copy data from local system to HDFS, can I be sure that it is distributed evenly across the nodes? PS HDFS guarantee that each block will be stored at 3 ... |
15. How to store the actual name of a /*url*? (stackoverflow.com) I'm converting a script to HDFS (Hadoop) and I have this cmd:
With HDFS I need to get the file using ... |
16. hadoop NullPointerException (stackoverflow.com) I was trying to setup a multi node cluster of hadoop michael-noll's way using two computers.
When I tried to format the hdfs it showed a
|
17. Hadoop HDFS maximum file size (stackoverflow.com) A colleague of mine thinks that HDFS has no maximum file size, i.e., by partitioning into 128 / 256 meg chunks any file size can be stored (obviously the HDFS disk ... |
18. Moving files in Hadoop using the Java API? (stackoverflow.com) I want to move files around in HDFS using the Java APIs. I cannot figure out a way to do this. The FileSystem class only seems to want to ... |
19. How to keep a flat file on HDFS in sync with a large database table? (stackoverflow.com) What's the best way of keeping a flat file on HDFS in sync with a large database table which may have row updates? Tools such as sqoop seem like they'd be useful ... |
20. HDFS: Using HDFS API to append to a SequenceFile (stackoverflow.com) I've been trying to create and maintain a Sequence File on HDFS using the Java API without running a MapReduce job as a setup for a future MapReduce job. I ... |
21. Programmatically reading the output of Hadoop Mapreduce Program (stackoverflow.com) This may be a basic question, but I could not find an answer for it on Google. |
22. Hadoop/Pig regular expression matching (stackoverflow.com) This is kind of an odd situation, but I'm looking for a way to filter using something like MATCHES but on a list of unknown patterns (of unknown length). That is, if ... |
23. MapReduce shuffle/sort method (stackoverflow.com) Somewhat of an odd question, but does anyone know what kind of sort MapReduce uses in the sort portion of shuffle/sort? I would think merge or insertion (in keeping with ... |
24. Exception while executing hadoop job remotely (stackoverflow.com) I am trying to execute a Hadoop job on a remote hadoop cluster. Below is my code.
|
25. How to adapt bin/hdfs for executing from outside $HADOOP_HOME/bin? (stackoverflow.com) I'm trying to modify the hdfs script so that it still functions although not located in $HADOOP_HOME/bin anymore, but when I execute the modified hdfs I get:
|
26. HadoopFS (HDFS) as distributive file storage (stackoverflow.com) I'm consider to use HDFS as horizontal scaling file storage system for our client video hosting service. My main concern that HDFS wasn't developed for this needs this is more "an ... |
27. Hadoop fully distributed mode (stackoverflow.com) I am a newbie to Hadoop. I have managed to develop a simple Map/Reduce application that works fine in 'pseudo distributed mode'. I want to test that in 'fully distributed mode'. I ... |
28. Hadoop JUnit testing writing/reading to/from the hdfs (stackoverflow.com) I have written a class(es) that writes and reads from hdfs. Given certain conditions that are occurring when these classes are instantiated they create a specific path and file, and ... |
29. Export data from database and write to HDFS(hadoop fs) (stackoverflow.com) Now I am trying to export data from a db table, and write it into hdfs. And the problem is: will the name node become bottleneck? and how is the mechanism, will ... |
30. Looking for overall review on Hadoop (stackoverflow.com) I am looking for some performance review on Hadoop (300-600 boxes cluster, commodity hardware), especially on the following aspects:
|
31. What is the maximum number of files allowed in a HDFS directory? (stackoverflow.com) What is the maximum number of files and directories allowed in a HDFS (hadoop) directory? |
32. Is it possible to append to HDFS file from multiple clients in parallel? (stackoverflow.com) Basically whole question is in the title. I'm wondering if it's possible to append to file located on HDFS from multiple computers simultaneously? Something like storing stream of events constantly produced ... |
33. Uploading large gzipped data files to HDFS (stackoverflow.com) I have a use case where I want to upload big gzipped text data files (~ 60 GB) on HDFS. My code below is taking about 2 hours to upload these files ... |
34. Why can't hadoop split up a large text file and then compress the splits using gzip? (stackoverflow.com) I've recently been looking into hadoop and HDFS. When you load a file into HDFS, it will normally split the file into 64MB chunks and distribute these chunks around your cluster. ... |
35. Indexing a HDFS sequence file (stackoverflow.com) What is the best library/way of indexing a very large sequence file (millions of key/value pairs where each value can be of a different length so you cannot have a random ... |
36. Trying to use Fuse to mount HDFS. Can't compile libhdfs (stackoverflow.com) I'm attempting to compile libhdfs (a native shared library that allows external apps to interface with hdfs). It's one of the few steps I have to take to mount Hadoop's hdfs ... |
37. Programmatic equivalent of 'hadoop fs -tail -f' (stackoverflow.com) I want to tail an hdfs file programmatically using the |
38. Parallel Copy to HDFS (stackoverflow.com) What is the best and fast way to achieve parallel copy to hadoop from an NFS mount? We have a mount with huge number of files and we need to copy it ... |
39. Using HierarchicalINIConfiguration class on HDFS (stackoverflow.com) I need to parse the ini file (this is the configuration file with sections) located on HDFS. HierarchicalINIConfiguration(File file) ... |
40. Hadoop: compress file in HDFS? (stackoverflow.com) I recently set up LZO compression in Hadoop. What is the easiest way to compress a file in HDFS? I want to compress a file and then delete the ... |
41. HDFS path changing when trying to update files in HDFS (stackoverflow.com) I am new to Hadoop and HDFS, so maybe it is something I am doing wrong when I copy from local (Ubuntu 10.04) to HDFS on a single node on localhost. ... |
42. setCompressOutput in Hadoop (stackoverflow.com) When should use and not to use
|
43. Running Hadoop MapReduce, is it possible to call external executables outside of HDFS (stackoverflow.com) Within my mapper I'd like to call external software installed on the worker node outside of the HDFS. Is this possible? What is the best way to do this? I ... |
44. Does HDFS encrypt or compress the data while storing? (stackoverflow.com) When I put a file into HDFS, for example
|
45. How to check whether a file exists or not using hdfs shell commands (stackoverflow.com) I am new to hadoop and a small help is required. Suppose if I ran the job in background using shell scripting, how do I know whether the job is completed or not. ... |
46. LeaseExpiredException: No lease error on HDFS (stackoverflow.com) I am trying to load large data to HDFS and I sometimes get the error below. Any idea why? The error:
|
47. HDFS replication factor (stackoverflow.com) When I'm uploading a file to HDFS, if I set the replication factor to 1 then the file splits gonna reside on one single machine or the splits would be distributed ... |
48. Getting data in and out of Elastic MapReduce HDFS (stackoverflow.com) I've written a Hadoop program which requires a certain layout within HDFS, and which afterwards, I need to get the files out of HDFS. It works on my single-node ... |
49. hadoop api configuration on the client machine (stackoverflow.com) ultra-noob. I have a server machine with cdh3u1 pseudo-distrib, and a client machine with a java application using the cdh3u1 API. How do I configure the client to talk to the ... |
50. Difference between hadoop fs -put and hadoop fs -copyFromLocal (stackoverflow.com)
|
51. How can I access hadoop via the hdfs protocol from java? (stackoverflow.com) I found a way to connect to hadoop via hftp, and it works fine, (read only) :
|
52. how do we compare a localfile and hdfs file for consistency (stackoverflow.com)
|
53. Writing to HDFS : File is overwritten (stackoverflow.com) I am writing to hadoop file system. But everytime I append something, it overwrites the data instead of adding it to the existing data/file. The code which is doing this is ... |
54. Under-replicated blocks count is inaccurate, buy why? (stackoverflow.com) I am getting wildly varying reports of under-replicated blocks. I am wondering what's causing this. hadoop dfsadmin -metasave reports ~232,000 MISSING blocks awaiting replication. How do I fix this? Jobs run ... |
55. Hadoop: Compressing output of Map-only job (stackoverflow.com) I have a map-only job that outputs in TextOutputFormat. I currently see three ways of compressing my output: 1) by defining map to compress through mapred.compress.map.output.* 2) by defining output to compress through ... |
56. Using FileInputFormat.addInputPaths to recursively add HDFS path (stackoverflow.com) I've got a HDFS structure something like
I'm using the classic pattern of
to set my input path for a java map reduce job.
This works fine if I specify args[0] as ... |
57. how to read a file from HDFS through browser (stackoverflow.com) How to provide a link to a HDFS file, so that clicking on that url it will download the HDFS file.. Please provide me the inputs.. Thanks MRK |
58. Need to get rid of part-m-0000* files in HDFS (stackoverflow.com) In HDFS processing after each job empty files are created with names like part-m-0000*. Each of these files are empty but they are consuming 64MB of disk space because that is ... |
59. Hadoop: Performance degradation when increasing block sizes? (stackoverflow.com) Has anyone seen any performance degradation when increasing the block size in Hadoop? We're setting up a cluster and we're expecting a large amount of data (100s of GBs) coming in ... |
60. Compression in Hadoop Sequence File (stackoverflow.com) I have some basic questions about the hadoop sequential file. 1) To what extent the default compression codec compresses the file? 2) I have hadoop sequence file of 100 MB when I read ... |
61. Hadoop libhdfs test running issue - Operation not permitted (stackoverflow.com) I'm using Hadoop 0.20.3. When running the hdfs_test of libhdfs library, I'm getting the following errors: 1.
|