mahout « hadoop « Java Database Q&A

1. Using the Apache Mahout machine learning libraries    stackoverflow.com

I've been working with the Apache Mahout machine learning libraries a bit in my free time over the past few weeks. I'm curious to hear about how others are using these ...

2. How to use Mahout in a Windows environment?    stackoverflow.com

I am trying to use Mahout in an application running on Windows. I want to build clusters from a Lucene index using k-means. As soon as I have to create sequence files ...

3. Help with running Taste Grouplens demo on hadoop    stackoverflow.com

I am trying to build a collaborative-filtering-based recommendation system as part of an academic project. I think the Mahout project has a lot of potential and I want to use ...

4. How to execute Mahout with a Hadoop installation    stackoverflow.com

I'm trying to figure out how to run the Mahout example jars with Hadoop. I have configured Mahout and Hadoop; now I go into the Hadoop directory and type something like this: /Users/hadoop/hadoop-0.20.2/bin/hadoop jar ...

5. Classify data using Apache Mahout    stackoverflow.com

I am trying to solve a simple classification problem.
The Problem: I have a set of texts and I have to categorize them based on their content.
Solution using Mahout: ...

6. Is it worth purchasing Mahout in Action to get up to speed with Mahout, or are there other better sources?    stackoverflow.com

I'm currently a very casual user of Apache Mahout, and I'm considering purchasing the book Mahout in Action. Unfortunately, I'm having a really hard time getting an ...

7. How to implement SlopeOne with Hadoop? Can anyone from the Mahout community help me? Sean?    stackoverflow.com

Hi there: I've successfully used a Hadoop program to calculate the diff-matrix and stored the data in my HDFS... But now I'm confused: how can I read the users' profiles as well ...

8. Deploying Mahout on a Hadoop cluster    stackoverflow.com

I want to run Mahout's K-Means example on a Hadoop cluster of 5 machines. Which Mahout jar files do I need to keep on all the nodes, in order for the ...

9. Interpreting output from mahout clusterdumper    stackoverflow.com

I ran a clustering test on crawled pages (more than 25K docs; personal data set). I then ran clusterdump:

$MAHOUT_HOME/bin/mahout clusterdump --seqFileDir output/clusters-1/ --output clusteranalyze.txt
The output after running cluster dumper is ...

10. What does this error tell us when I'm trying to run an example in Apache Mahout?    stackoverflow.com

I am learning to use Apache Mahout and get the following message after running one of its examples:

Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/home/user1/workspace/LDAAnalysis/output/data
In fact, the directory ...

11. In practice, how many machines do you need in order for Hadoop / MapReduce / Mahout to speed up very parallelizable computations?    stackoverflow.com

I need to do some heavy machine learning computations. I have a small number of machines idle on a LAN. How many machines would I need in order for distributing my ...

12. Is this a bug or a setup issue when using NewsKMeansClustering.java?    stackoverflow.com

Is this a bug or a setup issue in NewsKMeansClustering.java, the example code given in chapter 9 of Mahout in Action? I was running this program against a directory of sequence files. The output ...

13. How to start development for Mahout    stackoverflow.com

After installing Mahout (following http://girlincomputerscience.blogspot.com/2010/11/apache-mahout.html), how do I run a Mahout algorithm, and where can I find the most popular, easy tutorial for Mahout beginners? Thanks in advance.

14. Generating a SequenceFile    stackoverflow.com

Given data in the following format (tag_uri image_uri image_uri image_uri ...), I need to turn them into Hadoop SequenceFile format for further processing by Mahout (e.g. clustering)

http://flickr.com/photos/tags/100commentgroup http://flickr.com/photos/34254318@N06/4019040356 http://flickr.com/photos/46857830@N03/5651576112
http://flickr.com/photos/tags/100faves http://flickr.com/photos/21207178@N07/5441742937
...
Before this ...
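
As a rough illustration of the input format described in this question, the following Java sketch splits each whitespace-separated line into a key (the tag URI) and a list of values (the image URIs), which is the key/value shape one would typically have in hand before writing records to a SequenceFile. The class name `TagLineParser` and the method `parseLine` are hypothetical, and the sketch deliberately stops short of any Hadoop API calls; it uses only the Java standard library.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical helper: not part of Hadoop or Mahout.
public class TagLineParser {

    // Splits one "tag_uri image_uri image_uri ..." line into (tag, images).
    public static Map.Entry<String, List<String>> parseLine(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            throw new IllegalArgumentException("empty line");
        }
        String[] parts = trimmed.split("\\s+");
        // First token is the key; the remaining tokens (possibly none) are the values.
        List<String> images = new ArrayList<>(Arrays.asList(parts).subList(1, parts.length));
        return new SimpleEntry<>(parts[0], images);
    }

    public static void main(String[] args) {
        String[] lines = {
            "http://flickr.com/photos/tags/100commentgroup http://flickr.com/photos/34254318@N06/4019040356 http://flickr.com/photos/46857830@N03/5651576112",
            "http://flickr.com/photos/tags/100faves http://flickr.com/photos/21207178@N07/5441742937"
        };
        for (String line : lines) {
            Map.Entry<String, List<String>> entry = parseLine(line);
            System.out.println(entry.getKey() + " -> " + entry.getValue().size() + " image(s)");
        }
    }
}
```

Each resulting pair could then be handed to whatever writer is used for the SequenceFile; how the value list is serialized there is left open, since the excerpt does not say.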

15. Is Hadoop necessary to run the Mahout in Action examples?    stackoverflow.com

Is Hadoop necessary to run the Mahout in Action examples? I saw that there is a Hadoop jar provided with Mahout. I have been having problems with build-reuters.sh and was wondering ...

16. Mahout : To read a custom input file    stackoverflow.com

I was playing with Mahout and found that the FileDataModel accepts data in the format

     userId,itemId,pref (long,long,Double).
I have some data which is of the format
    ...
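
To make the `userId,itemId,pref` format mentioned in this question concrete, here is a minimal Java sketch that parses one such line into a (long, long, double) triple and writes it back out as a comma-separated line. The class `PreferenceLine` and its methods are hypothetical names used only for illustration; the sketch does not call any Mahout API and says nothing about the asker's own (elided) data format.

```java
// Hypothetical helper: illustrates the "userId,itemId,pref" line shape only.
public class PreferenceLine {
    public final long userId;
    public final long itemId;
    public final double pref;

    public PreferenceLine(long userId, long itemId, double pref) {
        this.userId = userId;
        this.itemId = itemId;
        this.pref = pref;
    }

    // Parses "userId,itemId,pref"; throws IllegalArgumentException for any other shape
    // (NumberFormatException is a subclass of IllegalArgumentException).
    public static PreferenceLine parse(String line) {
        String[] fields = line.trim().split(",");
        if (fields.length != 3) {
            throw new IllegalArgumentException("expected 3 comma-separated fields: " + line);
        }
        return new PreferenceLine(
                Long.parseLong(fields[0].trim()),
                Long.parseLong(fields[1].trim()),
                Double.parseDouble(fields[2].trim()));
    }

    // Emits the line in the comma-separated form described in the question.
    public String toCsv() {
        return userId + "," + itemId + "," + pref;
    }

    public static void main(String[] args) {
        PreferenceLine p = parse("1, 101, 4.5");
        System.out.println(p.toCsv()); // prints 1,101,4.5
    }
}
```

A converter for a custom input file would map each of its rows onto such a triple first and then emit `toCsv()`, one preference per line.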

17. Text Mining on huge list of strings    stackoverflow.com

I have a list of strings (a pretty big list of ids and strings scattered across 4-5 big files, around a GB each). These strings are formatted like this: 1,Hi 2,Hi How r u? 2,How r ...

18. Mahout LDA gives FileNotFound exception    stackoverflow.com

I created my term vectors as stated here like this:

~/Scripts/Mahout/trunk/bin/mahout seqdirectory --input /home/ben/Scripts/eipi/files --output /home/ben/Scripts/eipi/mahout_out -chunk 1
~/Scripts/Mahout/trunk/bin/mahout seq2sparse -i /home/ben/Scripts/eipi/mahout_out -o /home/ben/Scripts/eipi/termvecs -wt tf -seq
Then I run
~/Scripts/Mahout/trunk/bin/mahout lda ...

19. Is it possible to use Apache Mahout without the Hadoop dependency?    stackoverflow.com

Is it possible to use Apache Mahout without any dependency on Hadoop? I would like to use the Mahout algorithms on a single computer by only including the Mahout library inside my ...

20. Exception thrown while running K-means clustering using Mahout    stackoverflow.com

I was just trying to run K-means clustering using Mahout by following this link. However, I downloaded quickstart-kmeans.sh as directed and I learnt I had to run build-reuters.sh in ...

21. Recommendation Engine development    stackoverflow.com

I want to develop a recommendation engine with Hadoop, Mahout, and Hive: Hadoop for parallel operation, Mahout for the algorithms (logic), and Hive for the database. So how do I start, and which Mahout algorithm is suitable ...

22. Mahout - Naive Bayes    stackoverflow.com

I tried deploying the 20 newsgroups example with Mahout, and it seems to be working fine. Out of curiosity I would like to dig deep into the model statistics, for example: the bayes-model directory contains ...

23. Continuous collaborative filtering using Mahout    stackoverflow.com

I am in the process of evaluating Mahout as a collaborative-filtering-recommendation engine. So far it looks great. We have almost 20M boolean recommendations from 12M different users. According to Mahout's ...