1. Hadoop Pig Latin style guide? (stackoverflow.com): I'm looking to take a shortcut on formatting/style for Pig Latin (hadoop-ay). Does anyone know where I can find a style guide? -daniel |
2. Examples of simple stats calculation with Hadoop (stackoverflow.com): I want to extend an existing clustering algorithm to cope with very large data sets and have redesigned it in such a way that it is now computable with partitions of ... |
3. Regexp matching in Pig (stackoverflow.com): Using Apache Pig and the text
I'm trying to match "my brother just didnt do anything wrong."
Ideally, ... |
4. Generating bigram combinations from grouped data in Pig (stackoverflow.com): Given my input data in userid,itemid format:
I'd like to generate all of the combinations (order not important) of items within each group. ... |
5. Is there a canonical problem that provably can't be aided with map/reduce? (stackoverflow.com): I'm trying to understand the boundaries of Hadoop and map/reduce and it would help to know a non-trivial problem, or class of problems, that we know map/reduce can't assist in. It certainly ... |
6. Merging multiple files into one within Hadoop (stackoverflow.com): I get multiple small files into my input directory which I want to merge into a single file without using the local file system or writing mapreds. Is there a way ... |
7. What are the environment settings in Apache Pig and Hadoop Connection to run tutorial scripts? (stackoverflow.com): I have been trying to run the Pig tutorial scripts in Ubuntu for two days, but I cannot manage to make Pig connect to the Hadoop file system. It is still saying: ... |
8. Pig/Hadoop needed for what I want to do? (stackoverflow.com): I have a question for you, well, a clarification... I developed a program that uses Hadoop MapReduce which gets just a column from a dataset (CSV file) and processes this data ... |
9. Pig Version Mismatch (Hadoop) (stackoverflow.com): Has anyone met this problem before? This is the error log: Protocol org.apache.hadoop.mapred.JobSubmissionProtocol version mismatch. (client = 20, server = 21) I used Pig 0.8.0 and my Hadoop version is 0.20.10. I'd appreciate it if ... |
10. How to "update" a column using Pig Latin (stackoverflow.com): Imagine I have the following table available to me:
I now want to transform this, such that z is set to ... |
11. If I have a constructor that requires a path to a file, how can I "fake" that if it is packaged into a jar? (stackoverflow.com): The context of this question is that I am trying to use the MaxMind Java API in a Pig script that I have written... I do not think that knowing about ... |
12. Can PIG and HIVE be called separate programming models? (stackoverflow.com): This question might sound irritating, and may not actually have anything to do with real programming. It's a spin-off of a small debate I had with a colleague of mine. He ... |
13. How do I read static files in a PIG UDF (stackoverflow.com): I am new to PIG and Hadoop. I have written a PIG UDF which operates on String and returns a string. I actually use a class from an already existing jar ... |
14. Hadoop Hypercube (stackoverflow.com): Hey, I am starting a Hadoop-based hypercube with a flexible number of dimensions. Does anybody know any existing approaches for this? I just found PigOLAPSketch, but there is no code to ... |
15. Generate multiple outputs with Hadoop Pig (stackoverflow.com): I've got this file containing a list of data in Hadoop. I've built a simple Pig script which analyzes the file by the id number, and so on... The last step I'm ... |
16. Hadoop PIG output is not split into multiple files with PARALLEL operator (stackoverflow.com): Looks like I'm missing something. The number of reducers set for my data creates that many files in HDFS, but my data is not split across multiple files. What ... |
17. Doing analytical queries on large dynamic sets of data (stackoverflow.com): I have a requirement where I have large sets of incoming data into a system I own. A single unit of data in this set has a set of immutable attributes ... |
18. How to generate a custom schema from a relation in Pig? (stackoverflow.com): I have a schema describing tf-idf values for words in various articles. Its description looks like:
Here is an example of such data:
I want to get output in a ... |
19. How do you deal with empty or missing input files in Apache Pig? (stackoverflow.com): Our workflow uses an AWS Elastic MapReduce cluster to run a series of Pig jobs to manipulate a large amount of data into aggregated reports. Unfortunately, the input data is potentially ... |
20. Running Pig query over data stored in Hive (stackoverflow.com)
|
21. Skip a record in LoadFunc.getNext() (stackoverflow.com): I'm extending the LoadFunc. In the getNext function I'd like to skip returning a tuple under certain conditions - this way I could only load a sample of the data file. ... |
22. Loading an external properties file in UDF (stackoverflow.com): When writing a UDF, let's say an EvalFunc, is it possible to pass a configuration file with
when running in Hadoop Mode?
Best,
Will
|
23. Pig Loader for SQL queries? (stackoverflow.com): I'm looking for a Pig (related to Hadoop) loader to retrieve data from a SQL Server. If you've come across one, please let me know. Thanks. = Yakov |
24. Run Pig on Hadoop, could not find the result (stackoverflow.com): I ran a Pig script on a Hadoop cluster. It passed successfully, but I cannot find the result files. Here is what it said:
I ... |
25. Use Hadoop Pig to load data from text file w/ each record on multiple lines? (stackoverflow.com): I have my data file in the following format:
|
26. Can't run Pig with single-node Hadoop server (stackoverflow.com): I have set up a VM with Ubuntu. It runs Hadoop as a single node. Later I installed Apache Pig on it. Apache Pig runs great in local mode, but it always ... |
27. What are some approaches to run multiple Pig scripts sequentially? (stackoverflow.com): I need to run some Pig scripts sequentially in Hadoop. They must be run separately. Any suggestions? Update: just a quick update that we're working toward running the Pig scripts from ... |
28. "Failed to create DataStorage" error when using Pig with Hadoop (stackoverflow.com): I've been trying to get Pig 0.9.0 to run using Apache Hadoop 0.20.203.0. I've looked high and low over Google and mailing lists and even this question: Can't run Pig ... |
29. Apache Pig permissions issue (stackoverflow.com): I'm attempting to get Apache Pig up and running on my Hadoop cluster, and am encountering a permissions problem. Pig itself is launching and connecting to the cluster just fine- ... |
30. How to Get Pig to Work with lzo Files? (stackoverflow.com): So, I've seen a couple of tutorials for this online, but each seems to say to do something different. Also, each of them doesn't seem to specify whether you're trying to ... |
31. PIG: Filter a string on the basis of a word (stackoverflow.com): I have a Pig job wherein I need to filter the data by finding a word in it. Here is the snippet
|
32. Executing Pig on another framework (stackoverflow.com): I understand that Pig Latin is a data flow language. In that sense it should be theoretically possible to execute Pig Latin in any framework though currently and it is meant ... |
33. Pig: Pulling individual fields out after a GROUP (stackoverflow.com): In PigLatin, I want to pull the other fields out of a record I want to select because of an aggregate, such as |
34. Getting started with Pig (stackoverflow.com): This might be a really stupid question but I'm not able to install Pig properly on my machine. Pig's version is 0.9.0. I have even set my JAVA_HOME to its designated path. I've ... |
35. Java or Pig regex to strip out values from UserAgent string (stackoverflow.com): I need to strip out the third and subsequent values in the 'bracketed' component of the user agent string. In order to get Mozilla/4.0 (compatible; MSIE 8.0) from Mozilla/4.0 (compatible; MSIE ... |