NO.1 You are developing a MapReduce job for sales reporting. The mapper will process input keys
representing the year (IntWritable) and input values representing product identifiers (Text).
Identify what determines the data types used by the Mapper for a given job.
A. The InputFormat used by the job determines the mapper's input key and value types.
B. The key and value types specified in the JobConf.setMapInputKeyClass and
JobConf.setMapInputValuesClass methods
C. The mapper-specification.xml file submitted with the job determines the mapper's input key and
value types.
D. The data types specified in HADOOP_MAP_DATATYPES environment variable
Answer: A

The input types fed to the mapper are controlled by the InputFormat used.
The default input format, "TextInputFormat," will load data in as (LongWritable, Text) pairs.
The long value is the byte offset of the line in the file. The Text object holds the string
contents of the line of the file.
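
The (byte offset, line) pairing described above can be illustrated with a small plain-Java
sketch. It does not use the Hadoop API: the class name LineOffsetSketch and the method
toRecords are hypothetical, a String stands in for the file, and only "\n" line endings are
assumed. It shows only what the key and value of each record would look like.

```java
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not Hadoop's TextInputFormat.
final class LineOffsetSketch {

    // Returns one (byte offset, line text) pair per line, the shape of record
    // that a line-oriented input format hands to the mapper.
    static List<Map.Entry<Long, String>> toRecords(String fileContents) {
        List<Map.Entry<Long, String>> records = new ArrayList<>();
        if (fileContents.isEmpty()) {
            return records;
        }
        long offset = 0;
        // Assumption: lines are separated by a single '\n' byte.
        for (String line : fileContents.split("\n")) {
            records.add(new AbstractMap.SimpleEntry<>(offset, line));
            // Advance past the line's bytes plus the newline that ended it.
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return records;
    }

    public static void main(String[] args) {
        String file = "2012\twidget\n2013\tgadget\n";
        for (Map.Entry<Long, String> record : toRecords(file)) {
            System.out.println(record.getKey() + " -> " + record.getValue());
        }
    }
}
```

In a real job the key would arrive as a LongWritable and the value as a Text, so a mapper that
wants the year as an IntWritable key would need an InputFormat that produces that type.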
Note: The data types emitted by the reducer are identified by setOutputKeyClass() and
setOutputValueClass(). By default, it is assumed that these are the output types of the mapper
as well. If this is not the case, the setMapOutputKeyClass() and setMapOutputValueClass()
methods of the JobConf class will override these.
Reference: Yahoo! Hadoop Tutorial, THE DRIVER METHOD

NO.2 Assuming the following Hive query executes successfully:
Which one of the following statements describes the result set?
A. A bigram of the top 80 sentences that contain the substring "you are" in the lines column of the
inputdata table.
B. A trigram of the top 80 sentences that contain "you are" followed by a null space in the lines
column of the inputdata table.
C. An 80-value ngram of sentences that contain the words "you" or "are" in the lines column of the
inputdata table.
D. A frequency distribution of the top 80 words that follow the subsequence "you are" in the lines
column of the inputdata table.
Answer: D
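
The kind of result described in option D can be sketched in plain Java. This is not the Hive
query and not Hive's implementation: the class name FollowingWordSketch and the method
topFollowers are hypothetical, each input line is treated as one sentence, and text is
lowercased before matching, which is an assumption. The sketch counts which words follow a
given two-word context and returns the k most frequent.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not Hive code.
final class FollowingWordSketch {

    // Counts the words that directly follow "first second" and returns the
    // k most frequent, ordered by count (descending), then alphabetically.
    static List<Map.Entry<String, Integer>> topFollowers(
            List<String> lines, String first, String second, int k) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            // Assumption: a word is a run of letters or apostrophes.
            String[] words = line.toLowerCase().split("[^a-z']+");
            for (int i = 0; i + 2 < words.length; i++) {
                if (words[i].equals(first) && words[i + 1].equals(second)) {
                    counts.merge(words[i + 2], 1, Integer::sum);
                }
            }
        }
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>(counts.entrySet());
        sorted.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return sorted.subList(0, Math.min(k, sorted.size()));
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        lines.add("You are welcome.");
        lines.add("I think you are right");
        lines.add("you are welcome here");
        // The question asks for the top 80; this tiny sample has fewer distinct words.
        for (Map.Entry<String, Integer> e : topFollowers(lines, "you", "are", 80)) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```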


