MapReduce Design Patterns - External Source Output - YoLaiYoQu - ChinaUnix Blog

Pattern Name	External Source Output
Category	Input and Output Patterns
Description	The external source output pattern writes data to a system outside of Hadoop and HDFS.
Intent	You want to write MapReduce output to a nonnative location.
Motivation	This pattern skips storing data in a file system entirely and sends output key/value pairs directly to where they belong. MapReduce rarely hosts an application as-is, so using it to bulk load data into an external source in parallel has its uses. Because every task writes its own output, the data is written out in parallel. As with using an external source for input, you need to be sure the destination system can handle the parallel ingest and all the open connections that come with it.
Applicability
Structure	(1) The OutputFormat verifies the output specification of the job configuration prior to job submission. It is also responsible for creating and initializing a RecordWriter implementation. (2) The RecordWriter writes all key/value pairs to the external source. During construction of the object, establish any needed connections using the external source’s API. These connections are then used to write out all the data from each map or reduce task.
Consequences	The output data has been sent to the external source and that external source has loaded it successfully.
Known uses
Resemblances
Performance analysis	From a MapReduce perspective, there isn’t much to worry about, since the map and reduce phases are generic. However, you do have to be very careful that the receiver of the data can handle the parallel connections, because each running task holds its own.
Examples	Writing to Redis instances
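
The Structure row above can be sketched in plain Java. This is only a minimal illustration under stated assumptions, not the original author's implementation. The method names `checkOutputSpecs`, `getRecordWriter`, `write` and `close` mirror Hadoop's `OutputFormat` and `RecordWriter` classes, but the signatures are simplified (a `Map` stands in for the job configuration, and there is no `JobContext` or `TaskAttemptContext`) so the sketch runs without Hadoop on the classpath. `ExternalSourceOutputSketch`, `InMemoryStore`, the `sketch.external.hosts` key and the hash-based routing of keys across instances are all hypothetical; `InMemoryStore` stands in for a real client connection to an external system such as a Redis instance.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ExternalSourceOutputSketch {

    /** Hypothetical stand-in for one external server (for example, one Redis instance). */
    static class InMemoryStore {
        private static final Map<String, InMemoryStore> SERVERS = new ConcurrentHashMap<>();
        final Map<String, String> data = new ConcurrentHashMap<>();
        private int openConnections = 0;

        // Returns the single shared "server" for a host name, creating it on first use.
        static InMemoryStore forHost(String host) {
            return SERVERS.computeIfAbsent(host, h -> new InMemoryStore());
        }
        synchronized void connect() { openConnections++; }
        synchronized void disconnect() { openConnections--; }
        synchronized int openConnections() { return openConnections; }
        void put(String key, String value) { data.put(key, value); }
    }

    /** Simplified analogue of a RecordWriter: one instance per map or reduce task. */
    static class ExternalRecordWriter {
        private final List<InMemoryStore> stores = new ArrayList<>();

        // Connections are established during construction, as the pattern describes.
        ExternalRecordWriter(String[] hosts) {
            for (String host : hosts) {
                InMemoryStore store = InMemoryStore.forHost(host.trim());
                store.connect();
                stores.add(store);
            }
        }

        // Assumption: keys are spread across instances by hash code.
        void write(String key, String value) {
            int index = Math.floorMod(key.hashCode(), stores.size());
            stores.get(index).put(key, value);
        }

        // Releases this task's connections once all of its output is written.
        void close() {
            for (InMemoryStore store : stores) {
                store.disconnect();
            }
        }
    }

    /** Simplified analogue of an OutputFormat. */
    static class ExternalOutputFormat {
        static final String HOSTS_KEY = "sketch.external.hosts"; // hypothetical config key

        // Verifies the output specification before the job would be submitted.
        static void checkOutputSpecs(Map<String, String> conf) {
            String hosts = conf.get(HOSTS_KEY);
            if (hosts == null || hosts.trim().isEmpty()) {
                throw new IllegalArgumentException("Missing " + HOSTS_KEY);
            }
        }

        // Creates and initializes the writer for one task.
        static ExternalRecordWriter getRecordWriter(Map<String, String> conf) {
            return new ExternalRecordWriter(conf.get(HOSTS_KEY).split(","));
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(ExternalOutputFormat.HOSTS_KEY, "hostA,hostB");
        ExternalOutputFormat.checkOutputSpecs(conf);

        // Two writers stand in for two tasks running at the same time.
        ExternalRecordWriter task1 = ExternalOutputFormat.getRecordWriter(conf);
        ExternalRecordWriter task2 = ExternalOutputFormat.getRecordWriter(conf);
        task1.write("user1", "10");
        task2.write("user2", "20");
        System.out.println("open connections on hostA: "
                + InMemoryStore.forHost("hostA").openConnections());
        task1.close();
        task2.close();
    }
}
```

In this sketch every writer opens one connection to every instance, so the number of connections an instance sees grows with the number of tasks running at once. That is the load the Motivation and Performance analysis rows warn about, and it is worth checking against the destination system's connection limits before sizing the job.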