Class CqlBulkRecordWriter

  • All Implemented Interfaces:
    org.apache.hadoop.mapred.RecordWriter<java.lang.Object,​java.util.List<java.nio.ByteBuffer>>

    public class CqlBulkRecordWriter
    extends org.apache.hadoop.mapreduce.RecordWriter<java.lang.Object,​java.util.List<java.nio.ByteBuffer>>
    implements org.apache.hadoop.mapred.RecordWriter<java.lang.Object,​java.util.List<java.nio.ByteBuffer>>
    The CqlBulkRecordWriter maps the output <key, value> pairs to a Cassandra column family. In particular, it binds the variables in the value to the prepared statement, which it associates with the key, and in turn with the responsible endpoint.

    Furthermore, this writer groups the CQL queries by the endpoint responsible for the rows being affected. This allows the CQL queries to be executed in parallel, directly against the responsible endpoint.

    See Also:
    CqlBulkOutputFormat
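    As a sketch of how this writer is typically wired in: a job driver selects CqlBulkOutputFormat (which creates CqlBulkRecordWriter instances) and configures the output keyspace and table. This is a hypothetical configuration fragment; the ConfigHelper method names and the keyspace/table names are assumptions based on the org.apache.cassandra.hadoop package, not taken from this page:

    ```java
    import org.apache.cassandra.hadoop.ConfigHelper;
    import org.apache.cassandra.hadoop.cql3.CqlBulkOutputFormat;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    // Hypothetical driver fragment: output configuration only, mapper/reducer omitted.
    Job job = Job.getInstance(new Configuration(), "cql-bulk-load");
    job.setOutputFormatClass(CqlBulkOutputFormat.class);           // produces CqlBulkRecordWriter
    ConfigHelper.setOutputKeyspace(job.getConfiguration(), "my_keyspace");  // assumed helper
    ConfigHelper.setOutputColumnFamily(job.getConfiguration(), "my_table"); // assumed helper
    job.setOutputKeyClass(Object.class);                           // the key is unused by the writer
    job.setOutputValueClass(java.util.List.class);                 // values are List<ByteBuffer>
    ```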
    • Field Detail

      • BUFFER_SIZE_IN_MB

        public static final java.lang.String BUFFER_SIZE_IN_MB
        See Also:
        Constant Field Values
      • STREAM_THROTTLE_MBITS

        public static final java.lang.String STREAM_THROTTLE_MBITS
        See Also:
        Constant Field Values
      • MAX_FAILED_HOSTS

        public static final java.lang.String MAX_FAILED_HOSTS
        See Also:
        Constant Field Values
      • conf

        protected final org.apache.hadoop.conf.Configuration conf
      • maxFailures

        protected final int maxFailures
      • bufferSize

        protected final int bufferSize
      • writer

        protected java.io.Closeable writer
      • progress

        protected org.apache.hadoop.util.Progressable progress
      • context

        protected org.apache.hadoop.mapreduce.TaskAttemptContext context
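    The three public string constants above are Hadoop Configuration property keys that tune the writer. A minimal sketch of setting them, assuming a Job named `job`; the numeric values are illustrative, not documented defaults:

    ```java
    import org.apache.hadoop.conf.Configuration;

    // Configuration fragment: tune the bulk writer via its property keys.
    Configuration conf = job.getConfiguration();
    conf.set(CqlBulkRecordWriter.BUFFER_SIZE_IN_MB, "64");      // SSTable buffer size before flushing
    conf.set(CqlBulkRecordWriter.STREAM_THROTTLE_MBITS, "200"); // throttle streaming to the cluster
    conf.set(CqlBulkRecordWriter.MAX_FAILED_HOSTS, "2");        // tolerate up to 2 failed hosts
    ```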
    • Method Detail

      • getOutputLocation

        protected java.lang.String getOutputLocation()
                                              throws java.io.IOException
        Throws:
        java.io.IOException
      • write

        public void write​(java.lang.Object key,
                          java.util.List<java.nio.ByteBuffer> values)
                   throws java.io.IOException

        The column values must be supplied in the order in which they appear in the prepared insert statement. The key is not used, so it can be null or any object.

        Specified by:
        write in interface org.apache.hadoop.mapred.RecordWriter<java.lang.Object,​java.util.List<java.nio.ByteBuffer>>
        Specified by:
        write in class org.apache.hadoop.mapreduce.RecordWriter<java.lang.Object,​java.util.List<java.nio.ByteBuffer>>
        Parameters:
        key - any object or null.
        values - the values to write.
        Throws:
        java.io.IOException
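        To illustrate the ordering requirement, the value list for one row can be built by hand. The table layout (id int, name text, score double) and the manual serialization below are illustrative assumptions; real code would typically use the driver's CQL type serializers:

        ```java
        import java.nio.ByteBuffer;
        import java.nio.charset.StandardCharsets;
        import java.util.Arrays;
        import java.util.List;

        public class BulkRowValues {
            // Build the value list for one hypothetical row (id int, name text, score double).
            // The order must match the bound variables of the prepared INSERT statement.
            static List<ByteBuffer> rowValues(int id, String name, double score) {
                return Arrays.asList(
                    ByteBuffer.allocate(4).putInt(0, id),                          // CQL int: 4-byte big-endian
                    ByteBuffer.wrap(name.getBytes(StandardCharsets.UTF_8)),        // CQL text: UTF-8 bytes
                    ByteBuffer.allocate(8).putDouble(0, score)                     // CQL double: 8-byte IEEE 754
                );
            }

            public static void main(String[] args) {
                List<ByteBuffer> values = rowValues(42, "alice", 3.5);
                System.out.println(values.size());            // prints 3
                System.out.println(values.get(0).getInt(0));  // prints 42
                // writer.write(null, values);  -- the key may be null
            }
        }
        ```

        Each ByteBuffer is positioned at the start of its payload, which is why the absolute `putInt(0, ...)` / `putDouble(0, ...)` forms are used: they do not advance the buffer's position.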
      • close

        public void close​(org.apache.hadoop.mapreduce.TaskAttemptContext context)
                   throws java.io.IOException,
                          java.lang.InterruptedException
        Specified by:
        close in class org.apache.hadoop.mapreduce.RecordWriter<java.lang.Object,​java.util.List<java.nio.ByteBuffer>>
        Throws:
        java.io.IOException
        java.lang.InterruptedException
      • close

        @Deprecated
        public void close​(org.apache.hadoop.mapred.Reporter reporter)
                   throws java.io.IOException
        Deprecated.
        Implements the deprecated org.apache.hadoop.mapred.RecordWriter interface for Hadoop streaming.
        Specified by:
        close in interface org.apache.hadoop.mapred.RecordWriter<java.lang.Object,​java.util.List<java.nio.ByteBuffer>>
        Throws:
        java.io.IOException