Package org.apache.cassandra.service
Class ActiveRepairService
- java.lang.Object
-
- org.apache.cassandra.service.ActiveRepairService
-
- All Implemented Interfaces:
IEndpointStateChangeSubscriber, IFailureDetectionEventListener, ActiveRepairServiceMBean
public class ActiveRepairService extends java.lang.Object implements IEndpointStateChangeSubscriber, IFailureDetectionEventListener, ActiveRepairServiceMBean
ActiveRepairService is the starting point for manual "active" repairs. Each user-triggered repair corresponds to one or more repair sessions, one for each token range to repair. One repair session might repair multiple column families. For each of those column families, the repair session requests merkle trees for each replica of the range being repaired, diffs those trees upon receiving them, schedules the streaming of the parts to repair (based on the tree diffs), and waits for all those operations to complete. See RepairSession for more details. A repair session is created through submitRepairSession, which returns a future on the completion of that session.
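The tree-diff step described above can be illustrated with a minimal, self-contained sketch. It is not Cassandra's implementation: the class `MerkleDiffSketch`, its methods, the fixed bucket layout, and the choice of SHA-256 are all assumptions made for illustration. Each bucket stands for the rows of one sub-range on a replica, and only buckets whose hashes differ would need streaming.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: a toy hash tree over fixed buckets, not Cassandra's MerkleTree.
final class MerkleDiffSketch {

    // Builds a binary hash tree in heap layout (node i has children 2i and 2i+1).
    // The bucket count is assumed to be a power of two so that leaves are
    // visited in ascending order during the diff.
    static byte[][] buildTree(List<List<String>> buckets) {
        int n = buckets.size();
        byte[][] tree = new byte[2 * n][];
        for (int i = 0; i < n; i++) {
            String rows = String.join("\n", buckets.get(i));
            tree[n + i] = sha256(rows.getBytes(StandardCharsets.UTF_8));
        }
        for (int i = n - 1; i >= 1; i--) {
            tree[i] = sha256(concat(tree[2 * i], tree[2 * i + 1]));
        }
        return tree;
    }

    // Returns the indexes of buckets whose hashes differ between two replicas,
    // descending only into subtrees whose hashes do not match.
    static List<Integer> diff(byte[][] a, byte[][] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("trees must have the same shape");
        }
        List<Integer> mismatched = new ArrayList<>();
        if (a.length >= 2) {
            collect(a, b, 1, mismatched);
        }
        return mismatched;
    }

    private static void collect(byte[][] a, byte[][] b, int node, List<Integer> out) {
        if (Arrays.equals(a[node], b[node])) {
            return; // identical subtree: nothing to stream
        }
        int leafCount = a.length / 2;
        if (node >= leafCount) {
            out.add(node - leafCount); // leaf: this bucket would be streamed
            return;
        }
        collect(a, b, 2 * node, out);
        collect(a, b, 2 * node + 1, out);
    }

    private static byte[] sha256(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    private static byte[] concat(byte[] x, byte[] y) {
        byte[] out = Arrays.copyOf(x, x.length + y.length);
        System.arraycopy(y, 0, out, x.length, y.length);
        return out;
    }
}
```

Comparing the root hashes first means that two replicas in full agreement are detected with a single comparison; the cost of the diff grows only with the number of mismatching sub-ranges.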
-
-
Nested Class Summary
Nested Classes
Modifier and Type Class Description
static class
ActiveRepairService.ConsistentSessions
static class
ActiveRepairService.ParentRepairSession
We keep a ParentRepairSession around for the duration of the entire repair. For example, on a 256-token vnode cluster with rf=3 we would have 768 RepairSessions but only one ParentRepairSession.
static class
ActiveRepairService.ParentRepairStatus
static class
ActiveRepairService.RepairCommandExecutorHandle
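The 768 figure in the ParentRepairSession description is consistent with multiplying the vnode token count by the replication factor, with one RepairSession per token range to repair. The sketch below only reproduces that arithmetic; reading the figure as a simple product is an interpretation, and `RepairSessionCountSketch` is a hypothetical name.

```java
// Illustrative arithmetic only; not part of the Cassandra API.
final class RepairSessionCountSketch {

    // Assumes one RepairSession per token range, and that the number of
    // ranges is vnode tokens multiplied by the replication factor.
    static int repairSessions(int vnodeTokens, int replicationFactor) {
        return vnodeTokens * replicationFactor;
    }

    public static void main(String[] args) {
        // 256 tokens with rf=3, as in the description above.
        System.out.println(repairSessions(256, 3));
    }
}
```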
-
Field Summary
Fields
Modifier and Type Field Description
ActiveRepairService.ConsistentSessions
consistent
static ActiveRepairService
instance
static java.util.UUID
NO_PENDING_REPAIR
static long
UNREPAIRED_SSTABLE
-
Fields inherited from interface org.apache.cassandra.service.ActiveRepairServiceMBean
MBEAN_NAME
-
-
Constructor Summary
Constructors
Constructor Description
ActiveRepairService(IFailureDetector failureDetector, Gossiper gossiper)
-
Method Summary
Modifier and Type Method Description
void
abort(java.util.function.Predicate<ActiveRepairService.ParentRepairSession> predicate, java.lang.String message)
Remove any parent repair sessions matching predicate
void
beforeChange(InetAddressAndPort endpoint, EndpointState currentState, ApplicationState newStateKey, VersionedValue newValue)
void
cleanUp(java.util.UUID parentRepairSession, java.util.Set<InetAddressAndPort> endpoints)
Send Verb.CLEANUP_MSG to the given endpoints.
java.util.List<javax.management.openmbean.CompositeData>
cleanupPending(java.util.List<java.lang.String> schemaArgs, java.lang.String rangeString, boolean force)
void
convict(InetAddressAndPort ep, double phi)
Something has happened to a remote node - if that node is a coordinator, we mark the parent repair session id as failed.
void
failSession(java.lang.String session, boolean force)
static EndpointsForRange
getNeighbors(java.lang.String keyspaceName, java.lang.Iterable<Range<Token>> keyspaceLocalRanges, Range<Token> toRepair, java.util.Collection<java.lang.String> dataCenters, java.util.Collection<java.lang.String> hosts)
Return all of the neighbors with whom we share the provided range.
ActiveRepairService.ParentRepairSession
getParentRepairSession(java.util.UUID parentSessionId)
java.util.List<javax.management.openmbean.CompositeData>
getPendingStats(java.util.List<java.lang.String> schemaArgs, java.lang.String rangeString)
int
getRepairPendingCompactionRejectThreshold()
int
getRepairSessionSpaceInMegabytes()
java.util.List<javax.management.openmbean.CompositeData>
getRepairStats(java.util.List<java.lang.String> schemaArgs, java.lang.String rangeString)
java.util.List<java.util.Map<java.lang.String,java.lang.String>>
getSessions(boolean all, java.lang.String rangesStr)
boolean
getUseOffheapMerkleTrees()
void
handleMessage(Message<? extends RepairMessage> message)
void
onAlive(InetAddressAndPort endpoint, EndpointState state)
void
onChange(InetAddressAndPort endpoint, ApplicationState state, VersionedValue value)
void
onDead(InetAddressAndPort endpoint, EndpointState state)
void
onJoin(InetAddressAndPort endpoint, EndpointState epState)
Used to inform interested parties about the change in the state of the specified endpoint.
void
onRemove(InetAddressAndPort endpoint)
void
onRestart(InetAddressAndPort endpoint, EndpointState state)
Called whenever a node is restarted.
int
parentRepairSessionCount()
int
parentRepairSessionsCount()
Each ongoing repair (incremental and non-incremental) is represented by an ActiveRepairService.ParentRepairSession entry in the ActiveRepairService cache.
java.util.UUID
prepareForRepair(java.util.UUID parentRepairSession, InetAddressAndPort coordinator, java.util.Set<InetAddressAndPort> endpoints, RepairOption options, boolean isForcedRepair, java.util.List<ColumnFamilyStore> columnFamilyStores)
void
recordRepairStatus(int cmd, ActiveRepairService.ParentRepairStatus parentRepairStatus, java.util.List<java.lang.String> messages)
void
registerParentRepairSession(java.util.UUID parentRepairSession, InetAddressAndPort coordinator, java.util.List<ColumnFamilyStore> columnFamilyStores, java.util.Collection<Range<Token>> ranges, boolean isIncremental, long repairedAt, boolean isGlobal, PreviewKind previewKind)
ActiveRepairService.ParentRepairSession
removeParentRepairSession(java.util.UUID parentSessionId)
Called when the repair session is done (either failed or anticompaction has completed). Clears out any snapshots created by this repair.
static java.util.concurrent.ThreadPoolExecutor
repairCommandExecutor()
int
sessionCount()
void
setRepairPendingCompactionRejectThreshold(int value)
void
setRepairSessionSpaceInMegabytes(int sizeInMegabytes)
void
setUseOffheapMerkleTrees(boolean value)
void
start()
void
stop()
RepairSession
submitRepairSession(java.util.UUID parentRepairSession, CommonRange range, java.lang.String keyspace, RepairParallelism parallelismDegree, boolean isIncremental, boolean pullRepair, PreviewKind previewKind, boolean optimiseStreams, com.google.common.util.concurrent.ListeningExecutorService executor, java.lang.String... cfnames)
Requests repairs for the given keyspace and column families.
void
terminateSessions()
static boolean
verifyCompactionsPendingThreshold(java.util.UUID parentRepairSession, PreviewKind previewKind)
-
-
-
Field Detail
-
consistent
public final ActiveRepairService.ConsistentSessions consistent
-
instance
public static final ActiveRepairService instance
-
UNREPAIRED_SSTABLE
public static final long UNREPAIRED_SSTABLE
- See Also:
- Constant Field Values
-
NO_PENDING_REPAIR
public static final java.util.UUID NO_PENDING_REPAIR
-
-
Constructor Detail
-
ActiveRepairService
public ActiveRepairService(IFailureDetector failureDetector, Gossiper gossiper)
-
-
Method Detail
-
repairCommandExecutor
public static java.util.concurrent.ThreadPoolExecutor repairCommandExecutor()
-
start
public void start()
-
stop
public void stop()
-
getSessions
public java.util.List<java.util.Map<java.lang.String,java.lang.String>> getSessions(boolean all, java.lang.String rangesStr)
- Specified by:
getSessions in interface ActiveRepairServiceMBean
-
failSession
public void failSession(java.lang.String session, boolean force)
- Specified by:
failSession in interface ActiveRepairServiceMBean
-
setRepairSessionSpaceInMegabytes
public void setRepairSessionSpaceInMegabytes(int sizeInMegabytes)
- Specified by:
setRepairSessionSpaceInMegabytes in interface ActiveRepairServiceMBean
-
getRepairSessionSpaceInMegabytes
public int getRepairSessionSpaceInMegabytes()
- Specified by:
getRepairSessionSpaceInMegabytes in interface ActiveRepairServiceMBean
-
getRepairStats
public java.util.List<javax.management.openmbean.CompositeData> getRepairStats(java.util.List<java.lang.String> schemaArgs, java.lang.String rangeString)
- Specified by:
getRepairStats in interface ActiveRepairServiceMBean
-
getPendingStats
public java.util.List<javax.management.openmbean.CompositeData> getPendingStats(java.util.List<java.lang.String> schemaArgs, java.lang.String rangeString)
- Specified by:
getPendingStats in interface ActiveRepairServiceMBean
-
cleanupPending
public java.util.List<javax.management.openmbean.CompositeData> cleanupPending(java.util.List<java.lang.String> schemaArgs, java.lang.String rangeString, boolean force)
- Specified by:
cleanupPending in interface ActiveRepairServiceMBean
-
parentRepairSessionsCount
public int parentRepairSessionsCount()
Description copied from interface: ActiveRepairServiceMBean
Each ongoing repair (incremental and non-incremental) is represented by an ActiveRepairService.ParentRepairSession entry in the ActiveRepairService cache. Returns the current number of ongoing repairs (the current number of cached entries).
- Specified by:
parentRepairSessionsCount in interface ActiveRepairServiceMBean
- Returns:
- current size of the internal cache holding ActiveRepairService.ParentRepairSession instances
-
submitRepairSession
public RepairSession submitRepairSession(java.util.UUID parentRepairSession, CommonRange range, java.lang.String keyspace, RepairParallelism parallelismDegree, boolean isIncremental, boolean pullRepair, PreviewKind previewKind, boolean optimiseStreams, com.google.common.util.concurrent.ListeningExecutorService executor, java.lang.String... cfnames)
Requests repairs for the given keyspace and column families.
- Returns:
- Future for asynchronous call or null if there is no need to repair
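Because the documented return value may be null, a caller needs a null check before waiting on the result. The following is a minimal sketch of that pattern using `CompletableFuture` as a stand-in for `RepairSession`. The `submit` helper, its empty-list condition, and the returned strings are hypothetical and are not Cassandra's API.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Illustrative only: CompletableFuture stands in for RepairSession.
final class NullableFutureSketch {

    // Hypothetical stand-in for a submit call: returns null when there is
    // nothing to repair, otherwise a future that completes asynchronously.
    static CompletableFuture<String> submit(List<String> tables) {
        if (tables.isEmpty()) {
            return null;
        }
        return CompletableFuture.supplyAsync(() -> "repaired " + String.join(",", tables));
    }

    // Waits for the session if one was created; reports a skip otherwise.
    static String awaitOrSkip(CompletableFuture<String> session) {
        return session == null ? "nothing to repair" : session.join();
    }
}
```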
-
getUseOffheapMerkleTrees
public boolean getUseOffheapMerkleTrees()
- Specified by:
getUseOffheapMerkleTrees in interface ActiveRepairServiceMBean
-
setUseOffheapMerkleTrees
public void setUseOffheapMerkleTrees(boolean value)
- Specified by:
setUseOffheapMerkleTrees in interface ActiveRepairServiceMBean
-
terminateSessions
public void terminateSessions()
-
recordRepairStatus
public void recordRepairStatus(int cmd, ActiveRepairService.ParentRepairStatus parentRepairStatus, java.util.List<java.lang.String> messages)
-
getNeighbors
public static EndpointsForRange getNeighbors(java.lang.String keyspaceName, java.lang.Iterable<Range<Token>> keyspaceLocalRanges, Range<Token> toRepair, java.util.Collection<java.lang.String> dataCenters, java.util.Collection<java.lang.String> hosts)
Return all of the neighbors with whom we share the provided range.
- Parameters:
keyspaceName - keyspace to repair
keyspaceLocalRanges - local ranges for the given keyspaceName
toRepair - token range to repair
dataCenters - the data centers to involve in the repair
- Returns:
- neighbors with whom we share the provided range
-
verifyCompactionsPendingThreshold
public static boolean verifyCompactionsPendingThreshold(java.util.UUID parentRepairSession, PreviewKind previewKind)
-
prepareForRepair
public java.util.UUID prepareForRepair(java.util.UUID parentRepairSession, InetAddressAndPort coordinator, java.util.Set<InetAddressAndPort> endpoints, RepairOption options, boolean isForcedRepair, java.util.List<ColumnFamilyStore> columnFamilyStores)
-
cleanUp
public void cleanUp(java.util.UUID parentRepairSession, java.util.Set<InetAddressAndPort> endpoints)
Send Verb.CLEANUP_MSG to the given endpoints. This results in removing parent session object from the endpoint's cache. This method does not throw an exception in case of a messaging failure.
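The "does not throw on messaging failure" behavior described above is a best-effort fan-out. A minimal sketch of that pattern follows; it does not use Cassandra's messaging layer, and `BestEffortCleanupSketch`, `sendCleanup`, and the `Consumer`-based sender are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Illustrative only: endpoints are plain strings and the sender is injected.
final class BestEffortCleanupSketch {

    // Attempts to notify every endpoint. A failure for one endpoint is
    // recorded and does not prevent the remaining endpoints from being tried.
    static List<String> sendCleanup(Set<String> endpoints, Consumer<String> sender) {
        List<String> failed = new ArrayList<>();
        for (String endpoint : endpoints) {
            try {
                sender.accept(endpoint);
            } catch (RuntimeException e) {
                failed.add(endpoint); // a real service would log this instead of rethrowing
            }
        }
        return failed;
    }
}
```

Returning the failed endpoints is a choice made for this sketch so the behavior can be observed; the documented method returns void.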
-
registerParentRepairSession
public void registerParentRepairSession(java.util.UUID parentRepairSession, InetAddressAndPort coordinator, java.util.List<ColumnFamilyStore> columnFamilyStores, java.util.Collection<Range<Token>> ranges, boolean isIncremental, long repairedAt, boolean isGlobal, PreviewKind previewKind)
-
getParentRepairSession
public ActiveRepairService.ParentRepairSession getParentRepairSession(java.util.UUID parentSessionId)
-
removeParentRepairSession
public ActiveRepairService.ParentRepairSession removeParentRepairSession(java.util.UUID parentSessionId)
Called when the repair session is done (either failed or anticompaction has completed). Clears out any snapshots created by this repair.
- Parameters:
parentSessionId
-
- Returns:
-
handleMessage
public void handleMessage(Message<? extends RepairMessage> message)
-
onJoin
public void onJoin(InetAddressAndPort endpoint, EndpointState epState)
Description copied from interface: IEndpointStateChangeSubscriber
Used to inform interested parties about the change in the state of the specified endpoint.
- Specified by:
onJoin in interface IEndpointStateChangeSubscriber
- Parameters:
endpoint - endpoint for which the state change occurred.
epState - state that actually changed for the above endpoint.
-
beforeChange
public void beforeChange(InetAddressAndPort endpoint, EndpointState currentState, ApplicationState newStateKey, VersionedValue newValue)
- Specified by:
beforeChange in interface IEndpointStateChangeSubscriber
-
onChange
public void onChange(InetAddressAndPort endpoint, ApplicationState state, VersionedValue value)
- Specified by:
onChange in interface IEndpointStateChangeSubscriber
-
onAlive
public void onAlive(InetAddressAndPort endpoint, EndpointState state)
- Specified by:
onAlive in interface IEndpointStateChangeSubscriber
-
onDead
public void onDead(InetAddressAndPort endpoint, EndpointState state)
- Specified by:
onDead in interface IEndpointStateChangeSubscriber
-
onRemove
public void onRemove(InetAddressAndPort endpoint)
- Specified by:
onRemove in interface IEndpointStateChangeSubscriber
-
onRestart
public void onRestart(InetAddressAndPort endpoint, EndpointState state)
Description copied from interface: IEndpointStateChangeSubscriber
Called whenever a node is restarted. Note that there is no guarantee, when that happens, that the node was previously marked down. It will have been only if state.isAlive() == false, as state is from before the restarted node is marked up.
- Specified by:
onRestart in interface IEndpointStateChangeSubscriber
-
convict
public void convict(InetAddressAndPort ep, double phi)
Something has happened to a remote node. If that node is a coordinator, we mark the parent repair session id as failed. The fail marker is kept in the map for 24h to make sure that, if the coordinator does not agree that the repair failed, the entire repair session is still failed.
- Specified by:
convict in interface IFailureDetectionEventListener
- Parameters:
ep - endpoint to be convicted
phi - the value of phi with which ep was convicted
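The 24h fail marker mentioned in the convict description can be pictured as a map of session IDs to failure timestamps that expire after a fixed time-to-live. The sketch below shows one way to build such a structure. It is an assumption about the general shape, not the class's actual data structure, and `FailedSessionMarkers` with its injectable clock is hypothetical.

```java
import java.time.Duration;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Illustrative only: an expiring set of "failed" markers keyed by session ID.
final class FailedSessionMarkers {
    private final Map<UUID, Long> failedAtMillis = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so that expiry can be tested

    FailedSessionMarkers(Duration ttl, LongSupplier clock) {
        this.ttlMillis = ttl.toMillis();
        this.clock = clock;
    }

    // Records that the given parent session failed at the current time.
    void markFailed(UUID parentSessionId) {
        failedAtMillis.put(parentSessionId, clock.getAsLong());
    }

    // True while the marker exists and is younger than the time-to-live.
    // Expired markers are removed lazily on lookup.
    boolean isFailed(UUID parentSessionId) {
        Long failedAt = failedAtMillis.get(parentSessionId);
        if (failedAt == null) {
            return false;
        }
        if (clock.getAsLong() - failedAt >= ttlMillis) {
            failedAtMillis.remove(parentSessionId, failedAt);
            return false;
        }
        return true;
    }
}
```

A production setup would pass `System::currentTimeMillis` as the clock and `Duration.ofHours(24)` as the time-to-live.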
-
getRepairPendingCompactionRejectThreshold
public int getRepairPendingCompactionRejectThreshold()
- Specified by:
getRepairPendingCompactionRejectThreshold in interface ActiveRepairServiceMBean
-
setRepairPendingCompactionRejectThreshold
public void setRepairPendingCompactionRejectThreshold(int value)
- Specified by:
setRepairPendingCompactionRejectThreshold in interface ActiveRepairServiceMBean
-
abort
public void abort(java.util.function.Predicate<ActiveRepairService.ParentRepairSession> predicate, java.lang.String message)
Remove any parent repair sessions matching predicate
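Predicate-based removal of this kind can be sketched with a concurrent map and `java.util.function.Predicate`. The example below is a reduced illustration, not the method's actual body: `AbortByPredicateSketch`, its `Session` type, and the returned list of removed IDs are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

// Illustrative only: a reduced registry of parent sessions keyed by ID.
final class AbortByPredicateSketch {

    // Hypothetical, minimal stand-in for a parent repair session.
    static final class Session {
        final String keyspace;

        Session(String keyspace) {
            this.keyspace = keyspace;
        }
    }

    private final Map<UUID, Session> sessions = new ConcurrentHashMap<>();
    private final List<String> log = new ArrayList<>();

    void register(UUID id, Session session) {
        sessions.put(id, session);
    }

    int count() {
        return sessions.size();
    }

    // Removes every session matching the predicate and records the message
    // for each removal. Returns the IDs that were removed.
    List<UUID> abort(Predicate<Session> predicate, String message) {
        List<UUID> removed = new ArrayList<>();
        for (Map.Entry<UUID, Session> entry : sessions.entrySet()) {
            // remove(key, value) only succeeds if the entry is still present,
            // so a concurrent removal is not counted twice.
            if (predicate.test(entry.getValue()) && sessions.remove(entry.getKey(), entry.getValue())) {
                removed.add(entry.getKey());
                log.add(message + ": " + entry.getKey());
            }
        }
        return removed;
    }
}
```

Iterating a `ConcurrentHashMap` while removing entries is safe because its iterators are weakly consistent and never throw `ConcurrentModificationException`.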
-
parentRepairSessionCount
public int parentRepairSessionCount()
-
sessionCount
public int sessionCount()
-
-