java.lang.Object
javax.management.Attribute
org.archive.crawler.settings.Type
org.archive.crawler.settings.ComplexType
org.archive.crawler.settings.ModuleType
org.archive.crawler.framework.Processor
org.archive.crawler.processor.recrawl.PersistProcessor
org.archive.crawler.processor.recrawl.PersistOnlineProcessor
org.archive.crawler.processor.recrawl.PersistStoreProcessor
public class PersistStoreProcessor
extends PersistOnlineProcessor
Store CrawlURI attributes from latest fetch to persistent storage for consultation by a later recrawl.
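
The store-then-consult idea described above can be sketched as follows. This is a minimal illustration, not the class's actual implementation: the in-memory map stands in for the persistent store, and `FetchHistorySketch`, `storeLatestFetch`, `loadHistory`, and the attribute names are all hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class FetchHistorySketch {
    // Hypothetical stand-in for the persistent store: URI -> attributes of the latest fetch.
    private final Map<String, Map<String, Object>> store = new HashMap<>();

    /** Record the attributes of the latest fetch, replacing any earlier entry for the URI. */
    public void storeLatestFetch(String uri, Map<String, Object> attributes) {
        store.put(uri, new HashMap<>(attributes)); // defensive copy
    }

    /** Return the stored attributes for a URI, or null if it was never stored. */
    public Map<String, Object> loadHistory(String uri) {
        Map<String, Object> found = store.get(uri);
        return found == null ? null : Collections.unmodifiableMap(found);
    }

    public static void main(String[] args) {
        FetchHistorySketch history = new FetchHistorySketch();
        Map<String, Object> attrs = new HashMap<>();
        attrs.put("fetch-status", 200);               // hypothetical attribute name
        attrs.put("content-digest", "sha1:EXAMPLE");  // hypothetical attribute name
        history.storeLatestFetch("http://example.com/", attrs);
        // A later recrawl would consult the stored attributes before refetching.
        System.out.println(history.loadHistory("http://example.com/"));
    }
}
```

A real deployment would need the data to survive a process restart, which an in-memory map does not provide; the sketch only shows the shape of the write and read paths.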
| Nested Class Summary |
|---|
| Nested classes/interfaces inherited from class org.archive.crawler.settings.ComplexType |
|---|
| ComplexType.MBeanAttributeInfoIterator |
| Field Summary |
|---|
| Fields inherited from class org.archive.crawler.processor.recrawl.PersistOnlineProcessor |
|---|
| historyDb, store |
| Fields inherited from class org.archive.crawler.processor.recrawl.PersistProcessor |
|---|
| URI_HISTORY_DBNAME |
| Fields inherited from class org.archive.crawler.framework.Processor |
|---|
| ATTR_DECIDE_RULES, ATTR_ENABLED, attrDecideRules |
| Fields inherited from class org.archive.crawler.settings.ComplexType |
|---|
| definition, definitionMap |
| Constructor Summary |
|---|
| PersistStoreProcessor(java.lang.String name): Usual constructor. |
| Method Summary | |
|---|---|
| void | crawlCheckpoint(java.io.File checkpointDir): Called by CrawlController when checkpointing. |
| void | crawlEnded(java.lang.String sExitMessage): Called when a CrawlController has ended a crawl and is about to exit. |
| void | crawlEnding(java.lang.String sExitMessage): Called when a CrawlController is ending a crawl (for any reason). |
| void | crawlPaused(java.lang.String statusMessage): Called when a CrawlController is actually paused (all threads are idle). |
| void | crawlPausing(java.lang.String statusMessage): Called when a CrawlController is going to be paused. |
| void | crawlResuming(java.lang.String statusMessage): Called when a CrawlController is resuming a crawl that had been paused. |
| void | crawlStarted(java.lang.String message): Called on crawl start. |
| protected void | initialTasks(): Classes subclassing this one should override this method to perform processor-specific actions. |
| protected void | innerProcess(CrawlURI curi): Classes subclassing this one should override this method to perform their custom actions on the CrawlURI. |
| Methods inherited from class org.archive.crawler.processor.recrawl.PersistOnlineProcessor |
|---|
| finalTasks, initStore |
| Methods inherited from class org.archive.crawler.processor.recrawl.PersistProcessor |
|---|
| copyPersistSourceToHistoryMap, historyDatabaseConfig, main, persistKeyFor, populatePersistEnv, shouldLoad, shouldStore |
| Methods inherited from class org.archive.crawler.framework.Processor |
|---|
| checkForInterrupt, getController, getDecideRule, getDefaultNextProcessor, innerRejectProcess, isContentToProcess, isExpectedMimeType, isHttpTransactionContentToProcess, kickUpdate, process, report, rulesAccept, rulesAccept, setDefaultNextProcessor, spawn |
| Methods inherited from class org.archive.crawler.settings.ModuleType |
|---|
| addElement, listUsedFiles |
| Methods inherited from class org.archive.crawler.settings.Type |
|---|
| addConstraint, equals, getConstraints, getLegalValueType, isExpertSetting, isOverrideable, isTransient, setExpertSetting, setLegalValueType, setOverrideable, setTransient |
| Methods inherited from class javax.management.Attribute |
|---|
| getName |
| Methods inherited from class java.lang.Object |
|---|
| clone, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
| Constructor Detail |
|---|
public PersistStoreProcessor(java.lang.String name)
Parameters:
name -
| Method Detail |
|---|
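protected void initialTasks()
Description copied from class: Processor
Classes subclassing this one should override this method to perform processor-specific actions. This method is guaranteed to be called after the crawl is set up, but before any URI processing has occurred.
Overrides:
initialTasks in class PersistOnlineProcessor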
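protected void innerProcess(CrawlURI curi)
throws java.lang.InterruptedException
Description copied from class: Processor
Classes subclassing this one should override this method to perform their custom actions on the CrawlURI.
Overrides:
innerProcess in class Processor
Parameters:
curi - The CrawlURI being processed.
Throws:
java.lang.InterruptedException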
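
The override pattern behind innerProcess can be sketched in isolation. The classes below are hypothetical stand-ins rather than the Heritrix types: `Uri` replaces CrawlURI, `BaseProcessor` replaces Processor, and the "positive fetch status" condition is an assumption made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class InnerProcessSketch {
    /** Hypothetical stand-in for CrawlURI: a URI string plus a fetch status. */
    static class Uri {
        final String uri;
        final int fetchStatus;
        Uri(String uri, int fetchStatus) {
            this.uri = uri;
            this.fetchStatus = fetchStatus;
        }
    }

    /** Hypothetical stand-in for Processor: process() owns the control flow. */
    static abstract class BaseProcessor {
        public final void process(Uri curi) throws InterruptedException {
            if (Thread.interrupted()) {
                throw new InterruptedException("interrupted before " + curi.uri);
            }
            innerProcess(curi); // subclass hook
        }
        protected abstract void innerProcess(Uri curi) throws InterruptedException;
    }

    /** Subclass that overrides the hook to record URIs it would persist. */
    static class RecordingProcessor extends BaseProcessor {
        final List<String> stored = new ArrayList<>();
        @Override
        protected void innerProcess(Uri curi) {
            if (curi.fetchStatus > 0) { // assumption: only record successful fetches
                stored.add(curi.uri);
            }
        }
    }

    /** Runs two URIs through the processor and returns what was recorded. */
    static List<String> demo() {
        RecordingProcessor p = new RecordingProcessor();
        try {
            p.process(new Uri("http://example.com/", 200));
            p.process(new Uri("http://example.com/failed", -1));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return p.stored;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Keeping the interrupt check in the base class means each subclass only has to supply its per-URI action.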
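public void crawlCheckpoint(java.io.File checkpointDir)
throws java.lang.Exception
Description copied from interface: CrawlStatusListener
Called by CrawlController when checkpointing.
Specified by:
crawlCheckpoint in interface CrawlStatusListener
Parameters:
checkpointDir - Checkpoint dir. Write checkpoint state here.
Throws:
java.lang.Exception - A fatal exception. Any exceptions that are let out of this checkpoint are assumed fatal and terminate further checkpoint processing.

public void crawlEnded(java.lang.String sExitMessage)
Description copied from interface: CrawlStatusListener
Called when a CrawlController has ended a crawl and is about to exit.
Specified by:
crawlEnded in interface CrawlStatusListener
Parameters:
sExitMessage - Type of exit. Should be one of the STATUS constants defined in CrawlJob.
See Also:
CrawlJob

public void crawlEnding(java.lang.String sExitMessage)
Description copied from interface: CrawlStatusListener
Called when a CrawlController is ending a crawl (for any reason).
Specified by:
crawlEnding in interface CrawlStatusListener
Parameters:
sExitMessage - Type of exit. Should be one of the STATUS constants defined in CrawlJob.
See Also:
CrawlJob

public void crawlPaused(java.lang.String statusMessage)
Description copied from interface: CrawlStatusListener
Called when a CrawlController is actually paused (all threads are idle).
Specified by:
crawlPaused in interface CrawlStatusListener
Parameters:
statusMessage - Should be CrawlJob.STATUS_PAUSED. Passed for convenience.

public void crawlPausing(java.lang.String statusMessage)
Description copied from interface: CrawlStatusListener
Called when a CrawlController is going to be paused.
Specified by:
crawlPausing in interface CrawlStatusListener
Parameters:
statusMessage - Should be STATUS_WAITING_FOR_PAUSE. Passed for convenience.

public void crawlResuming(java.lang.String statusMessage)
Description copied from interface: CrawlStatusListener
Called when a CrawlController is resuming a crawl that had been paused.
Specified by:
crawlResuming in interface CrawlStatusListener
Parameters:
statusMessage - Should be CrawlJob.STATUS_RUNNING. Passed for convenience.

public void crawlStarted(java.lang.String message)
Description copied from interface: CrawlStatusListener
Called on crawl start.
Specified by:
crawlStarted in interface CrawlStatusListener
Parameters:
message - Start message.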
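
The order in which the status callbacks above would fire over one pause/resume cycle can be sketched with a recording listener. `StatusListener` is a hypothetical, trimmed-down stand-in for CrawlStatusListener (crawlCheckpoint is omitted), the status strings are placeholders rather than the real CrawlJob constants, and the sequence is inferred from the method descriptions only.

```java
import java.util.ArrayList;
import java.util.List;

public class CrawlLifecycleSketch {
    /** Hypothetical stand-in mirroring the callback names documented above. */
    interface StatusListener {
        void crawlStarted(String message);
        void crawlPausing(String statusMessage);
        void crawlPaused(String statusMessage);
        void crawlResuming(String statusMessage);
        void crawlEnding(String sExitMessage);
        void crawlEnded(String sExitMessage);
    }

    /** Listener that only records which callback was invoked, in order. */
    static class RecordingListener implements StatusListener {
        final List<String> events = new ArrayList<>();
        public void crawlStarted(String message) { events.add("started"); }
        public void crawlPausing(String statusMessage) { events.add("pausing"); }
        public void crawlPaused(String statusMessage) { events.add("paused"); }
        public void crawlResuming(String statusMessage) { events.add("resuming"); }
        public void crawlEnding(String sExitMessage) { events.add("ending"); }
        public void crawlEnded(String sExitMessage) { events.add("ended"); }
    }

    /** Drives the listener the way a controller might over one pause/resume cycle. */
    static List<String> simulate() {
        RecordingListener listener = new RecordingListener();
        listener.crawlStarted("crawl started");   // placeholder message
        listener.crawlPausing("pause requested"); // pause requested, threads still busy
        listener.crawlPaused("paused");           // all threads idle
        listener.crawlResuming("running");
        listener.crawlEnding("finished");         // ending, for any reason
        listener.crawlEnded("finished");          // about to exit
        return listener.events;
    }

    public static void main(String[] args) {
        System.out.println(simulate());
    }
}
```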