java.lang.Object
javax.management.Attribute
org.archive.crawler.settings.Type
org.archive.crawler.settings.ComplexType
org.archive.crawler.settings.ModuleType
org.archive.crawler.framework.Processor
org.archive.crawler.processor.recrawl.PersistProcessor
public abstract class PersistProcessor
extends Processor
Superclass for Processors that use BDB-JE to persist URI state, most notably URI history.
| Nested Class Summary |
|---|
| Nested classes/interfaces inherited from class org.archive.crawler.settings.ComplexType |
|---|
ComplexType.MBeanAttributeInfoIterator |
| Field Summary | |
|---|---|
| static java.lang.String | URI_HISTORY_DBNAME: name of the history database |
| Fields inherited from class org.archive.crawler.framework.Processor |
|---|
ATTR_DECIDE_RULES, ATTR_ENABLED, attrDecideRules |
| Fields inherited from class org.archive.crawler.settings.ComplexType |
|---|
definition, definitionMap |
| Constructor Summary | |
|---|---|
| PersistProcessor(java.lang.String name, java.lang.String string) | Usual constructor. |
| Method Summary | |
|---|---|
| static int | copyPersistSourceToHistoryMap(java.io.File context, java.lang.String sourcePath, com.sleepycat.collections.StoredSortedMap<java.lang.String,st.ata.util.AList> historyMap): Populates a given StoredSortedMap (history map) from an old environment db or a persist log. |
| protected static com.sleepycat.je.DatabaseConfig | historyDatabaseConfig() |
| static void | main(java.lang.String[] args): Utility main for importing a log into a BDB-JE environment or moving a database between environments (2 arguments), or simply dumping a log to stderr in a more readable format (1 argument). |
| java.lang.String | persistKeyFor(CrawlURI curi): Returns a preferred String key for persisting the given CrawlURI's AList state. |
| static int | populatePersistEnv(java.lang.String sourcePath, java.io.File envFile): Populates a new environment db from an old environment db or a persist log. |
| protected boolean | shouldLoad(CrawlURI curi): Whether the current CrawlURI's state should be loaded. |
| protected boolean | shouldStore(CrawlURI curi): Whether the current CrawlURI's state should be persisted (to a log or directly to the database). |
| Methods inherited from class org.archive.crawler.framework.Processor |
|---|
checkForInterrupt, finalTasks, getController, getDecideRule, getDefaultNextProcessor, initialTasks, innerProcess, innerRejectProcess, isContentToProcess, isExpectedMimeType, isHttpTransactionContentToProcess, kickUpdate, process, report, rulesAccept, rulesAccept, setDefaultNextProcessor, spawn |
| Methods inherited from class org.archive.crawler.settings.ModuleType |
|---|
addElement, listUsedFiles |
| Methods inherited from class org.archive.crawler.settings.Type |
|---|
addConstraint, equals, getConstraints, getLegalValueType, isExpertSetting, isOverrideable, isTransient, setExpertSetting, setLegalValueType, setOverrideable, setTransient |
| Methods inherited from class javax.management.Attribute |
|---|
getName |
| Methods inherited from class java.lang.Object |
|---|
clone, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
| Field Detail |
|---|
public static final java.lang.String URI_HISTORY_DBNAME
| Constructor Detail |
|---|
public PersistProcessor(java.lang.String name,
java.lang.String string)
Parameters:
name -
string -
| Method Detail |
|---|
protected static com.sleepycat.je.DatabaseConfig historyDatabaseConfig()
public java.lang.String persistKeyFor(CrawlURI curi)
Parameters:
curi - CrawlURI
protected boolean shouldStore(CrawlURI curi)
Parameters:
curi - CrawlURI
protected boolean shouldLoad(CrawlURI curi)
Parameters:
curi - CrawlURI
public static int populatePersistEnv(java.lang.String sourcePath,
java.io.File envFile)
throws com.sleepycat.je.DatabaseException,
java.io.IOException
Parameters:
sourcePath - source of old entries: can be a path to an existing environment db, or a URL or path to a persist log
envFile - path to new environment db (or null for a dry run)
Throws:
com.sleepycat.je.DatabaseException
java.io.IOException
public static int copyPersistSourceToHistoryMap(java.io.File context,
java.lang.String sourcePath,
com.sleepycat.collections.StoredSortedMap<java.lang.String,st.ata.util.AList> historyMap)
throws com.sleepycat.je.DatabaseException,
java.io.IOException,
java.net.MalformedURLException,
java.io.UnsupportedEncodingException
Parameters:
sourcePath - source of old entries: can be a path to an existing environment db, or a URL or path to a persist log
historyMap - map to populate (or null for a dry run)
Throws:
com.sleepycat.je.DatabaseException
java.io.IOException
java.net.MalformedURLException
java.io.UnsupportedEncodingException
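The dry-run convention described for historyMap (pass null to read the source without writing anything) can be illustrated with a minimal, self-contained sketch. It does not use BDB-JE or Heritrix classes: DryRunCopySketch, copyEntries, and the "key value" line format are hypothetical stand-ins, a plain java.util.SortedMap takes the place of StoredSortedMap, and treating the int result as a count of entries seen is an assumption, since this page does not say what the return value means.

```java
import java.util.List;
import java.util.SortedMap;

/** Hypothetical stand-in illustrating a "null target means dry run" copy. */
class DryRunCopySketch {

    /**
     * Reads "key value" lines (an assumed format, not the real persist log)
     * and copies them into target. A null target performs a dry run:
     * entries are parsed and counted but nothing is stored.
     *
     * @return number of well-formed entries seen (assumed meaning)
     */
    static int copyEntries(List<String> lines, SortedMap<String, String> target) {
        int count = 0;
        for (String line : lines) {
            int split = line.indexOf(' ');
            if (split <= 0) {
                continue; // skip malformed lines (assumed policy)
            }
            if (target != null) {
                target.put(line.substring(0, split), line.substring(split + 1));
            }
            count++;
        }
        return count;
    }
}
```

Calling copyEntries with a null map first is a cheap way to check that a source parses before committing to a real copy, which is presumably the purpose of the dry-run option documented above.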
public static void main(java.lang.String[] args)
throws com.sleepycat.je.DatabaseException,
java.io.IOException
Parameters:
args - command-line arguments
Throws:
com.sleepycat.je.DatabaseException
java.io.IOException
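The method summary says main behaves differently for one argument (dump a log to stderr) and two arguments (import a log or move a database into an environment). The sketch below models only that argument-count dispatch. PersistMainSketch and describe are hypothetical names, no Heritrix code is called, and the argument order (source first, target environment second) is an assumption borrowed from the parameter order of populatePersistEnv(sourcePath, envFile).

```java
/** Hypothetical stand-in: mirrors only the documented argument-count dispatch. */
class PersistMainSketch {

    /** Describes what the utility would do for the given arguments. */
    static String describe(String[] args) {
        switch (args.length) {
            case 1:
                // One argument: dump the log in a more readable format.
                return "dump " + args[0] + " to stderr";
            case 2:
                // Two arguments: import a log, or move a database,
                // into the target environment (argument order assumed).
                return "populate " + args[1] + " from " + args[0];
            default:
                return "usage: <source> [<target-env-dir>]";
        }
    }

    public static void main(String[] args) {
        System.err.println(describe(args));
    }
}
```

In this model the one-argument form corresponds naturally to the dry-run case of populatePersistEnv (a null envFile), though this page does not confirm that main is implemented that way.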