| Classes in org.archive.crawler.datamodel used by org.archive.crawler.admin | |
|---|---|
| CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
| CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
| Classes in org.archive.crawler.datamodel used by org.archive.crawler.datamodel | |
|---|---|
| CandidateURI
A URI, discovered or passed-in, that may be scheduled. |
|
| CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
| CrawlHost
Represents a single remote "host". |
|
| CrawlServer
Represents a single remote "server". |
|
| CrawlSubstats
Collector of statistics for a 'subset' of a crawl, such as a server (host:port), host, or frontier group (e.g. a queue). |
|
| CrawlSubstats.HasCrawlSubstats
|
|
| CrawlSubstats.Stage
|
|
| CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
| CredentialStore
Front door to the credential store. |
|
| FetchStatusCodes
Constant flag codes to be used, in lieu of per-protocol codes (like HTTP's 200, 404, etc.), when network, internal, or out-of-band conditions occur. |
|
| RobotsDirectives
Represents the directives that apply to a user-agent (or set of user-agents). |
|
| RobotsExclusionPolicy
RobotsExclusionPolicy represents the actual policy adopted with respect to a specific remote server, usually constructed by consulting the robots.txt, if any, that the server provided. |
|
| RobotsHonoringPolicy
RobotsHonoringPolicy represents the strategy used by the crawler for determining how robots.txt files will be honored. |
|
| Robotstxt
Utility class for parsing 'robots.txt' format directives and representing them as a list of named user-agents and a map from user-agents to RobotsDirectives. |
|
| UriUniqFilter.HasUriReceiver
URIs that have not been seen before 'visit' this 'Visitor'. |
|
| Classes in org.archive.crawler.datamodel used by org.archive.crawler.datamodel.credential | |
|---|---|
| CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
| Classes in org.archive.crawler.datamodel used by org.archive.crawler.deciderules | |
|---|---|
| CandidateURI
A URI, discovered or passed-in, that may be scheduled. |
|
| CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
| CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
| Classes in org.archive.crawler.datamodel used by org.archive.crawler.deciderules.recrawl | |
|---|---|
| CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
| CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
| Classes in org.archive.crawler.datamodel used by org.archive.crawler.event | |
|---|---|
| CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
| Classes in org.archive.crawler.datamodel used by org.archive.crawler.extractor | |
|---|---|
| CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
| CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
| Classes in org.archive.crawler.datamodel used by org.archive.crawler.fetcher | |
|---|---|
| CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
| CrawlHost
Represents a single remote "host". |
|
| CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
| FetchStatusCodes
Constant flag codes to be used, in lieu of per-protocol codes (like HTTP's 200, 404, etc.), when network, internal, or out-of-band conditions occur. |
|
| ServerCache
Server and Host cache. |
|
| Classes in org.archive.crawler.datamodel used by org.archive.crawler.filter | |
|---|---|
| CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
| CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
| Classes in org.archive.crawler.datamodel used by org.archive.crawler.framework | |
|---|---|
| CandidateURI
A URI, discovered or passed-in, that may be scheduled. |
|
| Checkpoint
Record of a specific checkpoint on disk. |
|
| CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
| CrawlOrder
Represents the 'root' of the settings hierarchy. |
|
| CrawlSubstats.HasCrawlSubstats
|
|
| CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
| FetchStatusCodes
Constant flag codes to be used, in lieu of per-protocol codes (like HTTP's 200, 404, etc.), when network, internal, or out-of-band conditions occur. |
|
| ServerCache
Server and Host cache. |
|
| Classes in org.archive.crawler.datamodel used by org.archive.crawler.frontier | |
|---|---|
| CandidateURI
A URI, discovered or passed-in, that may be scheduled. |
|
| CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
| CrawlServer
Represents a single remote "server". |
|
| CrawlSubstats
Collector of statistics for a 'subset' of a crawl, such as a server (host:port), host, or frontier group (e.g. a queue). |
|
| CrawlSubstats.HasCrawlSubstats
|
|
| CrawlSubstats.Stage
|
|
| CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
| FetchStatusCodes
Constant flag codes to be used, in lieu of per-protocol codes (like HTTP's 200, 404, etc.), when network, internal, or out-of-band conditions occur. |
|
| UriUniqFilter
A UriUniqFilter passes URI objects to a destination (receiver) if the passed URI object has not been previously seen. |
|
| UriUniqFilter.HasUriReceiver
URIs that have not been seen before 'visit' this 'Visitor'. |
|
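The UriUniqFilter description above (forward a URI to a receiver only if it has not been seen before) can be illustrated with a minimal in-memory sketch. This is not the Heritrix implementation or its API: the class `UniqFilterSketch`, the `Receiver` interface, and the `SeenSetFilter` class are hypothetical names invented for illustration, and URIs are modeled as plain strings rather than CandidateURI objects.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class UniqFilterSketch {

    /** Hypothetical callback; plays the role that UriUniqFilter.HasUriReceiver has in the listing. */
    interface Receiver {
        void receive(String uri);
    }

    /** Hypothetical filter: forwards a URI to the receiver only the first time it is added. */
    static class SeenSetFilter {
        private final Set<String> seen = new HashSet<>();
        private final Receiver receiver;

        SeenSetFilter(Receiver receiver) {
            this.receiver = receiver;
        }

        /** Returns true if the URI was new and was passed on to the receiver. */
        boolean add(String uri) {
            if (seen.add(uri)) {        // Set.add returns false for an already-present element
                receiver.receive(uri);
                return true;
            }
            return false;
        }

        /** Number of distinct URIs seen so far. */
        int count() {
            return seen.size();
        }
    }

    public static void main(String[] args) {
        List<String> scheduled = new ArrayList<>();
        SeenSetFilter filter = new SeenSetFilter(scheduled::add);

        filter.add("http://example.org/");
        filter.add("http://example.org/about");
        filter.add("http://example.org/");   // duplicate, not forwarded

        System.out.println(scheduled);
    }
}
```

A production filter would also need to bound memory (for example with a disk-backed or probabilistic set), which this sketch deliberately leaves out.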
| Classes in org.archive.crawler.datamodel used by org.archive.crawler.io | |
|---|---|
| CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
| Classes in org.archive.crawler.datamodel used by org.archive.crawler.postprocessor | |
|---|---|
| CandidateURI
A URI, discovered or passed-in, that may be scheduled. |
|
| CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
| CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
| FetchStatusCodes
Constant flag codes to be used, in lieu of per-protocol codes (like HTTP's 200, 404, etc.), when network, internal, or out-of-band conditions occur. |
|
| Classes in org.archive.crawler.datamodel used by org.archive.crawler.prefetch | |
|---|---|
| CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
| CrawlSubstats.HasCrawlSubstats
|
|
| CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
| FetchStatusCodes
Constant flag codes to be used, in lieu of per-protocol codes (like HTTP's 200, 404, etc.), when network, internal, or out-of-band conditions occur. |
|
| Classes in org.archive.crawler.datamodel used by org.archive.crawler.processor | |
|---|---|
| CandidateURI
A URI, discovered or passed-in, that may be scheduled. |
|
| CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
| FetchStatusCodes
Constant flag codes to be used, in lieu of per-protocol codes (like HTTP's 200, 404, etc.), when network, internal, or out-of-band conditions occur. |
|
| Classes in org.archive.crawler.datamodel used by org.archive.crawler.processor.recrawl | |
|---|---|
| CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
| CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
| Classes in org.archive.crawler.datamodel used by org.archive.crawler.scope | |
|---|---|
| CandidateURI
A URI, discovered or passed-in, that may be scheduled. |
|
| CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
| Classes in org.archive.crawler.datamodel used by org.archive.crawler.selftest | |
|---|---|
| CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
| Classes in org.archive.crawler.datamodel used by org.archive.crawler.settings | |
|---|---|
| CrawlOrder
Represents the 'root' of the settings hierarchy. |
|
| CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
| Classes in org.archive.crawler.datamodel used by org.archive.crawler.url | |
|---|---|
| CrawlOrder
Represents the 'root' of the settings hierarchy. |
|
| Classes in org.archive.crawler.datamodel used by org.archive.crawler.util | |
|---|---|
| CandidateURI
A URI, discovered or passed-in, that may be scheduled. |
|
| CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
| CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
| UriUniqFilter
A UriUniqFilter passes URI objects to a destination (receiver) if the passed URI object has not been previously seen. |
|
| UriUniqFilter.HasUriReceiver
URIs that have not been seen before 'visit' this 'Visitor'. |
|
| Classes in org.archive.crawler.datamodel used by org.archive.crawler.writer | |
|---|---|
| CoreAttributeConstants
CrawlURI attribute keys used by the core crawler classes. |
|
| CrawlURI
Represents a candidate URI and the associated state it collects as it is crawled. |
|
| FetchStatusCodes
Constant flag codes to be used, in lieu of per-protocol codes (like HTTP's 200, 404, etc.), when network, internal, or out-of-band conditions occur. |
|