java.lang.Object
javax.management.Attribute
org.archive.crawler.settings.Type
org.archive.crawler.settings.ComplexType
org.archive.crawler.settings.ModuleType
org.archive.crawler.framework.Processor
org.archive.crawler.extractor.Extractor
org.archive.crawler.extractor.ExtractorSWF
public class ExtractorSWF extends Extractor
Process SWF (flash/shockwave) files for strings that are likely to be crawlable URIs.
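The idea of keeping only "strings that are likely to be crawlable URIs" can be illustrated with a small standalone sketch. This is not the heuristic ExtractorSWF itself uses; the class name `LikelyUriSketch`, the method `looksLikeUri`, and both regular expressions are assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch only: NOT the heuristic used by ExtractorSWF.
class LikelyUriSketch {
    // Assumption: absolute http(s) URIs with no embedded whitespace.
    private static final Pattern ABSOLUTE =
            Pattern.compile("https?://\\S+", Pattern.CASE_INSENSITIVE);
    // Assumption: relative paths ending in a 2-5 character extension,
    // optionally followed by a query string.
    private static final Pattern RELATIVE =
            Pattern.compile("[\\w./~%-]+\\.[A-Za-z0-9]{2,5}(\\?\\S*)?");

    /** Returns true if the whole string matches one of the two patterns. */
    static boolean looksLikeUri(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        return ABSOLUTE.matcher(s).matches() || RELATIVE.matcher(s).matches();
    }

    /** Keeps only the candidate strings that look like URIs, in order. */
    static List<String> filterLikelyUris(List<String> candidates) {
        List<String> kept = new ArrayList<>();
        for (String c : candidates) {
            if (looksLikeUri(c)) {
                kept.add(c);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        strings.add("http://example.com/a.html");
        strings.add("Click here to start");
        strings.add("movies/intro.swf");
        System.out.println(filterLikelyUris(strings));
    }
}
```

A heuristic like this trades precision for recall: a string such as an ActionScript property path may still pass the relative pattern, so a real extractor would typically apply further checks before queueing a link.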
| Nested Class Summary | |
|---|---|
| protected class | ExtractorSWF.ExtractorSWFActions: SWFActions that parse URI-like strings. |
| (package private) class | ExtractorSWF.ExtractorSWFReader |
| protected class | ExtractorSWF.ExtractorSWFTags: SWFTagTypes customized to use ExtractorSWFActions, which parse URI-like strings. |
| protected class | ExtractorSWF.ExtractorTagParser: TagParser customized to ignore SWFTags that will never contain extractable URIs. |
| Nested classes/interfaces inherited from class org.archive.crawler.settings.ComplexType |
|---|
| ComplexType.MBeanAttributeInfoIterator |
| Field Summary | |
|---|---|
| protected long | numberOfCURIsHandled |
| protected long | numberOfLinksExtracted |
| Fields inherited from class org.archive.crawler.framework.Processor |
|---|
| ATTR_DECIDE_RULES, ATTR_ENABLED, attrDecideRules |
| Fields inherited from class org.archive.crawler.settings.ComplexType |
|---|
| definition, definitionMap |
| Constructor Summary | |
|---|---|
| ExtractorSWF(java.lang.String name) | |
| Method Summary | |
|---|---|
| protected void | extract(CrawlURI curi) |
| java.lang.String | report(): Compiles and returns a report (in human readable form) about the status of the processor. |
| Methods inherited from class org.archive.crawler.extractor.Extractor |
|---|
| innerProcess |
| Methods inherited from class org.archive.crawler.framework.Processor |
|---|
| checkForInterrupt, finalTasks, getController, getDecideRule, getDefaultNextProcessor, initialTasks, innerRejectProcess, isContentToProcess, isExpectedMimeType, isHttpTransactionContentToProcess, kickUpdate, process, rulesAccept, rulesAccept, setDefaultNextProcessor, spawn |
| Methods inherited from class org.archive.crawler.settings.ModuleType |
|---|
| addElement, listUsedFiles |
| Methods inherited from class org.archive.crawler.settings.Type |
|---|
| addConstraint, equals, getConstraints, getLegalValueType, isExpertSetting, isOverrideable, isTransient, setExpertSetting, setLegalValueType, setOverrideable, setTransient |
| Methods inherited from class javax.management.Attribute |
|---|
| getName |
| Methods inherited from class java.lang.Object |
|---|
| clone, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
| Field Detail |
|---|
protected long numberOfCURIsHandled
protected long numberOfLinksExtracted
| Constructor Detail |
|---|
public ExtractorSWF(java.lang.String name)
Parameters:
name - 
| Method Detail |
|---|
protected void extract(CrawlURI curi)
extract in class Extractor
public java.lang.String report()
Description copied from class: Processor
Compiles and returns a report (in human readable form) about the status of the processor. Examples of stats declared would include:
* Number of CrawlURIs handled.
* Number of links extracted (for link extractors)
etc.
report in class Processor
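As a rough illustration of the kind of report described above, the following standalone sketch keeps the two counters listed in the Field Summary and formats them as text. The class name `ProcessorReportSketch`, the helper `recordExtraction`, and the exact report layout are assumptions for illustration; the real output of `report()` may differ.

```java
// Hypothetical sketch only: a stand-in class, not the real ExtractorSWF.
class ProcessorReportSketch {
    // Counter names mirror the fields in the Field Summary.
    protected long numberOfCURIsHandled = 0;
    protected long numberOfLinksExtracted = 0;

    /** Hypothetical helper: records one handled CrawlURI and its link count. */
    void recordExtraction(int linksFound) {
        numberOfCURIsHandled++;
        numberOfLinksExtracted += linksFound;
    }

    /** Builds a human-readable status report; the layout is an assumption. */
    String report() {
        StringBuilder sb = new StringBuilder();
        sb.append("Processor: ").append(getClass().getName()).append('\n');
        sb.append("  CrawlURIs handled: ").append(numberOfCURIsHandled).append('\n');
        sb.append("  Links extracted: ").append(numberOfLinksExtracted).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        ProcessorReportSketch p = new ProcessorReportSketch();
        p.recordExtraction(3); // one document that yielded three links
        p.recordExtraction(0); // one document that yielded none
        System.out.print(p.report());
    }
}
```

Keeping the counters as plain `long` fields matches the declared field types; a processor shared across threads would need to guard the updates, which this sketch does not attempt.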