java.lang.Object
java.util.AbstractCollection<E>
java.util.AbstractSet<E>
java.util.TreeSet<java.lang.String>
org.archive.util.PrefixSet
org.archive.util.SurtPrefixSet
public class SurtPrefixSet extends PrefixSet
Specialized TreeSet for keeping a set of String prefixes. Redundant prefixes (those that are themselves prefixed by other set entries) are eliminated.
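The redundancy rule described above can be illustrated with a minimal, self-contained sketch. The class name `PrefixSetSketch` is hypothetical and this is not the `org.archive.util.PrefixSet` source; it only assumes the behavior stated in the description (an entry is dropped when another entry is already a prefix of it) and uses standard `java.util.TreeSet` calls.

```java
import java.util.Iterator;
import java.util.TreeSet;

// Hypothetical illustration of a prefix-collapsing sorted set.
class PrefixSetSketch extends TreeSet<String> {
    private static final long serialVersionUID = 1L;

    /** Returns true if some entry in the set is a prefix of s (or equals s). */
    boolean containsPrefixOf(String s) {
        // With redundant entries removed, only the greatest entry <= s
        // can possibly be a prefix of s.
        String candidate = floor(s);
        return candidate != null && s.startsWith(candidate);
    }

    /** Adds s unless it is already covered; removes entries that s covers. */
    @Override
    public boolean add(String s) {
        if (containsPrefixOf(s)) {
            return false; // an existing, shorter-or-equal prefix already covers s
        }
        // Entries that start with s sort contiguously at the head of tailSet(s).
        Iterator<String> it = tailSet(s).iterator();
        while (it.hasNext() && it.next().startsWith(s)) {
            it.remove(); // now redundant, because s is a prefix of it
        }
        return super.add(s);
    }

    public static void main(String[] args) {
        PrefixSetSketch set = new PrefixSetSketch();
        set.add("http://(org,archive,www,)/about");
        set.add("http://(org,archive,");      // broader: replaces the entry above
        set.add("http://(org,archive,web,)"); // already covered: ignored
        System.out.println(set); // [http://(org,archive,]
    }
}
```

Keeping the set free of redundant entries is what makes the single `floor` lookup sufficient: no other entry can sort between a prefix and the string it prefixes.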
| Constructor Summary |
|---|
| SurtPrefixSet() |
| Method Summary | |
|---|---|
| void | convertAllPrefixesToDomains(): Changes all prefixes so that they only enforce a general domain (allowing subdomains). For prefixes that don't include a ')', no change is necessary. |
| void | convertAllPrefixesToHosts(): Changes all prefixes so that they enforce an exact host. |
| static java.lang.String | convertPrefixToDomain(java.lang.String prefix) |
| static java.lang.String | convertPrefixToHost(java.lang.String prefix) |
| void | exportTo(java.io.FileWriter fw) |
| static java.lang.String | getCandidateSurt(java.lang.Object object): Calculate the SURT form URI to use as a candidate against prefixes from the given Object (CandidateURI or UURI). |
| void | importFrom(java.io.Reader r): Read a set of SURT prefixes from a reader source; keep sorted and with redundant entries removed. |
| void | importFromMixed(java.io.Reader r, boolean deduceFromSeeds): Import SURT prefixes from a reader with mixed URI and SURT prefix format. |
| void | importFromUris(java.io.Reader r) |
| static void | main(java.lang.String[] args): Allow class to be used as a command-line tool for converting URL lists (or naked host or host/path fragments implied to be HTTP URLs) to implied SURT prefix form. |
| static java.lang.String | prefixFromPlain(java.lang.String u): Given a plain URI or hostname/hostname+path, deduce an implied SURT prefix from it. |
| Methods inherited from class org.archive.util.PrefixSet |
|---|
| add, containsPrefixOf |

| Methods inherited from class java.util.TreeSet |
|---|
| addAll, clear, clone, comparator, contains, first, headSet, isEmpty, iterator, last, remove, size, subSet, tailSet |

| Methods inherited from class java.util.AbstractSet |
|---|
| equals, hashCode, removeAll |

| Methods inherited from class java.util.AbstractCollection |
|---|
| containsAll, retainAll, toArray, toArray, toString |

| Methods inherited from class java.lang.Object |
|---|
| finalize, getClass, notify, notifyAll, wait, wait, wait |

| Methods inherited from interface java.util.Set |
|---|
| containsAll, equals, hashCode, removeAll, retainAll, toArray, toArray |
| Constructor Detail |
|---|
public SurtPrefixSet()
| Method Detail |
|---|
public void importFrom(java.io.Reader r)
r - reader over file of SURT_format strings
java.io.IOException
public void importFromUris(java.io.Reader r)
r - Where to read from.
public void importFromMixed(java.io.Reader r,
boolean deduceFromSeeds)
r - the reader to import the prefixes from
deduceFromSeeds - true to also import SURT prefixes implied from normal URIs/hostname seeds
public static java.lang.String prefixFromPlain(java.lang.String u)
u - URI or almost-URI to consider
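To make the idea of an "implied SURT prefix" concrete, here is a simplified sketch. The helper `impliedPrefix` and the class `SurtPrefixSketch` are hypothetical names, not part of this API, and the rules shown are assumptions: a missing scheme is taken to be `http://`, host labels are reversed into comma-separated form inside parentheses, a bare host is left open-ended so that subdomains also match, and a path is trimmed to its last `/`. Ports, user info, and the URI normalization the real method may perform are ignored.

```java
import java.util.Locale;

// Hypothetical illustration only; not the SurtPrefixSet.prefixFromPlain source.
class SurtPrefixSketch {

    /** Turns a plain host, host+path, or URI into an implied SURT-style prefix. */
    static String impliedPrefix(String plain) {
        // Assumption: inputs without a scheme are treated as HTTP URLs.
        String u = plain.contains("://") ? plain : "http://" + plain;
        int schemeEnd = u.indexOf("://") + 3;
        int pathStart = u.indexOf('/', schemeEnd);
        String scheme = u.substring(0, schemeEnd);
        String host = pathStart < 0 ? u.substring(schemeEnd) : u.substring(schemeEnd, pathStart);
        String path = pathStart < 0 ? "" : u.substring(pathStart);

        // Reverse the host labels: www.archive.org becomes org,archive,www,
        String[] labels = host.toLowerCase(Locale.ROOT).split("\\.");
        StringBuilder sb = new StringBuilder(scheme).append('(');
        for (int i = labels.length - 1; i >= 0; i--) {
            sb.append(labels[i]).append(',');
        }
        if (path.isEmpty()) {
            return sb.toString(); // no ')': open-ended, so subdomains also match
        }
        sb.append(')');
        // Keep the path only up to and including its last '/'.
        sb.append(path, 0, path.lastIndexOf('/') + 1);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(impliedPrefix("archive.org"));
        // http://(org,archive,
        System.out.println(impliedPrefix("www.archive.org/about/index.html"));
        // http://(org,archive,www,)/about/
    }
}
```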
public static java.lang.String getCandidateSurt(java.lang.Object object)
object - CandidateURI or UURI
public void exportTo(java.io.FileWriter fw)
throws java.io.IOException
fw -
java.io.IOException
public void convertAllPrefixesToHosts()
public static java.lang.String convertPrefixToHost(java.lang.String prefix)
public void convertAllPrefixesToDomains()
public static java.lang.String convertPrefixToDomain(java.lang.String prefix)
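The difference between a host prefix and a domain prefix can be sketched as follows. The names `toDomain` and `toHost` are hypothetical, and the rules are assumptions drawn from the method summary: a domain prefix carries no `)` and so also matches subdomains, while a host prefix ends at the `)` that closes the host. Any further normalization the real methods may apply is not modeled here.

```java
// Hypothetical illustration only; not the SurtPrefixSet conversion source.
class SurtScopeSketch {

    /** Loosens a prefix to its domain: cut at the first ')', if there is one. */
    static String toDomain(String prefix) {
        int close = prefix.indexOf(')');
        // Per the summary: prefixes without ')' need no change.
        return close < 0 ? prefix : prefix.substring(0, close);
    }

    /** Tightens a prefix to an exact host: make it end at the host-closing ')'. */
    static String toHost(String prefix) {
        int close = prefix.indexOf(')');
        if (close >= 0) {
            return prefix.substring(0, close + 1); // assumption: path info is dropped
        }
        // Open-ended domain prefix: close it off so subdomains no longer match.
        return prefix.endsWith(",") ? prefix + ")" : prefix + ",)";
    }

    public static void main(String[] args) {
        System.out.println(toDomain("http://(org,archive,)/about/")); // http://(org,archive,
        System.out.println(toHost("http://(org,archive,"));           // http://(org,archive,)
    }
}
```

Under these assumptions `http://(org,archive,` matches `http://(org,archive,web,)/` as well, whereas `http://(org,archive,)` matches only URIs on the exact host.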
public static void main(java.lang.String[] args)
throws java.io.IOException
args - cmd-line arguments: may include input file
java.io.IOException