java.lang.Object
  org.faceless.pdf2.PageExtractor.Text
public abstract class PageExtractor.Text
A class representing a piece of text extracted from a page by the PageExtractor.
Each text object has a location on the page, a font size, a font name,
a color and its text content.
Constructor Summary | |
---|---|
PageExtractor.Text()
|
Method Summary | |
---|---|
abstract int |
compareTo(Object o)
|
AnnotationMarkup |
createAnnotationMarkup(String type)
Create a new AnnotationMarkup of the specified type
to cover this text. |
float |
getAngle()
Return the angle of rotation of this text on the page, in degrees clockwise from 12 o'clock. |
abstract float |
getBaseline()
Return the baseline of the text item, as a fraction between 0 and 1. 0 would indicate the baseline is at the top of the text, 1 at the absolute bottom. |
abstract int |
getByteLength()
Get the length of the original text in bytes. |
abstract int |
getByteToCharOffset(int byteoffset)
Given a byte offset into the original String, return the Character offset it refers to. |
abstract Paint |
getColor()
Return the color of this text |
float[] |
getCorners()
Return the four corners (x1,y1) (x2,y2) (x3,y3) (x4,y4) of the quadrilateral that encompasses the text, specified clockwise from bottom left. |
abstract Reader |
getFontMetaData()
Return any XMP MetaData that has been set on the Font, or null
if none exists. |
abstract String |
getFontName()
Return the font name of this text |
abstract float |
getFontSize()
Return the font size of this text in points |
float |
getLength()
Return the length of this Text in points. |
abstract float |
getOffset(int pos)
Given an offset into the text, return the start position of that letter. |
PDFPage |
getPage()
Return the PDFPage this text was found on - simply the page
the parent PageExtractor was created from. |
PageExtractor |
getPageExtractor()
Return the PageExtractor this text was created from |
abstract PageExtractor.Text |
getPrimaryText()
If this text is a subtext or collection of Text objects, return the primary text it starts with. |
abstract int |
getPrimaryTextOffset()
If this text is a subtext or collection of Text objects, return the offset into the primary text where it starts. |
abstract PageExtractor.Text |
getRowNext()
Return the next Text item in this row, or null if there is none |
abstract PageExtractor.Text |
getRowPrevious()
Return the previous Text item in this row, or null if there is none |
abstract PageExtractor.Text |
getSubText(int off,
int len)
Return a substring of this Text object as another Text object |
abstract String |
getText()
Return the text content of this text |
abstract int |
getTextLength()
Return the length of the String returned by getText() |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public PageExtractor.Text()
Method Detail |
---|
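public float getLength()
Return the length of this Text in points.
public final float[] getCorners()
Return the four corners (x1,y1) (x2,y2) (x3,y3) (x4,y4) of the quadrilateral that encompasses the text, specified clockwise from bottom left.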
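The quadrilateral returned by getCorners() may be rotated, so code that needs a plain rectangle (for example to test whether a point falls near the text) usually reduces it to an axis-aligned bounding box first. The following is a minimal sketch, assuming the array holds eight floats in the order given for getCorners() in the method summary. The class and method names are hypothetical and are not part of the library.

```java
final class CornerBounds {
    /**
     * Reduce a quadrilateral {x1,y1, x2,y2, x3,y3, x4,y4} to its
     * axis-aligned bounding box {minX, minY, maxX, maxY}.
     * Hypothetical helper: it only assumes the eight-float layout
     * described for getCorners().
     */
    static float[] bounds(float[] corners) {
        if (corners == null || corners.length != 8) {
            throw new IllegalArgumentException("expected 8 floats (4 corners)");
        }
        float minX = corners[0], maxX = corners[0];
        float minY = corners[1], maxY = corners[1];
        // Even indices hold x values, odd indices hold y values.
        for (int i = 2; i < 8; i += 2) {
            minX = Math.min(minX, corners[i]);
            maxX = Math.max(maxX, corners[i]);
            minY = Math.min(minY, corners[i + 1]);
            maxY = Math.max(maxY, corners[i + 1]);
        }
        return new float[] { minX, minY, maxX, maxY };
    }

    public static void main(String[] args) {
        // Made-up corners for an unrotated run of text, clockwise from bottom left.
        float[] corners = { 100, 700, 100, 712, 160, 712, 160, 700 };
        System.out.println(java.util.Arrays.toString(bounds(corners)));
    }
}
```

Because the helper takes the minimum and maximum over all four corners, it does not depend on the text being horizontal.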
public AnnotationMarkup createAnnotationMarkup(String type)
Create a new AnnotationMarkup of the specified type to cover this text. The annotation is not added to the page.
Parameters:
type - the type of markup - "Highlight", "Underline" etc.
public final float getAngle()
Return the angle of rotation of this text on the page, in degrees clockwise from 12 o'clock.
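public abstract float getFontSize()
Return the font size of this text in points.
public abstract float getBaseline()
Return the baseline of the text item, as a fraction between 0 and 1. 0 would indicate the baseline is at the top of the text, 1 at the absolute bottom.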
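The baseline fraction can be turned into page coordinates by interpolating between the top and bottom edges of the corner quadrilateral. The sketch below rests on two assumptions that the text above does not spell out: that the corners are ordered bottom-left, top-left, top-right, bottom-right (one reading of "clockwise from bottom left"), and that the fraction is measured linearly from the top edge (0) to the bottom edge (1). The class and method names are hypothetical.

```java
final class BaselinePoints {
    /** Linear interpolation: t = 0 gives a, t = 1 gives b. */
    static float lerp(float a, float b, float t) {
        return a + (b - a) * t;
    }

    /**
     * Return {leftX, leftY, rightX, rightY} of the baseline.
     * ASSUMPTION: corners are {bottomLeft, topLeft, topRight, bottomRight}
     * and fraction runs from 0 (top edge) to 1 (bottom edge).
     */
    static float[] baseline(float[] c, float fraction) {
        if (c == null || c.length != 8) {
            throw new IllegalArgumentException("expected 8 floats (4 corners)");
        }
        // Left side: from top-left (c[2], c[3]) down to bottom-left (c[0], c[1]).
        float leftX = lerp(c[2], c[0], fraction);
        float leftY = lerp(c[3], c[1], fraction);
        // Right side: from top-right (c[4], c[5]) down to bottom-right (c[6], c[7]).
        float rightX = lerp(c[4], c[6], fraction);
        float rightY = lerp(c[5], c[7], fraction);
        return new float[] { leftX, leftY, rightX, rightY };
    }

    public static void main(String[] args) {
        // Made-up values: a 12pt-high box with the baseline three quarters of the way down.
        float[] corners = { 100, 700, 100, 712, 160, 712, 160, 700 };
        System.out.println(java.util.Arrays.toString(baseline(corners, 0.75f)));
    }
}
```

Interpolating along both sides, and not only in y, keeps the result meaningful for rotated text, under the same assumptions.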
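public abstract float getOffset(int pos)
Given an offset into the text, return the start position of that letter. The expression below multiplies the result by getLength(), which implies the value is a proportion of the text's length and not a distance in points. For horizontal text, the left edge of a letter is then:
float left = text.getCorners()[0] + (text.getOffset(pos) * text.getLength());
Parameters:
pos - the position of the letter in the Text to retrieve the position for, in the range 0 to getText().length() - 1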
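The expression given for getOffset can be checked with plain numbers. The sketch below repeats the same arithmetic outside the library, on the assumption (inferred from that expression, not stated outright) that getOffset returns a proportion of the text's length and that the text is horizontal, so only the first x coordinate from getCorners() is needed. The class, method and sample values are hypothetical.

```java
final class LetterPosition {
    /**
     * Left edge of a letter in horizontal text.
     *
     * @param firstCornerX   the value of getCorners()[0]
     * @param offsetFraction the value of getOffset(pos), ASSUMED to be a
     *                       proportion of the text length
     * @param lengthPoints   the value of getLength(), in points
     */
    static float leftEdge(float firstCornerX, float offsetFraction, float lengthPoints) {
        return firstCornerX + (offsetFraction * lengthPoints);
    }

    public static void main(String[] args) {
        // Made-up values: text starting at x = 100pt and 60pt long,
        // with the requested letter a quarter of the way along.
        System.out.println(leftEdge(100f, 0.25f, 60f));
    }
}
```

For rotated text the same proportion would have to be applied along the direction of the text, not along the x axis alone.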
public PDFPage getPage()
Return the PDFPage this text was found on - simply the page the parent PageExtractor was created from.
public PageExtractor getPageExtractor()
Return the PageExtractor this text was created from.
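public abstract Paint getColor()
Return the color of this text.
public abstract String getFontName()
Return the font name of this text.
public abstract String getText()
Return the text content of this text.
public abstract int getTextLength()
Return the length of the String returned by getText().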
public abstract int compareTo(Object o)
Specified by:
compareTo in interface Comparable
public abstract PageExtractor.Text getRowNext()
Return the next Text item in this row, or null if there is none.
public abstract PageExtractor.Text getRowPrevious()
Return the previous Text item in this row, or null if there is none.
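Together, getRowNext() and getRowPrevious() let a row be treated as a doubly linked list terminated by null at both ends. The sketch below shows one way to collect a whole row starting from any item in it. To stay self-contained it uses a hypothetical RowText interface that mirrors only the three methods involved, plus a small in-memory Node class for demonstration. Neither is part of the library.

```java
import java.util.ArrayList;
import java.util.List;

final class RowWalk {
    /** Hypothetical stand-in for the PageExtractor.Text methods used below. */
    interface RowText {
        RowText getRowNext();      // null when this is the last item in the row
        RowText getRowPrevious();  // null when this is the first item in the row
        String getText();
    }

    /** Rewind to the first item of the row, then collect each item's text in row order. */
    static List<String> rowTexts(RowText item) {
        RowText first = item;
        while (first.getRowPrevious() != null) {
            first = first.getRowPrevious();
        }
        List<String> out = new ArrayList<>();
        for (RowText t = first; t != null; t = t.getRowNext()) {
            out.add(t.getText());
        }
        return out;
    }

    /** Minimal in-memory row item, for demonstration only. */
    static final class Node implements RowText {
        final String text;
        Node prev, next;
        Node(String text) { this.text = text; }
        public RowText getRowNext() { return next; }
        public RowText getRowPrevious() { return prev; }
        public String getText() { return text; }
    }

    /** Build a linked row from the given words and return its first item. */
    static Node link(String... words) {
        Node first = null, last = null;
        for (String w : words) {
            Node n = new Node(w);
            if (first == null) {
                first = n;
            } else {
                last.next = n;
                n.prev = last;
            }
            last = n;
        }
        return first;
    }

    public static void main(String[] args) {
        Node middle = link("Invoice", "No.", "1234").next;
        System.out.println(rowTexts(middle));
    }
}
```

Rewinding first means the caller does not need to know where in the row the starting item sits.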
public abstract Reader getFontMetaData() throws IOException
Return any XMP MetaData that has been set on the Font, or null if none exists.
Throws:
IOException
See Also:
PDF.getMetaData()
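public abstract PageExtractor.Text getSubText(int off, int len)
Return a substring of this Text object as another Text object.
Parameters:
off - the offset into the text
len - the number of characters to return
public abstract PageExtractor.Text getPrimaryText()
If this text is a subtext or collection of Text objects, return the primary text it starts with. If not, return null.
public abstract int getPrimaryTextOffset()
If this text is a subtext or collection of Text objects, return the offset into the primary text where it starts. If not, return 0.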
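public abstract int getByteLength()
Get the length of the original text in bytes.
public abstract int getByteToCharOffset(int byteoffset)
Given a byte offset into the original String, return the character offset it refers to.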
getByteLength()