Programmatically determine the content area of a page in a PDF
It’s easy to determine the size of a page by using the PageHeight and PageWidth functions, but it’s a little more difficult to determine the rectangular area that the page’s content actually occupies. The library has functions that return text coordinates, such as GetPageText, and functions for determining the coordinates of images, but there’s no easy way to programmatically detect the coordinates of lines and other shapes on the page.
In many cases, a relatively simple way to determine the content area of the page is to check the pixels of the rendered page. The height of the content can be measured by rendering the page to bitmap format and looping through the pixel data, starting from the bottom, to find the lowest row in which not all of the pixels are white. This assumes the page is rendered on a white background.
Here is some sample code written in Delphi which demonstrates how to do this:
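// A Delphi example that returns the height of the content on the
// selected page, measured from the top of the page, in the current
// measurement units. Returns 0 if the page is blank or the rendered
// bitmap is not 24-bit.
// Requires Classes (TMemoryStream) and Graphics (TBitmap) in the uses
// clause, along with the Quick PDF Library unit.
function MeasurePageContentHeight(QP: TQuickPDF): Double;
var
  BM: TBitmap;
  MS: TMemoryStream;
  RowNumber: Integer;
  FoundPixels: Boolean;
  Column: Integer;
  PixelData: array of Byte;
begin
  Result := 0;
  // Initialize here so the flag is defined even if the scan is skipped
  FoundPixels := False;
  RowNumber := 0;
  MS := TMemoryStream.Create;
  try
    // Render the selected page to the stream as a bitmap at 96 DPI
    QP.RenderPageToStream(96, QP.SelectedPage, 0, MS);
    MS.Seek(0, soFromBeginning);
    BM := TBitmap.Create;
    try
      BM.LoadFromStream(MS);
      if (BM.PixelFormat = pf24bit) then
      begin
        // Each row holds 3 bytes (blue, green, red) per pixel
        SetLength(PixelData, BM.Width * 3);
        RowNumber := BM.Height;
        // Scan upwards from the bottom row until a non-white byte is found
        while (not FoundPixels) and (RowNumber > 0) do
        begin
          Dec(RowNumber);
          Move(BM.ScanLine[RowNumber]^, PixelData[0], BM.Width * 3);
          for Column := 0 to (BM.Width * 3) - 1 do
          begin
            if (PixelData[Column] <> $FF) then
            begin
              FoundPixels := True;
              Break;
            end;
          end;
        end;
      end;
      if (FoundPixels) then
      begin
        // RowNumber is zero-based, so the content spans RowNumber + 1 rows
        Result := (QP.PageHeight * (RowNumber + 1)) / BM.Height;
      end;
    finally
      BM.Free;
    end;
  finally
    MS.Free;
  end;
end;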
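The same scan can be extended to all four edges to get the full content rectangle rather than only its height. What follows is a minimal sketch and not part of the Quick PDF Library: TPixelRect, FindContentRect and PixelsToUnits are hypothetical names introduced for illustration. The sketch assumes the pixel data has already been copied, row by row from the top, into a tightly packed buffer of 3 bytes per pixel (for example, with the ScanLine and Move approach used in the sample above), and that the page background renders as pure white.

```pascal
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF} // lets the sketch also compile under Free Pascal

type
  // Hypothetical record: inclusive pixel bounds of the non-white content
  TPixelRect = record
    Left, Top, Right, Bottom: Integer;
    Found: Boolean; // False when every pixel is white
  end;

// Scans a tightly packed 24-bit buffer (3 bytes per pixel, rows stored
// top to bottom) and returns the bounds of all non-white pixels.
function FindContentRect(const Pixels: array of Byte;
  Width, Height: Integer): TPixelRect;
var
  X, Y, Offset: Integer;
begin
  Result.Found := False;
  Result.Left := Width;
  Result.Top := Height;
  Result.Right := -1;
  Result.Bottom := -1;
  for Y := 0 to Height - 1 do
    for X := 0 to Width - 1 do
    begin
      Offset := (Y * Width + X) * 3;
      // A pixel counts as content if any of its three bytes is not $FF
      if (Pixels[Offset] <> $FF) or (Pixels[Offset + 1] <> $FF) or
         (Pixels[Offset + 2] <> $FF) then
      begin
        Result.Found := True;
        if X < Result.Left then Result.Left := X;
        if X > Result.Right then Result.Right := X;
        if Y < Result.Top then Result.Top := Y;
        if Y > Result.Bottom then Result.Bottom := Y;
      end;
    end;
end;

// Converts a pixel edge position to page measurement units, given the
// page size in those units and the bitmap size in pixels.
function PixelsToUnits(PixelPos, BitmapSize: Integer;
  PageSize: Double): Double;
begin
  Result := (PageSize * PixelPos) / BitmapSize;
end;
```

If R is the rectangle returned for a rendered page, the left and top edges in page units would be PixelsToUnits(R.Left, BM.Width, QP.PageWidth) and PixelsToUnits(R.Top, BM.Height, QP.PageHeight). The right and bottom edges use R.Right + 1 and R.Bottom + 1, because the pixel bounds are inclusive. These values are measured from the top-left corner of the rendered bitmap, which may differ from the origin the library uses for page coordinates.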
This article refers to a deprecated product.
Updated on May 16, 2022