Programmatically determine the content area of a page in a PDF
It’s easy to determine the size of a page by using the PageHeight and PageWidth functions, but it’s more difficult to determine the rectangular content area of a page. The library has functions for returning text coordinates, such as GetPageText, and functions for determining the coordinates of images, but there’s no easy way to programmatically detect the coordinates of lines and other shapes on the page.
In many cases, a relatively simple way to determine the content area is to inspect the pixels of the rendered page. The height of the content can be measured by rendering the page to a bitmap and looping through the pixel data from the bottom up to find the lowest row in which not all of the pixels are white. This assumes the page background renders as pure white.
The following Delphi sample demonstrates how to do this:
// A Delphi example of code to return the height of the content on the
// selected page in current measurement units
function MeasurePageContentHeight(QP: TQuickPDF): Double;
var
  BM: TBitmap;
  MS: TMemoryStream;
  RowNumber: Integer;
  FoundPixels: Boolean;
  Column: Integer;
  PixelData: array of Byte;
begin
  Result := 0;
  // Initialise up front so the flag is defined even if the bitmap is not 24-bit
  FoundPixels := False;
  RowNumber := 0;
  MS := TMemoryStream.Create;
  try
    // Render the selected page to a bitmap at 96 DPI
    QP.RenderPageToStream(96, QP.SelectedPage, 0, MS);
    MS.Seek(0, soFromBeginning);
    BM := TBitmap.Create;
    try
      BM.LoadFromStream(MS);
      if (BM.PixelFormat = pf24bit) then
      begin
        // One row of 24-bit pixels is three bytes per pixel
        SetLength(PixelData, BM.Width * 3);
        RowNumber := BM.Height;
        // Scan upwards from the bottom row until a non-white byte is found
        while (not FoundPixels) and (RowNumber > 0) do
        begin
          Dec(RowNumber);
          // Copy the row's bytes into the buffer (PixelData[0] addresses
          // the first element of the dynamic array)
          Move(BM.ScanLine[RowNumber]^, PixelData[0], BM.Width * 3);
          for Column := 0 to (BM.Width * 3) - 1 do
          begin
            if (PixelData[Column] <> $FF) then
            begin
              FoundPixels := True;
              Break;
            end;
          end;
        end;
      end;
      if (FoundPixels) then
      begin
        // RowNumber is zero-based, so add 1 to include the content row itself
        Result := Round((QP.PageHeight * (RowNumber + 1)) / BM.Height);
      end;
    finally
      BM.Free;
    end;
  finally
    MS.Free;
  end;
end;
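The sample above measures only the bottom edge of the content. The same pixel scan can be extended to all four edges to obtain the full rectangular content area. The following is a minimal sketch, not part of the library: the unit name ContentBounds and the routines FindContentBounds and BoundsToPageUnits are hypothetical names chosen for illustration. It assumes a 24-bit bitmap with a pure white background, as in the sample above.

```pascal
unit ContentBounds;

// Illustrative sketch only. FindContentBounds and BoundsToPageUnits are
// hypothetical helper names, not functions of the PDF library.
// On newer Delphi versions the units may need their scoped names
// (System.Types, Vcl.Graphics) depending on project settings.

interface

uses
  Types, Graphics;

// Returns True and the bounding rectangle (in pixels) of all non-white
// pixels. Returns False if the bitmap is not 24-bit or is entirely white.
function FindContentBounds(BM: TBitmap; out Bounds: TRect): Boolean;

// Scales a pixel rectangle to page units, measured from the top-left
// corner of the rendered page.
procedure BoundsToPageUnits(const Bounds: TRect; BMWidth, BMHeight: Integer;
  PageWidth, PageHeight: Double; out L, T, R, B: Double);

implementation

function FindContentBounds(BM: TBitmap; out Bounds: TRect): Boolean;
var
  Row, Col, Channel: Integer;
  P: PByte;
  IsWhite: Boolean;
  MinX, MinY, MaxX, MaxY: Integer;
begin
  Result := False;
  Bounds := Rect(0, 0, 0, 0);
  if BM.PixelFormat <> pf24bit then
    Exit;

  MinX := BM.Width;
  MinY := BM.Height;
  MaxX := -1;
  MaxY := -1;

  for Row := 0 to BM.Height - 1 do
  begin
    P := BM.ScanLine[Row];
    for Col := 0 to BM.Width - 1 do
    begin
      // A pixel counts as content if any of its three bytes is not $FF
      IsWhite := True;
      for Channel := 0 to 2 do
      begin
        if P^ <> $FF then
          IsWhite := False;
        Inc(P);
      end;
      if not IsWhite then
      begin
        if Col < MinX then MinX := Col;
        if Col > MaxX then MaxX := Col;
        if Row < MinY then MinY := Row;
        if Row > MaxY then MaxY := Row;
      end;
    end;
  end;

  if MaxX >= 0 then
  begin
    // Right and Bottom are exclusive, following the usual TRect convention
    Bounds := Rect(MinX, MinY, MaxX + 1, MaxY + 1);
    Result := True;
  end;
end;

procedure BoundsToPageUnits(const Bounds: TRect; BMWidth, BMHeight: Integer;
  PageWidth, PageHeight: Double; out L, T, R, B: Double);
begin
  L := PageWidth * Bounds.Left / BMWidth;
  R := PageWidth * Bounds.Right / BMWidth;
  T := PageHeight * Bounds.Top / BMHeight;
  B := PageHeight * Bounds.Bottom / BMHeight;
end;

end.
```

In the sample above, FindContentBounds could be called on BM right after BM.LoadFromStream(MS), and the result passed to BoundsToPageUnits together with BM.Width, BM.Height, QP.PageWidth and QP.PageHeight. The sketch visits every pixel for simplicity. Scanning inwards from each edge and stopping at the first non-white pixel would be faster on large pages.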
This article refers to a deprecated product. If you are looking for support for Foxit PDF SDK, please click here.
Updated on May 16, 2022