How to Use Foxit APIs to Create Tables in a PDF Document

by PDF SDK | November 21, 2022

Tables are a great way to organize data in a clear and concise manner. They’re often used to show the relationships between different variables that are too complicated to be adequately described via plain text. Tables are also often required in PDF documents, like when you need to convert Excel spreadsheets or database dumps such as CSV files into a PDF. In these cases, it’s important that the PDF document maintains a tabular structure after conversion.

Although these are relatively common use cases, converting tabular data to PDF in a desired style isn’t always straightforward. In this article, you’ll learn how to use the Foxit API and .NET to convert sample data in Excel and CSV formats into a styled table in a PDF document.

Using Foxit

You can use Foxit PDF Software Development Kit (SDK) to add robust and complete PDF library functionality to your application, including the ability to convert Excel spreadsheets or database dumps to tables in a PDF document.

Foxit PDF SDK’s easy-to-use and robust APIs enable developers to integrate PDF technology into their projects built on popular platforms such as Windows, Mac, and Linux. There are different versions of the SDK for popular languages like Python, C++, and C#, to name a few.

The Foxit SDK’s wide range of features makes manipulating PDF documents in any application a breeze. Some of these features include:

– Adding and removing watermarks or annotations

– Encryption

– Conversion of images to PDF and back

– Barcode generation

– Text extraction

– Table creation

This article focuses specifically on the creation of tables in PDF documents.

Creating Tables with the Foxit PDF SDK

In this section, you’ll use the Foxit PDF SDK in a small console application that reads data from an Excel or CSV file in one directory, creates a table in a PDF document from that data, and writes the document to another directory. The console app is built with .NET and uses the .NET Core version of the Foxit PDF SDK. You can find the demo console application codebase here.

Console Project Prerequisites

You need to have the .NET SDK installed on your local machine before proceeding to set up the console project. You can download it from the official download page. At the time of writing, the latest release was .NET SDK version 6.0.401.

Confirm that you have the .NET SDK installed by running the following command with the dotnet CLI in your terminal:

bash
dotnet --version

This should print the version of the .NET SDK. If it doesn’t, you need to redo the installation.

You also need to download the appropriate .NET Core Foxit PDF SDK for your operating system.

Note: The Visual Studio Code editor and the dotnet CLI are used in this article, but these are optional. You can opt to use the full-fledged Visual Studio IDE instead.

Now that you have .NET installed and the Foxit PDF SDK downloaded, it’s time to create a new .NET project.

Create a Console Project

The following steps will guide you through creating a new .NET console project.

Run the following command in your terminal to bootstrap a new console project:

bash
dotnet new console -n TablePDF -f "net6.0"

This will bootstrap a new console app with the name TablePDF.

Open the newly bootstrapped project in your VS Code editor:

bash
code TablePDF

Create a lib folder in the root directory of your project. Open the lib folder in the .NET Core Foxit PDF SDK you downloaded earlier and copy the fsdk.dll and fsdk_dotnetcore.dll files from the x64_vc15 or x86_vc15 folder (depending on the architecture you’re targeting) into the lib folder you created earlier.

Then, copy the fsdk.dll library file from the lib folder to the root directory of your project.

The structure of your project should look like this:

TablePDF
├─ bin
├─ lib
│  ├─ fsdk.dll
│  └─ fsdk_dotnetcore.dll
├─ obj
├─ fsdk.dll
├─ Program.cs
└─ TablePDF.csproj

Update the TablePDF.csproj file to include fsdk_dotnetcore.dll as part of your project reference:

xml
<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net6.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
  </PropertyGroup>

  <ItemGroup>
    <Reference Include="fsdk_dotnetcore">
      <HintPath>lib\fsdk_dotnetcore.dll</HintPath>
    </Reference>
  </ItemGroup>

</Project>

Next, you’re going to install additional dependencies that the console project needs:

– DotNetEnv helps to access environment variables from a .env file

– ExcelDataReader helps to read data from Excel and CSV files

– ExcelDataReader.DataSet helps to read data from Excel and CSV files as data sets

– System.Text.Encoding.CodePages helps extend .NET Core to support encodings needed when reading data from Excel and CSV files

Install the dependencies by running the commands below:

bash
dotnet add package DotNetEnv --version 2.3.0
dotnet add package ExcelDataReader --version 3.6.0
dotnet add package ExcelDataReader.DataSet --version 3.6.0
dotnet add package System.Text.Encoding.CodePages --version 6.0.0 

Update the Program.cs file with the following boilerplate code:

csharp
using System.Text;
using System.Data;

using foxit;
using foxit.common;
using foxit.common.fxcrt;
using foxit.addon;
using foxit.pdf;
using foxit.pdf.annots;

using DotNetEnv;
using ExcelDataReader;

namespace TablePDF
{
    internal class Program 
    {
        static void Main(string[] args) {
            Console.WriteLine("Hello, World!");
        }
    }
}

This imports all the namespaces you’ll be using and declares a TablePDF namespace and a Program class with a Main method, which is the entry point of a .NET console application.

You can build and run the project by running dotnet run in the terminal.

Project Data

Your console project needs sample data in the form of Excel or CSV files that will be converted to PDF documents. This article uses some medical records as the sample data.

The following links provide some ready-to-use dummy medical data, extracted from New Zealand’s official data agency:

CSV

SQL

Excel

Download the various data formats from the links above and store them in a data folder in the root directory of your project in the following manner:

data
├─ serious-injury-outcome-indicators-2000-2020.csv
├─ serious-injury-outcome-indicators-2000-2020.sql
└─ serious-injury-outcome-indicators-2000-2020.xlsx

Initializing the Foxit Library

Initializing the Foxit library in your console project requires both a serial number and a key. These values can be retrieved from gsdk_sn.txt (the string after SN=) and gsdk_key.txt (the string after Sign=) in the lib folder of the Foxit PDF SDK you downloaded earlier.

Update the `Main` method of the Program.cs file in your project with the following code:

csharp
// ...

static void Main(string[] args) {

    // Load variables from .env file to Environment
    Env.Load();

    string sn = Environment.GetEnvironmentVariable("FOXIT_SDK_SN") ?? "";
    string key = Environment.GetEnvironmentVariable("FOXIT_SDK_KEY") ?? "";

    // Initialize Foxit library
    ErrorCode error_code = Library.Initialize(sn, key);
    if (error_code != ErrorCode.e_ErrSuccess)
    {
        Console.WriteLine("Library Initialize Error: {0}\n", error_code);
        return;
    }

    Library.Release();
}

Create a .env file in your project’s root directory and add the FOXIT_SDK_SN and FOXIT_SDK_KEY variables with your serial number and key as their values, respectively.

Run dotnet run again. If the initialization was successful, you won’t get any errors. If you do get an error, check the serial number and key in your .env file and try again.
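The Main method above falls back to an empty string when a variable is missing, so a typo in the .env file only surfaces later as an initialization error code. As a minimal sketch of an alternative, the hypothetical helper below (RequireEnv is not part of the Foxit SDK or DotNetEnv) fails early and names the missing variable. The DEMO_ variable names are stand-ins for illustration only.

```csharp
using System;

// Hypothetical helper: returns the variable's value, or throws if it is missing or blank.
static string RequireEnv(string name)
{
    string? value = Environment.GetEnvironmentVariable(name);
    if (string.IsNullOrWhiteSpace(value))
    {
        throw new InvalidOperationException(
            $"Environment variable '{name}' is not set. Check your .env file.");
    }
    return value;
}

// Demo with a stand-in variable. In Main, you could call RequireEnv("FOXIT_SDK_SN")
// and RequireEnv("FOXIT_SDK_KEY") after Env.Load() instead of using the ?? "" fallbacks.
Environment.SetEnvironmentVariable("DEMO_SN", "example-serial");
Console.WriteLine(RequireEnv("DEMO_SN"));

try
{
    RequireEnv("DEMO_MISSING");
}
catch (InvalidOperationException e)
{
    Console.WriteLine(e.Message);
}
```

Whether to throw or to print a message and return is a design choice; throwing keeps the check in one place and makes the failing variable obvious.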

Reading the Excel File

The next step is to read or retrieve the data from the Excel or CSV file in your data folder.

Add the following fields to the Program class:

csharp
// ...

public static readonly string output_path = "./output/pdf/";
public static readonly string data_path = "./data/";

private static int row_count; // number of rows the data set has
private static int col_count; // number of columns the data set has
private static int num_of_pages; // number of PDF pages
private static readonly int row_per_page = 10; // number of rows per PDF page
private static readonly int col_per_page = 8; // number of columns per PDF page
private static DataSet? dataSet; // store for the read data

// ...

The output_path and data_path fields define the directory where the PDF will be saved and the directory the Excel or CSV file will be read from. The other fields hold the dimensions of the data set, the per-page row and column limits, and the data itself.

Add the following code to the Main method of the Program class:

csharp
static void Main(string[] args) {
    Console.WriteLine($"Generating Table PDF...");
    try
    {
        Directory.CreateDirectory(output_path);

        // Add encoding required to parse string in excel docs
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        string input_file = data_path + "serious-injury-outcome-indicators-2000-2020.xlsx";

        using var stream = File.Open(input_file, FileMode.Open, FileAccess.Read);

        using var reader = ExcelReaderFactory.CreateReader(stream);
        row_count = reader.RowCount;
        col_count = reader.FieldCount > col_per_page ? col_per_page : reader.FieldCount;
        // Cast to double before dividing; integer division would drop the final, partially filled page
        num_of_pages = (int) Math.Ceiling((double) reader.RowCount / row_per_page);

        dataSet = reader.AsDataSet();

    } catch (Exception e)
    {
        Console.WriteLine($"{e.GetType()} : {e.Message}");
    }

    // ...
    
}

In the above code, an instance of CodePagesEncodingProvider is registered to provide support for encodings required to read Excel files. The input_file variable holds the path to the Excel file that should be read. ExcelReaderFactory.CreateReader(stream) creates a reader from the input file stream. The reader.RowCount and reader.FieldCount properties provide the number of rows and columns that the Excel file has, respectively. col_count caps the number of columns at the value of the col_per_page field of the Program class so that the table fits properly into the PDF document that will be generated. The data read from the Excel file is then stored as a data set, thanks to the reader.AsDataSet() method exposed by the ExcelDataReader.DataSet package.
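The code above reads the .xlsx file. For the .csv file in the data folder, ExcelDataReader provides a separate factory method, ExcelReaderFactory.CreateCsvReader. The following is a minimal sketch rather than part of the demo project: it assumes the ExcelDataReader and ExcelDataReader.DataSet packages installed earlier and the CSV file name from the data folder, and the OpenReader helper is a hypothetical name introduced here to choose a reader by file extension.

```csharp
using System;
using System.Data;
using System.IO;
using System.Text;
using ExcelDataReader;

// Hypothetical helper: picks the CSV or Excel reader based on the file extension.
static IExcelDataReader OpenReader(Stream input, string path) =>
    Path.GetExtension(path).Equals(".csv", StringComparison.OrdinalIgnoreCase)
        ? ExcelReaderFactory.CreateCsvReader(input)
        : ExcelReaderFactory.CreateReader(input);

// Register the extra encodings, as in the Excel example
Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

string inputFile = "./data/serious-injury-outcome-indicators-2000-2020.csv";
using var stream = File.Open(inputFile, FileMode.Open, FileAccess.Read);
using IExcelDataReader reader = OpenReader(stream, inputFile);

// Load everything into a DataSet and report its dimensions
DataSet csvData = reader.AsDataSet();
DataTable firstTable = csvData.Tables[0];
Console.WriteLine($"{firstTable.Rows.Count} rows, {firstTable.Columns.Count} columns");
```

The rest of the pipeline works on the resulting DataSet, so it should not need to know which file format the data came from.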

Creating Tables from Excel Data

Next, you’re going to create tables from the Excel file’s data set. Start by adding the following code to the Main method, just above the Library.Release() call:

csharp
// ...

try
{
    using PDFDoc doc = new();

    for (int i = 0; i < num_of_pages; i++)
    {
        using PDFPage page = doc.InsertPage(i, PDFPage.Size.e_SizeLetter);
        AddElectronicTable(page, i);
    }

    // Save PDF file
    string output_file = output_path + "TablePDF.pdf";
    doc.SaveAs(output_file, (int)PDFDoc.SaveFlags.e_SaveFlagNoOriginal);
    Console.WriteLine("Done.");
}
catch (PDFException e)
{
    Console.WriteLine(e.Message);
}

Library.Release();

// ...

In the above code, the AddElectronicTable(page, i) method of the Program class is called for every PDF page that is created to add the table content of the data set. You’ll see the implementation of this method in the next code snippet. The doc.SaveAs() method saves the PDF file after the tables have been created.
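To see how the data set is divided across pages, the paging arithmetic can be sketched on its own. The helper names PageCount and PageRows and the sample figure of 25 rows are assumptions made for illustration; they do not appear in the demo project. PageCount rounds up using integer arithmetic so that a final, partially filled page is counted.

```csharp
using System;

// Number of pages needed to show rowCount rows at rowsPerPage rows each, rounded up.
static int PageCount(int rowCount, int rowsPerPage) =>
    (rowCount + rowsPerPage - 1) / rowsPerPage;

// Half-open row range [Start, End) drawn on the given zero-based page.
static (int Start, int End) PageRows(int pageIndex, int rowCount, int rowsPerPage)
{
    int rangeStart = rowsPerPage * pageIndex;
    int rangeEnd = Math.Min(rangeStart + rowsPerPage, rowCount);
    return (rangeStart, rangeEnd);
}

int totalRows = 25;   // hypothetical data set size
int perPage = 10;     // same value as row_per_page in the article

for (int pageIndex = 0; pageIndex < PageCount(totalRows, perPage); pageIndex++)
{
    var (first, end) = PageRows(pageIndex, totalRows, perPage);
    Console.WriteLine($"Page {pageIndex}: rows {first} to {end - 1}");
}
// Prints:
// Page 0: rows 0 to 9
// Page 1: rows 10 to 19
// Page 2: rows 20 to 24
```

The last page holds only five rows in this example, which is why the table for each page should be sized from the rows it actually receives.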

Add the AddElectronicTable() method as defined below:

csharp
// ...

public static void AddElectronicTable(PDFPage page, int page_index)
{
   
    {
        using TableCellDataArray cell_array = new();

        DataTable? table = dataSet?.Tables[0];

        // Loop bounds
        int row_start = row_per_page * page_index;
        int row_end = row_start + row_per_page;
        int actual_row_end = row_end > row_count ? row_count : row_end;

        for (int row = row_start; row < actual_row_end; row++)
        {
            using RichTextStyle style = new();
            using TableCellDataColArray col_array = new();
            for (int col = 0; col < col_count; col++)
            {
                DataRow? actual_row = table?.Rows[row];
                DataColumn? actual_column = table?.Columns[col];
                string cell_text = $"{actual_row?[col]}";
                // Update text style
                SetTableTextStyle(row, style);
                using TableCellData cell_data = new(style, cell_text, new Image(), new RectF());
                col_array.Add(cell_data);
            }
            cell_array.Add(col_array);
        }

        float page_width = page.GetWidth();
        float page_height = page.GetHeight();

        TableBorderInfo outside_border_left = new()
        {
            line_width = 1,
            table_border_style = TableBorderInfo.TableBorderStyle.e_TableBorderStyleSolid
        };
        TableBorderInfo outside_border_right = new()
        {
            line_width = 1,
            table_border_style = TableBorderInfo.TableBorderStyle.e_TableBorderStyleSolid
        };
        TableBorderInfo outside_border_top = new()
        {
            line_width = 1,
            table_border_style = TableBorderInfo.TableBorderStyle.e_TableBorderStyleSolid
        };
        TableBorderInfo outside_border_bottom = new()
        {
            line_width = 1,
            table_border_style = TableBorderInfo.TableBorderStyle.e_TableBorderStyleSolid
        };
        TableBorderInfo inside_border_row_info = new()
        {
            line_width = 1,
            table_border_style = TableBorderInfo.TableBorderStyle.e_TableBorderStyleSolid
        };
        TableBorderInfo inside_border_col_info = new()
        {
            line_width = 1,
            table_border_style = TableBorderInfo.TableBorderStyle.e_TableBorderStyleSolid
        };
        using RectF rect = new(10, 200, page_width - 10, page_height - 40);
        // Size the table to the rows actually present; the last page may hold fewer than row_per_page
        using TableData data = new(rect, actual_row_end - row_start, col_count, outside_border_left, outside_border_right, outside_border_top, outside_border_bottom, inside_border_row_info, inside_border_col_info, new TableCellIndexArray(), new FloatArray(), new FloatArray());
        TableGenerator.AddTableToPage(page, data, cell_array);
    }
}

// ...

The AddElectronicTable() method calls the SetTableTextStyle() method of the Program class to set the style of the text in the table. The styles it sets include size, alignment, color, weight, and font formatting (italic, underline, strikethrough, and so on).

Add the SetTableTextStyle() method as defined below:

csharp
// ...

public static void SetTableTextStyle(int index, RichTextStyle style)
{
    using (style.font = new Font(Font.StandardID.e_StdIDHelvetica)) { }
    style.text_size = 10;
    style.text_alignment = Alignment.e_AlignmentLeft;
    style.text_color = 0x000000;
    style.is_bold = index == 0;
    style.is_italic = false;
    style.is_underline = false;
    style.is_strikethrough = false;
    style.mark_style = RichTextStyle.CornerMarkStyle.e_CornerMarkNone;
}

// ...

The entire code of the Program.cs file is given below:

csharp
using System.Text;
using System.Data;

using foxit;
using foxit.common;
using foxit.common.fxcrt;
using foxit.addon;
using foxit.pdf;
using foxit.pdf.annots;

using DotNetEnv;
using ExcelDataReader;

namespace TablePDF
{
    internal class Program 
    {
        public static readonly string output_path = "./output/pdf/";
        public static readonly string data_path = "./data/";

        private static int row_count; // number of rows the data set has
        private static int col_count; // number of columns the data set has
        private static int num_of_pages; // number of PDF pages
        private static readonly int row_per_page = 10; // number of rows per PDF page
        private static readonly int col_per_page = 8; // number of columns per PDF page
        private static DataSet? dataSet; // store for the read data

        public static void SetTableTextStyle(int index, RichTextStyle style)
        {
            using (style.font = new Font(Font.StandardID.e_StdIDHelvetica)) { }
            style.text_size = 10;
            style.text_alignment = Alignment.e_AlignmentLeft;
            style.text_color = 0x000000;
            style.is_bold = index == 0;
            style.is_italic = false;
            style.is_underline = false;
            style.is_strikethrough = false;
            style.mark_style = RichTextStyle.CornerMarkStyle.e_CornerMarkNone;
        }

        public static void AddElectronicTable(PDFPage page, int page_index)
        {
           
            {
                using TableCellDataArray cell_array = new();

                DataTable? table = dataSet?.Tables[0];

                // Loop bounds
                int row_start = row_per_page * page_index;
                int row_end = row_start + row_per_page;
                int actual_row_end = row_end > row_count ? row_count : row_end;

                for (int row = row_start; row < actual_row_end; row++)
                {
                    using RichTextStyle style = new();
                    using TableCellDataColArray col_array = new();
                    for (int col = 0; col < col_count; col++)
                    {
                        DataRow? actual_row = table?.Rows[row];
                        DataColumn? actual_column = table?.Columns[col];
                        string cell_text = $"{actual_row?[col]}";
                        // Update text style
                        SetTableTextStyle(row, style);
                        using TableCellData cell_data = new(style, cell_text, new Image(), new RectF());
                        col_array.Add(cell_data);
                    }
                    cell_array.Add(col_array);
                }

                float page_width = page.GetWidth();
                float page_height = page.GetHeight();

                TableBorderInfo outside_border_left = new()
                {
                    line_width = 1,
                    table_border_style = TableBorderInfo.TableBorderStyle.e_TableBorderStyleSolid
                };
                TableBorderInfo outside_border_right = new()
                {
                    line_width = 1,
                    table_border_style = TableBorderInfo.TableBorderStyle.e_TableBorderStyleSolid
                };
                TableBorderInfo outside_border_top = new()
                {
                    line_width = 1,
                    table_border_style = TableBorderInfo.TableBorderStyle.e_TableBorderStyleSolid
                };
                TableBorderInfo outside_border_bottom = new()
                {
                    line_width = 1,
                    table_border_style = TableBorderInfo.TableBorderStyle.e_TableBorderStyleSolid
                };
                TableBorderInfo inside_border_row_info = new()
                {
                    line_width = 1,
                    table_border_style = TableBorderInfo.TableBorderStyle.e_TableBorderStyleSolid
                };
                TableBorderInfo inside_border_col_info = new()
                {
                    line_width = 1,
                    table_border_style = TableBorderInfo.TableBorderStyle.e_TableBorderStyleSolid
                };
                using RectF rect = new(10, 200, page_width - 10, page_height - 40);
                // Size the table to the rows actually present; the last page may hold fewer than row_per_page
                using TableData data = new(rect, actual_row_end - row_start, col_count, outside_border_left, outside_border_right, outside_border_top, outside_border_bottom, inside_border_row_info, inside_border_col_info, new TableCellIndexArray(), new FloatArray(), new FloatArray());
                TableGenerator.AddTableToPage(page, data, cell_array);
            }
        }
        static void Main(string[] args) {
            Console.WriteLine($"Generating Table PDF...");
            try
            {
                Directory.CreateDirectory(output_path);

                // Add encoding required to parse string in excel docs
                Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
                string input_file = data_path + "serious-injury-outcome-indicators-2000-2020.xlsx";

                using var stream = File.Open(input_file, FileMode.Open, FileAccess.Read);

                using var reader = ExcelReaderFactory.CreateReader(stream);
                row_count = reader.RowCount;
                col_count = reader.FieldCount > col_per_page ? col_per_page : reader.FieldCount;
                // Cast to double before dividing; integer division would drop the final, partially filled page
                num_of_pages = (int) Math.Ceiling((double) reader.RowCount / row_per_page);

                dataSet = reader.AsDataSet();

            } catch (Exception e)
            {
                Console.WriteLine($"{e.GetType()} : {e.Message}");
            }

            // Load variables from .env file to Environment
            Env.Load();

            string sn = Environment.GetEnvironmentVariable("FOXIT_SDK_SN") ?? "";
            string key = Environment.GetEnvironmentVariable("FOXIT_SDK_KEY") ?? "";

            // Initialize Foxit library
            ErrorCode error_code = Library.Initialize(sn, key);
            if (error_code != ErrorCode.e_ErrSuccess)
            {
                Console.WriteLine("Library Initialize Error: {0}\n", error_code);
                return;
            }

            try
            {
                using PDFDoc doc = new();

                for (int i = 0; i < num_of_pages; i++)
                {
                    using PDFPage page = doc.InsertPage(i, PDFPage.Size.e_SizeLetter);
                    AddElectronicTable(page, i);
                }

                // Save PDF file
                string output_file = output_path + "TablePDF.pdf";
                doc.SaveAs(output_file, (int)PDFDoc.SaveFlags.e_SaveFlagNoOriginal);
                Console.WriteLine("Done.");
            }
            catch (PDFException e)
            {
                Console.WriteLine(e.Message);
            }

            Library.Release();
        }
    }
}

Run the dotnet run command to see the output of the code. A new PDF document will be created and stored as TablePDF.pdf in the output/pdf directory.

The PDF should look like this:

Changing Table Style

You can change the look of the table, such as its border thickness, the alignment and color of text, and so on.

Changing Border Thickness

You can change the border thickness by changing the line_width property of objects of the TableBorderInfo class. As an example, change the line_width property of the outside_border_left object inside the AddElectronicTable method like this:

csharp
// ...

TableBorderInfo outside_border_left = new()
{
    line_width = 4,
    table_border_style = TableBorderInfo.TableBorderStyle.e_TableBorderStyleSolid
};

// ...

This will increase the thickness of the outside left border as shown in the image below:

Changing Text Alignment and Color

You can alter text alignment and color by changing the text_alignment and text_color properties of the style object in the SetTableTextStyle method. As an example, change the following properties in the SetTableTextStyle method:

csharp
// ...

style.text_alignment = Alignment.e_AlignmentCenter;
style.text_color = 0x00ff00;

// ...

This will center-align the text in the table and change its color to green like in the image below:
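The text_color value used above is a hexadecimal integer. Assuming it is read as 0xRRGGBB, which is consistent with 0x00ff00 producing green, a small helper can build such values from separate red, green, and blue components. The Rgb function is a hypothetical name used for illustration and is not part of the Foxit SDK.

```csharp
using System;

// Hypothetical helper: packs 8-bit red, green, and blue components into a 0xRRGGBB integer.
static int Rgb(byte red, byte green, byte blue) =>
    (red << 16) | (green << 8) | blue;

Console.WriteLine($"0x{Rgb(0, 255, 0):X6}");   // prints 0x00FF00 (green)
Console.WriteLine($"0x{Rgb(255, 0, 0):X6}");   // prints 0xFF0000 (red)
```

Under that assumption, style.text_color = Rgb(0, 255, 0) would be equivalent to the literal 0x00ff00 used in the snippet.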

Conclusion

In this article, you learned how to set up a console application that converts an Excel file to a PDF document. You also saw how to style the table and its text content.

Beyond what you’ve seen here, Foxit’s API exposes a wide range of classes and methods, which makes it a strong option for PDF document manipulation. Foxit’s official API guide and reference are the best places to start if you want to further explore Foxit’s features.

Foxit is the industry leader in PDF SDK technology. Its powerful, robust, and thoroughly documented PDF SDK allows you to add complete PDF functionality to your project. Sign up for a free trial to get the most out of this awesome SDK.

Author: Gideon Idoko