Create tagged PDF

The following sections demonstrate how to create a new tagged PDF. Here is what the end result looks like in Adobe Acrobat:

[Screenshot: the new tagged PDF in Adobe Acrobat]

Initialize logical structure

The following code initializes the logical structure:

Document document = new Document();

LogicalStructure logicalStructure = new LogicalStructure();
document.LogicalStructure = logicalStructure;

document.Pages.Add(new Page(600, 800));
Page page = document.Pages[0];

Tag documentTag = new Tag("Document", logicalStructure.RootTag);

Add an image with a caption:

ImageShape image = new ImageShape("logo_pdfkit.gif");
image.Transform = new TranslateTransform(480, 710);
Tag imageTag = new Tag("Figure", documentTag);
image.ParentTag = imageTag;
image.ParentTag.AlternateDescription = "TallComponents logo";
page.Overlay.Add(image);

Tag paragraphTag = new Tag("P", documentTag);

TextShape caption = new TextShape(470, 695, "by TallComponents", new TallComponents.PDF.Fonts.Font(), 10);
caption.ParentTag = new Tag("Caption", imageTag);
page.Overlay.Add(caption);

Add a heading:

TextShape heading = new TextShape(50, 650, "Creating tagged PDF", new TallComponents.PDF.Fonts.Font(), 20);
heading.ParentTag = new Tag("H1", paragraphTag);
page.Overlay.Add(heading);

Add multiline text:

MultilineTextShape exampleText = new MultilineTextShape(50, 640, 500);
exampleText.FirstLineIndentation = 50;
exampleText.Fragments.Add(new Fragment(
  "A simple demonstration of how to create a tagged document. Above is the tagged logo of TallComponents and a tagged heading."));
exampleText.Fragments.Add(new Fragment(
  "Next are some more examples for MultilineTextShape and SimpleXHtmlShape."));
exampleText.ParentTag = new Tag("Span", paragraphTag);
page.Overlay.Add(exampleText);

MultilineTextShape multilineTextShape1 = new MultilineTextShape(50, 580, 200);
multilineTextShape1.ParentTag = new Tag("Div", paragraphTag);

Fragment f11 = new Fragment("MultilineTextShape can be tagged as a Shape.");
multilineTextShape1.Fragments.Add(f11);
Fragment f21 = new Fragment("In this case all fragments in the shape will be under one Tag.");
multilineTextShape1.Fragments.Add(f21);

MultilineTextShape multilineTextShape2 = new MultilineTextShape(320, 580, 200);
Fragment f12 = new Fragment("Or each Fragment in the MultilineTextShape can be tagged.");
f12.ParentTag = new Tag("Span", paragraphTag);
multilineTextShape2.Fragments.Add(f12);
Fragment f22 = new Fragment("In this case all Fragments will have their own Tag.");
f22.ParentTag = new Tag("Span", paragraphTag);
multilineTextShape2.Fragments.Add(f22);

page.Overlay.Add(multilineTextShape1);
page.Overlay.Add(multilineTextShape2);
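To make the hierarchy built so far easier to picture, the following minimal sketch rebuilds it with a small stand-in type and prints it as an indented tree. TagNode and TagTreeDemo are hypothetical names introduced only for this illustration; they are not PDFKit.NET classes, and the model starts at the Document tag, leaving out logicalStructure.RootTag.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for a structure element; NOT a PDFKit.NET class.
public sealed class TagNode
{
    public string Role { get; }
    public List<TagNode> Children { get; } = new List<TagNode>();

    // Mirrors the shape of new Tag(role, parentTag): the new node attaches itself to its parent.
    public TagNode(string role, TagNode parent = null)
    {
        Role = role;
        parent?.Children.Add(this);
    }

    // Renders the subtree, one role per line, indented two spaces per level.
    public string Render(int depth = 0)
    {
        var sb = new StringBuilder();
        sb.Append(' ', depth * 2).Append(Role).Append('\n');
        foreach (TagNode child in Children)
            sb.Append(child.Render(depth + 1));
        return sb.ToString();
    }
}

public static class TagTreeDemo
{
    // Creates the tags in the same order as the tutorial code above.
    public static TagNode Build()
    {
        var document = new TagNode("Document");
        var figure = new TagNode("Figure", document);
        var paragraph = new TagNode("P", document);
        new TagNode("Caption", figure);
        new TagNode("H1", paragraph);
        new TagNode("Span", paragraph);  // exampleText: one tag for the whole shape
        new TagNode("Div", paragraph);   // multilineTextShape1: one tag for all fragments
        new TagNode("Span", paragraph);  // f12: tagged per fragment
        new TagNode("Span", paragraph);  // f22: tagged per fragment
        return document;
    }

    public static void Main()
    {
        Console.Write(Build().Render());
    }
}
```

The model shows the difference between the two approaches: multilineTextShape1 contributes a single Div, while multilineTextShape2 contributes one Span per fragment. The order in which a PDF viewer lists sibling tags may differ from this sketch.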

SimpleXhtmlShape:

The SimpleXhtmlShape auto-tags its inner content. Here is the code that adds the shape:

SimpleXhtmlShape simpleXhtmlShape = new SimpleXhtmlShape();
simpleXhtmlShape.Transform = new TranslateTransform(50, 500);
simpleXhtmlShape.Width = 450;

string xhtml = "<?xml version='1.0'?><body xfa:APIVersion=\"PDFKit:3.0.0.0\" xfa:spec=\"2.1\" xmlns=\"http://www.w3.org/1999/xhtml\" xmlns:xfa=\"http://www.xfa.org/schema/xfa-data/1.0/\">";
xhtml += "<p>A SimpleXhtmlShape is going to be <b>tagged</b> ";
xhtml += "based on the <span style='text-decoration:underline'>html content</span>. ";
xhtml += "It means, each Xml <span>element</span> will have <b>its own</b> Tag.</p>";
xhtml += "</body>";

simpleXhtmlShape.Text = xhtml;
simpleXhtmlShape.ParentTag = new Tag("Div", paragraphTag);
page.Overlay.Add(simpleXhtmlShape);

As can be seen here, a tag hierarchy is built for the XHTML content:

[Screenshot: tag hierarchy of the XHTML content]
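Because the tagging follows the XML elements, it can help to check that the markup is well-formed and to list its elements before assigning it to the shape. The sketch below does this with System.Xml.Linq from the .NET base class library only; XhtmlElements and ElementNames are hypothetical helper names, and the listing shows only which elements the markup contains, not the exact tags PDFKit.NET emits for them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XhtmlElements
{
    // The tutorial's markup, assembled as a single constant.
    public const string Xhtml =
        "<?xml version='1.0'?>" +
        "<body xfa:APIVersion=\"PDFKit:3.0.0.0\" xfa:spec=\"2.1\" " +
        "xmlns=\"http://www.w3.org/1999/xhtml\" " +
        "xmlns:xfa=\"http://www.xfa.org/schema/xfa-data/1.0/\">" +
        "<p>A SimpleXhtmlShape is going to be <b>tagged</b> " +
        "based on the <span style='text-decoration:underline'>html content</span>. " +
        "It means, each Xml <span>element</span> will have <b>its own</b> Tag.</p>" +
        "</body>";

    // Returns the local element names in document order.
    // XDocument.Parse throws an XmlException if the markup is not well-formed.
    public static List<string> ElementNames(string xhtml)
    {
        return XDocument.Parse(xhtml)
            .Descendants()
            .Select(e => e.Name.LocalName)
            .ToList();
    }

    public static void Main()
    {
        // Prints: body, p, b, span, span, b
        Console.WriteLine(string.Join(", ", ElementNames(Xhtml)));
    }
}
```

A malformed string, for example one with an unclosed span, fails at XDocument.Parse, which is an easier place to catch the mistake than in the rendered PDF.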

Role map

The Tagged PDF conventions list standard roles for tags, such as the P, H1, and Figure roles used above. It is also possible to introduce application-specific roles. If you do, you should provide mappings from these custom roles to the nearest standard roles, so that tools that process tagged PDF can handle your roles as well as possible. This can be done as follows:

logicalStructure.RoleMap = new RoleMap();
logicalStructure.RoleMap.Add("mypar", "P");
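To illustrate why the mapping matters, here is a minimal sketch of how a consuming tool might resolve a custom role to a standard one. It is a library-independent model: RoleResolver, Resolve, and the short StandardRoles set are assumptions made for this example, not part of PDFKit.NET, and the set lists only a few of the standard roles.

```csharp
using System;
using System.Collections.Generic;

public static class RoleResolver
{
    // A few standard roles only; the full list is defined by the PDF specification.
    private static readonly HashSet<string> StandardRoles = new HashSet<string>
    {
        "Document", "Div", "P", "H1", "Span", "Figure", "Caption"
    };

    // Follows role-map entries until a standard role is reached.
    // Returns null when the role is unmapped or the mappings form a cycle.
    public static string Resolve(string role, IDictionary<string, string> roleMap)
    {
        var seen = new HashSet<string>();
        string current = role;
        while (!StandardRoles.Contains(current))
        {
            if (!seen.Add(current) || !roleMap.TryGetValue(current, out string next))
                return null;
            current = next;
        }
        return current;
    }

    public static void Main()
    {
        // Same mapping as in the role map above: "mypar" -> "P".
        var roleMap = new Dictionary<string, string> { { "mypar", "P" } };
        Console.WriteLine(Resolve("mypar", roleMap));  // prints "P"
    }
}
```

Without the "mypar" entry, Resolve returns null, which corresponds to a tool that cannot tell what kind of content the custom tag holds.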

Write the PDF

using (FileStream fs = new FileStream("tagged-pdf.pdf", FileMode.Create))
{
  document.Write(fs);
}