`HTMLWorker` has been deprecated in favor of `XMLWorker`. Here is a working example, tested with a snippet of HTML like the one in your question:
StringReader html = new StringReader(@"
<div style='font-size: 18pt; font-weight: bold;'>
Mouser Electronics <br />Authorized Distributor</div><br /> <br />
<div style='font-size: 14pt;'>Click to View Pricing, Inventory, Delivery &amp; Lifecycle Information:
</div>
<br />
<div>
<table>
<tr><td></td><td>
<a href='http://www.mouser.com/access/?pn=78211-009'
style='color: Blue; font-size: 10pt; text-decoration: underline;'>78211-009</a></td></tr>
</table></div>
");
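using (Document document = new Document()) {
    // STREAM is a placeholder for any writable output Stream
    PdfWriter writer = PdfWriter.GetInstance(document, STREAM);
    document.Open();
    // parse the markup and add the resulting elements to the document
    XMLWorkerHelper.GetInstance().ParseXHtml(
        writer, document, html
    );
}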
When using `XMLWorker` you need to use well-formed HTML; it’s an XML parser, after all. The sample HTML from your question doesn’t have closing `<a>` or `<br>` tags. An HTML parser like HtmlAgilityPack will fix those problems, and turn this:
<div><img src="https://stackoverflow.com/questions/12113425/a.gif"><br><hr></div>
into this:
<div><img src="https://stackoverflow.com/questions/12113425/a.gif" /><br /><hr /></div>
with only a few lines of code:
var hDocument = new HtmlDocument()
{
    // write empty elements as self-closing tags, e.g. <br />
    OptionWriteEmptyNodes = true,
    // close any tags that are still open at the end of the input
    OptionAutoCloseOnEnd = true
};
hDocument.LoadHtml("<div><img src=\"https://stackoverflow.com/questions/12113425/a.gif\"><br><hr></div>");
var closedTags = hDocument.DocumentNode.WriteTo();
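The two steps can be chained: clean the markup with HtmlAgilityPack, then hand the result to `XMLWorker`. The following is only a rough sketch. The `HtmlToPdf` class and its `CloseTags` and `Convert` methods are names made up for illustration, and it assumes iTextSharp 5.x, XMLWorker, and HtmlAgilityPack are all referenced in the project:

```csharp
using System.IO;
using HtmlAgilityPack;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;

// Hypothetical helper class, not part of either library.
public static class HtmlToPdf
{
    // Close unclosed tags so the markup is well-formed enough for XMLWorker.
    public static string CloseTags(string rawHtml)
    {
        var hDocument = new HtmlDocument
        {
            OptionWriteEmptyNodes = true,
            OptionAutoCloseOnEnd = true
        };
        hDocument.LoadHtml(rawHtml);
        return hDocument.DocumentNode.WriteTo();
    }

    // Render the cleaned markup into an in-memory PDF and return its bytes.
    public static byte[] Convert(string rawHtml)
    {
        using (var stream = new MemoryStream())
        {
            using (var document = new Document())
            {
                PdfWriter writer = PdfWriter.GetInstance(document, stream);
                document.Open();
                using (var reader = new StringReader(CloseTags(rawHtml)))
                {
                    XMLWorkerHelper.GetInstance().ParseXHtml(writer, document, reader);
                }
            }
            // ToArray() still works after the writer has closed the stream.
            return stream.ToArray();
        }
    }
}
```

With that in place, something like `File.WriteAllBytes("out.pdf", HtmlToPdf.Convert(html));` would write the result to disk. Returning a byte array rather than writing to a fixed path is just one design choice; it keeps the helper usable for both files and HTTP responses.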
`XMLWorker` is available as a NuGet package, or as a separate download at SourceForge. See here for more advanced usage of `XMLWorker`.