* PDF to PPTX
// Load PDF document
Aspose.Pdf.Document doc = new Aspose.Pdf.Document(@"C:\pdftest\IN_7664539.pdf");
// Instantiate PptxSaveOptions instance
Aspose.Pdf.PptxSaveOptions pptx_save = new Aspose.Pdf.PptxSaveOptions();
// Save the output in PPTX format
doc.Save("c:/pdftest/IN_7664539.pptx", pptx_save);
-----------------------------------------------------
* PDF to HTML
public static void SavingOfAllPageHtmlsApart()
{
    Document doc = new Document(@"C:\PDFTest\NimbusSRP.pdf");

    // Note that the output path below does not exist: because custom resource
    // processing is used, the path is never written to.
    // If you forget to implement one of the required saving strategies
    // (CustomHtmlSavingStrategy, CustomResourceSavingStrategy, CustomCssSavingStrategy),
    // saving will throw a "Path not found" exception.
    string outHtmlFile = @"T:\SomeNonExistingFolder\NimbusSRP.html";

    // Create HtmlSaveOptions with custom saving strategies that do all the saving work.
    // With this approach you can also split the HTML into pages if you wish.
    // StrategyOfSavingHtml, CustomSaveOfFontsAndImages, CssUrlMakingStrategy and
    // CustomSavingOfCss are handler methods that must be defined elsewhere in this class.
    HtmlSaveOptions saveOptions = new HtmlSaveOptions();
    saveOptions.SplitIntoPages = true;
    saveOptions.CustomHtmlSavingStrategy = new HtmlSaveOptions.HtmlPageMarkupSavingStrategy(StrategyOfSavingHtml);
    saveOptions.CustomResourceSavingStrategy = new HtmlSaveOptions.ResourceSavingStrategy(CustomSaveOfFontsAndImages);
    saveOptions.CustomStrategyOfCssUrlCreation = new HtmlSaveOptions.CssUrlMakingStrategy(CssUrlMakingStrategy);
    saveOptions.CustomCssSavingStrategy = new HtmlSaveOptions.CssSavingStrategy(CustomSavingOfCss);
    saveOptions.FontSavingMode = HtmlSaveOptions.FontSavingModes.SaveInAllFormats;
    saveOptions.RasterImagesSavingMode = HtmlSaveOptions.RasterImagesSavingModes.AsEmbeddedPartsOfPngPageBackground;

    doc.Save(outHtmlFile, saveOptions);

    Console.WriteLine("Done");
    Console.ReadLine();
}
-----------------------------------------------------
* PDF to Excel
// Load PDF document
Aspose.Pdf.Document doc = new Aspose.Pdf.Document(@"C:\input.pdf");
// Instantiate ExcelSaveOptions object
Aspose.Pdf.ExcelSaveOptions excelsave = new Aspose.Pdf.ExcelSaveOptions();
// Save the output in XLS format
doc.Save("c:/Resultant.xls", excelsave);
------------------------------------------------
* Text file to PDF
// Read the source text file (myDir is the working folder, defined elsewhere)
using (TextReader tr = new StreamReader(myDir + "Formtext.txt"))
{
    // Instantiate a Document object by calling its empty constructor
    Document doc = new Document();
    // Add a new page to the Pages collection of the Document
    Page page = doc.Pages.Add();
    // Create a TextFragment and pass the text from the reader to its constructor
    TextFragment text = new TextFragment(tr.ReadToEnd());
    // text.TextState.Font = FontRepository.FindFont("Arial Unicode MS");
    // Add the TextFragment to the Paragraphs collection of the page
    page.Paragraphs.Add(text);
    // Save the resultant PDF file
    doc.Save(myDir + "TexttoPDF.pdf");
}
---------------------------------------------------------
XML
<?xml version="1.0" encoding="utf-8" ?>
<Document xmlns="Aspose.Pdf">
  <Page id="mainSection">
    <TextFragment>
      <TextSegment id="boldHtml">segment1</TextSegment>
    </TextFragment>
    <TextFragment>
      <TextSegment id="strongHtml">segment2</TextSegment>
    </TextFragment>
  </Page>
</Document>
----------------------------------------------------
C#
// Instantiate Document object
Document doc = new Document();
// Bind the source XML file
doc.BindXml("source.xml");
// Get a reference to the Page object with ID mainSection
Page page = (Page)doc.GetObjectById("mainSection");
// Get a reference to the first TextSegment, with ID boldHtml
TextSegment segment = (TextSegment)doc.GetObjectById("boldHtml");
// Get a reference to the second TextSegment, with ID strongHtml
segment = (TextSegment)doc.GetObjectById("strongHtml");
// Save the resultant PDF file
doc.Save("Resultant.pdf");
-----------------------------------------------
Schema
The schema is extended with the ability to use external fonts. Furthermore, when converting PDF files to XML, images are written as separate files in the same directory as the output XML. Fonts are written as TrueType fonts, and the corresponding files (filename_fontN.ttf) are created alongside the output XML. The XML is formed in accordance with the XML Schema (XSD) specified below:
XML
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="pdf2xml">
    <xs:complexType>
      <xs:sequence>
        <xs:element type="xs:string" name="title"/>
        <xs:element name="page" maxOccurs="unbounded" minOccurs="0">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="font" maxOccurs="unbounded" minOccurs="0">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="text" maxOccurs="unbounded" minOccurs="0">
                      <xs:complexType>
                        <xs:simpleContent>
                          <xs:extension base="xs:string">
                            <xs:attribute type="xs:float" name="x" use="optional"/>
                            <xs:attribute type="xs:float" name="y" use="optional"/>
                            <xs:attribute type="xs:float" name="width" use="optional"/>
                            <xs:attribute type="xs:float" name="height" use="optional"/>
                          </xs:extension>
                        </xs:simpleContent>
                      </xs:complexType>
                    </xs:element>
                    <xs:element name="img" minOccurs="0">
                      <xs:complexType>
                        <xs:simpleContent>
                          <xs:extension base="xs:string">
                            <xs:attribute type="xs:float" name="x" use="optional"/>
                            <xs:attribute type="xs:float" name="y" use="optional"/>
                            <xs:attribute type="xs:float" name="width" use="optional"/>
                            <xs:attribute type="xs:float" name="height" use="optional"/>
                            <xs:attribute type="xs:string" name="src" use="optional"/>
                          </xs:extension>
                        </xs:simpleContent>
                      </xs:complexType>
                    </xs:element>
                  </xs:sequence>
                  <xs:attribute type="xs:float" name="size" use="optional"/>
                  <xs:attribute type="xs:string" name="face" use="optional"/>
                  <xs:attribute type="xs:string" name="src" use="optional"/>
                  <xs:attribute type="xs:string" name="color" use="optional"/>
                  <xs:attribute type="xs:boolean" name="italic" use="optional"/>
                  <xs:attribute type="xs:boolean" name="bold" use="optional"/>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
            <xs:attribute type="xs:short" name="width" use="optional"/>
            <xs:attribute type="xs:short" name="height" use="optional"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute type="xs:byte" name="pages"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
PDF to XML Conversion
The following code snippet shows the process of converting a PDF file to XML (MobiXML) format.
C#
// Load source PDF file
Document doc = new Document(@"d:\document.pdf");
// Save output in XML format (verbatim string, so the backslash is not treated as an escape)
doc.Save(@"d:\outFile.xml", SaveFormat.MobiXml);
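Once a file has been saved this way, its text runs can be read back with any XML parser. The following is a minimal sketch, assuming the output follows the schema given in the Schema section (a pdf2xml root containing page, font and text elements, with no namespace). Pdf2XmlReader, TextRun and ReadTextRuns are hypothetical names used for illustration only; they are not part of Aspose.Pdf, and the code relies solely on System.Xml.Linq.

```csharp
using System.Collections.Generic;
using System.Xml.Linq;

// Hypothetical helper (not part of Aspose.Pdf): collects the <text> elements
// of an XML document that follows the pdf2xml schema.
public static class Pdf2XmlReader
{
    // One <text> element plus the face of its enclosing <font>.
    public sealed class TextRun
    {
        public int Page { get; set; }       // 1-based position of the <page> element
        public string Face { get; set; }    // "face" attribute of the parent <font>, or null
        public float X { get; set; }        // "x" attribute, 0 when absent
        public float Y { get; set; }        // "y" attribute, 0 when absent
        public string Content { get; set; } // text inside the element
    }

    public static List<TextRun> ReadTextRuns(string xml)
    {
        var runs = new List<TextRun>();
        XElement root = XDocument.Parse(xml).Root;
        int pageNumber = 0;

        foreach (XElement page in root.Elements("page"))
        {
            pageNumber++;
            foreach (XElement font in page.Elements("font"))
            {
                // The cast returns null when the optional attribute is missing.
                string face = (string)font.Attribute("face");
                foreach (XElement text in font.Elements("text"))
                {
                    runs.Add(new TextRun
                    {
                        Page = pageNumber,
                        Face = face,
                        X = (float?)text.Attribute("x") ?? 0f,
                        Y = (float?)text.Attribute("y") ?? 0f,
                        Content = text.Value
                    });
                }
            }
        }
        return runs;
    }
}
```

For example, Pdf2XmlReader.ReadTextRuns(System.IO.File.ReadAllText(@"d:\outFile.xml")) would return one TextRun per text element, in document order. Image elements (img) are ignored in this sketch.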
----------------------------------------------------
Convert PDF to PPTX
Aspose.Slides is our API for creating and manipulating PPT/PPTX presentations; it can also convert PPT/PPTX files to PDF format. Many customers have recently asked for the reverse capability: converting PDF to PPTX. Starting with Aspose.Pdf for .NET 10.3.0, PDF documents can be converted to PPTX format. During this conversion, each page of the PDF file becomes a separate slide in the PPTX file.
C#
// Load PDF document
Aspose.Pdf.Document doc = new Aspose.Pdf.Document(@"C:\pdftest\IN_7664539.pdf");
// Instantiate PptxSaveOptions instance
Aspose.Pdf.PptxSaveOptions pptx_save = new Aspose.Pdf.PptxSaveOptions();
// Save the output in PPTX format
doc.Save("c:/pdftest/IN_7664539.pptx", pptx_save);
--------------------------------------------