Tuesday, 15 March 2016

All Type of File Convertion with Dotnet.FW

* Pdf to PPT
 
 // Load PDF document
            Aspose.Pdf.Document doc = new Aspose.Pdf.Document(@"C:\pdftest\IN_7664539.pdf");
            // Instantiate PptxSaveOptions instance
            Aspose.Pdf.PptxSaveOptions pptx_save = new Aspose.Pdf.PptxSaveOptions();
            // Save the output in PPTX format
            doc.Save("c:/pdftest/IN_7664539.pptx", pptx_save);
 
 
 
 
 
 
----------------------------------------------------- 
 
* Pdf to HTML 
 
public static void SavingOfAllPageHtmlsApart()
{
    Document doc = new Document(@"C:\PDFTest\NimbusSRP.pdf");

    // Pay attention that we put non-existing path here : since we use custon resource processing it won't be in use.
    // If You forget implement some of required saving strategies(CustomHtmlSavingStrategy,CustomResourceSavingStrategy,CustomCssSavingStrategy), then saving will return "Path not found" exception
    string outHtmlFile = @"T:\SomeNonExistingFolder\NimbusSRP.html";

    // Create HtmlSaveOption with custom saving strategies that will do all the saving job
    // in such approach You can split HTML in pages if You will
    HtmlSaveOptions saveOptions = new HtmlSaveOptions();
    saveOptions.SplitIntoPages = true;

    saveOptions.CustomHtmlSavingStrategy = new HtmlSaveOptions.HtmlPageMarkupSavingStrategy(StrategyOfSavingHtml);
    saveOptions.CustomResourceSavingStrategy = new HtmlSaveOptions.ResourceSavingStrategy(CustomSaveOfFontsAndImages);
    saveOptions.CustomStrategyOfCssUrlCreation = new HtmlSaveOptions.CssUrlMakingStrategy(CssUrlMakingStrategy);
    saveOptions.CustomCssSavingStrategy = new HtmlSaveOptions.CssSavingStrategy(CustomSavingOfCss);

    saveOptions.FontSavingMode = HtmlSaveOptions.FontSavingModes.SaveInAllFormats;
    saveOptions.RasterImagesSavingMode = HtmlSaveOptions.RasterImagesSavingModes.AsEmbeddedPartsOfPngPageBackground;
    doc.Save(outHtmlFile, saveOptions);

    Console.WriteLine("Done");
    Console.ReadLine();
}
 
 
*PDF to excel 
// Load PDF document
Aspose.Pdf.Document doc = new Aspose.Pdf.Document(@"C:\input.pdf");
// Instantiate ExcelSave Option object
Aspose.Pdf.ExcelSaveOptions excelsave = new ExcelSaveOptions();
// Save the output in XLS format
doc.Save("c:/Resultant.xls", excelsave); 
------------------------------------------------
** Text file to Pdf file 
// Read the source text file
TextReader tr = new StreamReader(myDir + "Formtext.txt");

// Instantiate a Document object by calling its empty constructor
Document doc = new Document();

// Add a new page in Pages collection of Document
Page page = doc.Pages.Add();

// Create an instance of TextFragmet and pass the text from reader object to its constructor as argument
TextFragment text = new TextFragment(tr.ReadToEnd());
//text.TextState.Font = FontRepository.FindFont("Arial Unicode MS");
                
// Add a new text paragraph in paragraphs collection and pass the TextFragment object
page.Paragraphs.Add(text);

// Save resultant PDF file
doc.Save(myDir+"TexttoPDF.pdf"); 
---------------------------------------------------------
 
XML
<?xml version="1.0" encoding="utf-8" ?>
<Document xmlns="Aspose.Pdf">
  <Page id="mainSection">
    <TextFragment>
      <TextSegment id="boldHtml">segment1</TextSegment>
    </TextFragment>

    <TextFragment>
      <TextSegment id="strongHtml">segment2</TextSegment>
    </TextFragment>
  </Page>
</Document>
 ----------------------------------------------------
C#
// instantiate Document object
Document doc = new Document();
// bind source XML file
doc.BindXml("source.xml");
// get reference of page object from XML
Page page = (Page)doc.GetObjectById("mainSection");
// get reference of first TextSegment with ID boldHtml
TextSegment segment = (TextSegment)doc.GetObjectById("boldHtml");
// get reference of second TextSegment with ID strongHtml
segment = (TextSegment)doc.GetObjectById("strongHtml");
// save resultant PDF file
doc.Save("Resultant.pdf");
 -----------------------------------------------
***PDF to Excel 
C#
// Load PDF document
Aspose.Pdf.Document doc = new Aspose.Pdf.Document(@"C:\input.pdf");
// Instantiate ExcelSave Option object
Aspose.Pdf.ExcelSaveOptions excelsave = new ExcelSaveOptions();
// Save the output in XLS format
doc.Save("c:/Resultant.xls", excelsave);
-----------------------------------------

Schema

The schema is extended with the ability to use external fonts. Furthermore, when converting PDF files to XML, images are represented as separate files in the same directory as the output XML is created. Fonts are represented as TrueType fonts and the corresponding files (filename_fontN.ttf) are created along with the output XML.
XML is formed in accordance with the DTD schema specified below:
XML
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="pdf2xml">
    <xs:complexType>
      <xs:sequence>
        <xs:element type="xs:string" name="title"/>
        <xs:element name="page" maxOccurs="unbounded" minOccurs="0">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="font" maxOccurs="unbounded" minOccurs="0">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="text" maxOccurs="unbounded" minOccurs="0">
                      <xs:complexType>
                        <xs:simpleContent>
                          <xs:extension base="xs:string">
                            <xs:attribute type="xs:float" name="x" use="optional"/>
                            <xs:attribute type="xs:float" name="y" use="optional"/>
                            <xs:attribute type="xs:float" name="width" use="optional"/>
                            <xs:attribute type="xs:float" name="height" use="optional"/>
                          </xs:extension>
                        </xs:simpleContent>
                      </xs:complexType>
                    </xs:element>
                    <xs:element name="img" minOccurs="0">
                      <xs:complexType>
                        <xs:simpleContent>
                          <xs:extension base="xs:string">
                            <xs:attribute type="xs:float" name="x" use="optional"/>
                            <xs:attribute type="xs:float" name="y" use="optional"/>
                            <xs:attribute type="xs:float" name="width" use="optional"/>
                            <xs:attribute type="xs:float" name="height" use="optional"/>
                            <xs:attribute type="xs:string" name="src" use="optional"/>
                          </xs:extension>
                        </xs:simpleContent>
                      </xs:complexType>
                    </xs:element>
                  </xs:sequence>
                  <xs:attribute type="xs:float" name="size" use="optional"/>
                  <xs:attribute type="xs:string" name="face" use="optional"/>
                  <xs:attribute type="xs:string" name="src" use="optional"/>
                  <xs:attribute type="xs:string" name="color" use="optional"/>
                  <xs:attribute type="xs:boolean" name="italic" use="optional"/>
                  <xs:attribute type="xs:boolean" name="bold" use="optional"/>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
            <xs:attribute type="xs:short" name="width" use="optional"/>
            <xs:attribute type="xs:short" name="height" use="optional"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute type="xs:byte" name="pages"/>
    </xs:complexType>
  </xs:element>
</xs:schema>

PDF to XML Conversion

The following code snippet shows the process of converting a PDF file to XML (MobiXML) format.
C#
// Load source PDF file
Document doc = new Document(@"d:\document.pdf");
// Save output in XML format
doc.Save("d:\outFile.xml", SaveFormat.MobiXml);
 
 ----------------------------------------------------

Convert PDF to PPTX

Skip to end of metadata
Go to start of metadata
We have an API named Aspose.Slides which offers the feature to create as well as manipulate PPT/PPTX presentations. This API also provides the feature to convert PPT/PPTX files to PDF format. Recently we received requirement from many of our customers to support the capability of PDF transformation to PPTX format. Starting release of Aspose.Pdf for .NET 10.3.0, we have introduced a feature to transform PDF documents to PPTX format. During this conversion, the individual pages of the PDF file are converted to separate slide in PPTX file.
During PDF to PPTX conversion, the text is rendered as Text where you can select/update it, instead its rendered as image. Please note that in order to convert PDF files to PPTX format, Aspose.Pdf provides a class named PptxSaveOptions. An object of the PptxSaveOptions class is passed as a second argument to the Document.Save(..) method. The following code snippet shows the process for converting PDF file into PPTX format.

C#
// Load PDF document
Aspose.Pdf.Document doc = new Aspose.Pdf.Document(@"C:\pdftest\IN_7664539.pdf");
// Instantiate PptxSaveOptions instance
Aspose.Pdf.PptxSaveOptions pptx_save = new Aspose.Pdf.PptxSaveOptions();
// Save the output in PPTX format
doc.Save("c:/pdftest/IN_7664539.pptx", pptx_save);
 --------------------------------------------