Advertisment

How to convert XLS files to XML?

author-image
CIOL Bureau
Updated On
New Update

This a really useful tool for organizations that have done Enterprise Application Integration using Microsoft BizTalk Server. It helps convert XLS files into XML, which can then be integrated into a BizTalk orchestration. A lot of data in organizations lies in Excel files, and it's a challenge to integrate them with other applications.So even if you've integrated multiple business applications using a product like BizTalk Server, taking data from Excel files and using them for these applications can be quite challenging. However, if that data is converted into XML, then it can be read not only by BizTalk, but even by other applications.

Advertisment

In this article, we'll talk about a tool called FarPoint Spread for BizTalk Server 2006.It has a 'Spread for BizTalk' component that gets into Visual Studio 2005 BizTalk project, allowing you to create a schema, which in turn determines mapping of the resulting XML stream. At run time itself the Excel data is parsed and disassembled into XML, and that is mapped with the schema that you have created with the 'Spread for BizTalk' wizard. Hence, the 'Spread for BizTalk server 2006' component has automated the parsing of XLS message stream into XML message stream. Before using the tool, it's assumed that you already have BizTalk Server running in your organization, and you also have Visual Studio 2005. Let's see how to use the 'Spread for BizTalk' wizard.

Direct Hit!

Applies To: Database Administrators and IT Managers

USP: Creating customized schema for BizTalk server

Primary Link: www.fpoint.com?

Google Keywords: Spread for BizTalk

Creating a schema

Take any Excel file. Now, we will extract data from it and convert it into an XML stream. To start with, open Microsoft Visual Studio 2005 and start an empty BizTalk Server project. Add 'Receive pipeline;' this can be done by adding new item to your project. In the toolbox, under the BizTalk Server components, you will see the 'Spreadsheet Disassembler' component. Drag this component and drop it in the disassembler stage of your receive pipeline. Note, this is the component of 'Spread for BizTalk' and you can only set the 'Document Schema' property of this component. 'Document Schema' is used to define the schema that should be used to map the incoming stream.

Advertisment
 
Microsoft XLS data is converted to XML stream when parsed through 'Spread for BizTalk' component

Now, add new schema file to your project, which can be done by adding new item. Select 'Spreadsheet Schema Wizard', which will finally invoke the schema wizard. Browse the Excel file from the wizard that you want to use. You can also set target namespace and the code page, which is basically used to encode the resulting XML. Proceed to the next step. You can choose any additional settings and apply them to the resulting schema. For example, you can remove the hidden columns that are on the Excel sheet and proceed to the next step.

Advertisment

Now, the Excel file is loaded into the wizard and you can select the data that you want to use for conversion into XML stream. Exclude the Header row, Trailer row and the Spacer row, because we are not interested in them. Now, select the data that you want to use, and click on the 'Next' button to proceed further. In the next step, you will find that only the selected data is shown.

Moreover, 'Spread for BizTalk' automatically detects whether there was a header on the Excel sheet and then it automatically includes this header as Field name for the respective columns. You can modify these headers, if you want. It shows you the Field type, Data type and data as well. The 'Field Types' are the standard XML field types that are 'Element' and 'Attribute'. The wizard automatically determines an appropriate data type and sets it for each column. Now, finally click on the 'Finish' button to end the process.

 
Advertisment
Select the XLS data you are interested in, leaving the

Header row, Spacer row, and the Trailer row

When the wizard closes, a schema file is generated. Have a look at the generated schema in the figure below. Suppose you had two sheets in the Excel file and you did not select anything from the other sheet, even then you see that it is mapped. Why?

This is because the spread does not allow you to do destructive reading of the incoming data, and so it needs to persist data throughout the process. And thus, its element name is given as unmapped data.

Advertisment

Now, look at the property of the schema that we have just created. It shows you the input instance filename that was used to create the schema. Right click on the schema file in the solution explorer, and select 'Validate Schema.' In the output window you can find the path of the XML file that is generated during the validation process. What actually happens, while validating an instance is that an XLS file is parsed and XML is created. Then the XML is validated against the schema.

To persist the data throughout the process, the

 'Spread for BizTalk' wizard maps the data to the

 resulting XML stream

Now go back to the receive pipeline, and open the property of the 'Spreadsheet Disassembler', which is in Disassembler stage. Now add the document schema that we have created and rebuild the solution. And finally, deploy the solution to 'BizTalk Server'.

tech-news