(NET) NET (2018)

Processing Office Open XML (xlsx) files by EPPlus.

There are some way to read Excel file - by Microsoft.Jet.OLEDB.4.0 & Microsoft.ACE.OLEDB - Cайтік з web-сервісом, спеціфічним membership-провайдером та даними Excel. and by native Excel - Шаблон кода для роботи з MS Excel.. But in this page I describe modern and best way - by EPPlus.

Write XLSX

There are two direction to use EPPlus. First direction is to process data in Datatable structure, and eventually save data to XLSX format. In this case you can be read firstly CSV by Processing CSV files by CsvHelper to Datatable. Below you can see this pattern.


   7:  Imports OfficeOpenXml
 ...   
 543:      Sub SaveDtToExcel(DTF As DataTable, ExcelFullName As FileInfo)
 544:          Dim Excel As ExcelPackage = New ExcelPackage()
 545:          Dim WorkSheet As ExcelWorksheet = Excel.Workbook.Worksheets.Add("Sheet 1")
 546:          WorkSheet.Cells("A1").LoadFromDataTable(DTF, True)
 547:          Excel.SaveAs(ExcelFullName)
 548:      End Sub
 ...   
 669:      Private Sub BGW5_DoWork(sender As Object, e As DoWorkEventArgs) Handles BGW5.DoWork
 670:          Dim DT1 As DataTable = ReadCSV(IO.Path.Combine(SelectedTempPath, WestlawSelectedListViewFileName))
 671:          Dim DV2 As DataView = DT1.DefaultView
 672:          DV2.Sort() = e.Argument & " asc"
 673:          Dim DT2 As DataTable = DV2.ToTable
 674:          '
 675:          Dim AttachmentFileNameInfo As New FileInfo(IO.Path.Combine(SelectedTempPath, WestlawSelectedListViewFileName & ".xlsx"))
 676:          SaveDtToExcel(DT2, AttachmentFileNameInfo)
 677:          WestLawAttachmentFileName = AttachmentFileNameInfo.FullName
 678:      End Sub

Read XLSX as untyped values.

In this form I read Excel file, show something fields and processing another fields.



This is template of this form.


   1:  Imports System.IO
   2:  Imports OfficeOpenXml
   3:  Imports Microsoft.VisualBasic
   4:   
   5:  Public Class SearchAddress
   6:      Public Property FileName As String
   7:      Dim FI As FileInfo
   8:      Dim WorkBook As ExcelWorkbook
   9:      Dim WorkSheet As ExcelWorksheet
  10:      Dim ColCount As Integer
  11:      Dim RowCount As Integer
  12:      Dim CurRow As Integer
  13:   
  14:   
  15:      Private Sub SearchAddress_Load(sender As Object, e As EventArgs) Handles Me.Load
 ...   
  22:              FI = New FileInfo(FileName)
  23:              Dim Package = New ExcelPackage(FI)
  24:              WorkBook = Package.Workbook
  25:              WorkSheet = WorkBook.Worksheets.First()
  26:              'For i As Integer = 1 To RowCount
  27:              '    Debug.Print(WorkSheet.Cells(i, 1).Value.ToString())
  28:              'Next
  29:              ColCount = WorkSheet.Dimension.End.Column
  30:              RowCount = WorkSheet.Dimension.End.Row
  31:              CurRow = 2
  32:              ShowRowInfo(2)
 ...   
  38:      End Sub
 ...   
  40:      Sub ShowRowInfo(RowNumber As Integer)
  41:          CurRowsLabel.Text = GetCurRowLabel(CurRow)
 ...   
  48:          AddressTextBox.SelectText(StreetTextBox.Text & "," & CityTextBox.Text & "," & StateComboBox.SelectedValue & "," & ZipTextBox.Text)
  49:      End Sub
 ...   
  51:      Dim RepeatRows As List(Of String)
  52:      Function GetCurRowLabel(RowNumber As Integer) As String
  53:          RepeatRows = New List(Of String)
  54:          Dim Str1 As New Text.StringBuilder
  55:          For i As Integer = 1 To RowCount
  56:              If WorkSheet.Cells(i, 1).Value = WorkSheet.Cells(RowNumber, 1).Value Then
  57:                  RepeatRows.Add(i)
  58:                  Str1.Append(i & ",")
  59:              End If
  60:          Next
  61:          Return Microsoft.VisualBasic.Left(Str1.ToString, Str1.Length - 1) & " / " & RowCount
  62:      End Function

Read XLSX as typed data.

Pay attention how I can access to untyped data - only as object - WorkSheet.Cells(RowNumber, ColNumber).Value, if you have access to typed data, you need sophisticated converter object value to named typed fields.

In this case you need firstly define class ExcelData with definition of each fields of you Excel file. Then create extension below and call it:


  82:  Dim XSLData = WorkSheet.Tables.First().ConvertTableToObjects(of ExcelData)()

   1:  Imports OfficeOpenXml
   2:  Imports OfficeOpenXml.Table
   3:  Imports System.Runtime.CompilerServices
   4:   
   5:  Module ExcelExtensions
   6:   
   7:      <Extension()>
   8:      Public Function ConvertTableToObjects(Of T As New)(ByVal Table As ExcelTable) As IEnumerable(Of T)
   9:          Dim ConvertDateTime = New Func(Of Double, DateTime)(Function(ExcelDate)
  10:                                                                  If ExcelDate < 1 Then Throw New ArgumentException("Excel dates cannot be smaller than 0.")
  11:                                                                  Dim DateOfReference = New DateTime(1900, 1, 1)
  12:                                                                  If ExcelDate > 60.0R Then
  13:                                                                      ExcelDate = ExcelDate - 2
  14:                                                                  Else
  15:                                                                      ExcelDate = ExcelDate - 1
  16:                                                                  End If
  17:                                                                  Return DateOfReference.AddDays(ExcelDate)
  18:                                                              End Function)
  19:          Dim Tprops = (New T()).GetType().GetProperties().ToList()
  20:          Dim Groups = Table.WorkSheet.Cells(Table.Address.Start.Row, Table.Address.Start.Column, Table.Address.End.Row, Table.Address.End.Column).GroupBy(Function(Cell) Cell.Start.Row).ToList()
  21:          Dim Types = Groups.Skip(1).First().Select(Function(Rcell) Rcell.Value.GetType()).ToList()
  22:          Dim Colnames = Groups.First().Select(Function(Hcell, idx) New With {.Name = Hcell.Value.ToString(), .index = idx}).Where(Function(O) Tprops.Select(Function(P) P.Name).Contains(O.Name)).ToList()
  23:          Dim Rowvalues = Groups.Skip(1).Select(Function(Cg) Cg.Select(Function(c) c.Value).ToList())
  24:          Dim Collection = Rowvalues.Select(Function(Row)
  25:                                                Dim Tnew = New T()
  26:                                                Colnames.ForEach(Function(ColName)
  27:                                                                     Dim Val = Row(ColName.index)
  28:                                                                     Dim Type = Types(ColName.index)
  29:                                                                     Dim Prop = Tprops.First(Function(p) p.Name = ColName.Name)
  30:                                                                     If Type = GetType(Double) Then
  31:                                                                         Dim UnboxedVal = CDbl(Val)
  32:                                                                         If Prop.PropertyType = GetType(Int32) Then
  33:                                                                             Prop.SetValue(Tnew, CInt(UnboxedVal))
  34:                                                                         ElseIf Prop.PropertyType = GetType(Double) Then
  35:                                                                             Prop.SetValue(Tnew, UnboxedVal)
  36:                                                                         ElseIf Prop.PropertyType = GetType(DateTime) Then
  37:                                                                             Prop.SetValue(Tnew, ConvertDateTime(UnboxedVal))
  38:                                                                         Else
  39:                                                                             Throw New NotImplementedException(String.Format("Type '{0}' not implemented yet!", Prop.PropertyType.Name))
  40:                                                                         End If
  41:                                                                     Else
  42:                                                                         Prop.SetValue(Tnew, Val)
  43:                                                                     End If
  44:                                                                 End Function)
  45:                                                Return Tnew
  46:                                            End Function)
  47:          Return Collection
  48:      End Function
  49:  End Module

Last function is not a my classic VB-style of programming, I publishing this style of VB.NET only for fun.

More details.

More details examples to use EPplus (extension function and check if needed column is exist) please see in page Multithreading Parsers with Parallel, CsQuery, Newtonsoft.Json, OfficeOpenXml and IAsyncResult/AsyncCallback..



Comments ( )
<00>  <01>  <02>  <03>  <04>  <05>  <06>  <07>  <08>  <09>  <10>  <11>  <12>  <13>  <14>  <15>  <16>  <17>  <18>  <19
Link to this page: http://www.vb-net.com/EPPlus-Office-Open-XML/index.htm
<SITEMAP>  <MVC>  <ASP>  <NET>  <DATA>  <KIOSK>  <FLEX>  <SQL>  <NOTES>  <LINUX>  <MONO>  <FREEWARE>  <DOCS>  <ENG>  <MAIL ME>  <ABOUT ME>  < THANKS ME>