Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.programming
|
|||
|
|||
Excel performance through COM
Hi there,
I am writing COMAddIns and Automation AddIns for Excel 2007 (with C#). However, except a few sites, I cannot find any very helpful resources on performance. 1. Does anyone know where I can find a precise description what accessing the Excel Object Model through a COM interface really does? Am I accessing "proxy objects" thus in fact marshalling calls from my C# code to Excel? 2. When calling myRange.Value2 do I in fact marshal a call through the COM interface or not? 3. Question 2 leads to another question: Is is correct that using Worksheetfunction.sum(myRange, ...) (1 marshalled call for all cells in the range together) is much faster than using a loop and summing the cell values in my own C# code (1 marshalled call for every single cell in the range)? 4. When providing an Excel Range to a Worksheetfunction in my C# code, does Excel actually remarshal the Excel Range and send it back to the Excel application (e.g. copying all the values in the range) or does Excel only send the cell reference (e.g. "A1:B10") to Excel and let Excel then access the Range? Is there any way of doing something in my C# code like Worksheetfunction.sum("A1:B10", System.Reflection.Missing.Value, ...)? 5. Does shimming my COMAddIn/Automation AddIn actually increase or decrease performance (or doesn't it matter)? Thanks for your answers. Fabz |
#2
Posted to microsoft.public.excel.programming
|
|||
|
|||
Excel performance through COM
I can only comment on your Q.3
Most of Excel's worksheet functions are extremely efficient when called in cell formulas. However in code the call to the worksheetfunction library is relatively slow, possibly slower with your C# addin even than in VBA. If processing a small to medium qty of cells it's probably faster if you roll your own function. With a large number though even with delay of calling the function overall speed might be faster. Best way is to test with a simple Sum routine. Read and particularly Write to individual cells is slow. Generally faster to read the entire range to an array, do your stuff, and write back the result myArray = myRange.Value2 ' 2d array loop myArray For questions about Excel's object model you have come to the right place. You may also find it simpler to devise routines in Excel VBA before adapting to C#. For the rest of your questions, if you don't get answers here it might be worth asking in a forum more dedicated to C# addins for Excel. Regards, Peter T "Fabz" wrote in message ... Hi there, I am writing COMAddIns and Automation AddIns for Excel 2007 (with C#). However, except a few sites, I cannot find any very helpful resources on performance. 1. Does anyone know where I can find a precise description what accessing the Excel Object Model through a COM interface really does? Am I accessing "proxy objects" thus in fact marshalling calls from my C# code to Excel? 2. When calling myRange.Value2 do I in fact marshal a call through the COM interface or not? 3. Question 2 leads to another question: Is is correct that using Worksheetfunction.sum(myRange, ...) (1 marshalled call for all cells in the range together) is much faster than using a loop and summing the cell values in my own C# code (1 marshalled call for every single cell in the range)? 4. When providing an Excel Range to a Worksheetfunction in my C# code, does Excel actually remarshal the Excel Range and send it back to the Excel application (e.g. copying all the values in the range) or does Excel only send the cell reference (e.g. "A1:B10") to Excel and let Excel then access the Range? Is there any way of doing something in my C# code like Worksheetfunction.sum("A1:B10", System.Reflection.Missing.Value, ...)? 5. Does shimming my COMAddIn/Automation AddIn actually increase or decrease performance (or doesn't it matter)? Thanks for your answers. Fabz |
#3
Posted to microsoft.public.excel.programming
|
|||
|
|||
Excel performance through COM
Fabz,
Unfortunately the performance of c# addins is extremely poor with Excel. Much but not all of this is due to the COM Interop Layer. I am not a .NET expert but here are the results of my investigations so far As with VBA/VB6 you need to transfer data into and out of Excel in as large a block as possible, this line is the fastest .NET data transfer method I have found so far: vArr = DirectCast(vntTheList.Value2, Object(,)) Last Nonzero function timing comparison VBA 2.2 VB6 1.6 c# 13.6 Vb.net 8.6 here is the Last non-zero VB .NET function Public Function LNZVBNet(ByVal vntTheList As XL.Range) As Double Dim lCounter As Long Dim vArr As Object(,) LNZVBNet = 0 vArr = DirectCast(vntTheList.Value2, Object(,)) For lCounter = UBound(vArr, 1) To LBound(vArr, 1) Step -1 If vArr(lCounter, 1) < 0.0 Then LNZVBNet = vArr(lCounter, 1) Exit For Else End If Next lCounter End Function You will get much better performance (but still a lot slower than VBA/VB6) if you use one of the .Net Excel interface tools that use the C API/XLL interface (Addin Express, ExcelDNA ...) AverageTol function timing VBA 109 VB6 63 C XLL 37 Addin Express Automation VB.net 170 Addin Express XLL VB.net 100 ExcelDNA XLL CVB.Net 81 In VBA if you want to use Excel native worksheet functions it is much faster to keep the range as a range object rather than transfer it into an array. I would imagine that the same is true with .Net (but more so with .Net because the transfer is slower) Charles ___________________________________ The Excel Calculation Site http://www.decisionmodels.com "Fabz" wrote in message ... Hi there, I am writing COMAddIns and Automation AddIns for Excel 2007 (with C#). However, except a few sites, I cannot find any very helpful resources on performance. 1. Does anyone know where I can find a precise description what accessing the Excel Object Model through a COM interface really does? Am I accessing "proxy objects" thus in fact marshalling calls from my C# code to Excel? 2. When calling myRange.Value2 do I in fact marshal a call through the COM interface or not? 3. Question 2 leads to another question: Is is correct that using Worksheetfunction.sum(myRange, ...) (1 marshalled call for all cells in the range together) is much faster than using a loop and summing the cell values in my own C# code (1 marshalled call for every single cell in the range)? 4. When providing an Excel Range to a Worksheetfunction in my C# code, does Excel actually remarshal the Excel Range and send it back to the Excel application (e.g. copying all the values in the range) or does Excel only send the cell reference (e.g. "A1:B10") to Excel and let Excel then access the Range? Is there any way of doing something in my C# code like Worksheetfunction.sum("A1:B10", System.Reflection.Missing.Value, ...)? 5. Does shimming my COMAddIn/Automation AddIn actually increase or decrease performance (or doesn't it matter)? Thanks for your answers. Fabz |
#4
Posted to microsoft.public.excel.programming
|
|||
|
|||
Excel performance through COM
Okay, thanks for your help. I think I will first go the casual route and see
whether there will be performance problems occurring at all. But I am still wondering whether there is a COM specification for Excel. There surely must be something like this somewhere (otherwise, how are you supposed to build efficient AddIns?), but I cannot find it. "Fabz" wrote in message ... Hi there, I am writing COMAddIns and Automation AddIns for Excel 2007 (with C#). However, except a few sites, I cannot find any very helpful resources on performance. 1. Does anyone know where I can find a precise description what accessing the Excel Object Model through a COM interface really does? Am I accessing "proxy objects" thus in fact marshalling calls from my C# code to Excel? 2. When calling myRange.Value2 do I in fact marshal a call through the COM interface or not? 3. Question 2 leads to another question: Is is correct that using Worksheetfunction.sum(myRange, ...) (1 marshalled call for all cells in the range together) is much faster than using a loop and summing the cell values in my own C# code (1 marshalled call for every single cell in the range)? 4. When providing an Excel Range to a Worksheetfunction in my C# code, does Excel actually remarshal the Excel Range and send it back to the Excel application (e.g. copying all the values in the range) or does Excel only send the cell reference (e.g. "A1:B10") to Excel and let Excel then access the Range? Is there any way of doing something in my C# code like Worksheetfunction.sum("A1:B10", System.Reflection.Missing.Value, ...)? 5. Does shimming my COMAddIn/Automation AddIn actually increase or decrease performance (or doesn't it matter)? Thanks for your answers. Fabz |
#5
Posted to microsoft.public.excel.programming
|
|||
|
|||
Excel performance through COM
What is very interesting is the comparison of the following two UDFs
(written as Automation AddIn in C#), iterating through a cell range of 10'000 cells: public double LoopExample1(Excel.Range range) { DateTime before = DateTime.Now; foreach (Excel.Range r in range) { double d = (double)r.Value2 + 1; } DateTime after = DateTime.Now; TimeSpan duration = after - before; return duration.TotalMilliseconds; } public double LoopExample2(Excel.Range range) { DateTime before = DateTime.Now; Object[,] rng = range.get_Value(System.Reflection.Missing.Value) as Object[,]; foreach (Object obj in rng) { double d = (double)obj + 1; } DateTime after = DateTime.Now; TimeSpan duration = after - before; return duration.TotalMilliseconds; } public double LoopExample3(Excel.Range range) { DateTime before = DateTime.Now; foreach (Excel.Range r in range.Cells) { double d = (double)r.Value2 + 1; } DateTime after = DateTime.Now; TimeSpan duration = after - before; return duration.TotalMilliseconds; } On my machine, the first UDF (LoopExample1) requires around 375 milliseconds to finish, the (LoopExample2) around 16 milliseconds and the third (LoopExample3) around 359 milliseconds! Thus, it seems to be much faster to actually use a 2D-Object array instead of going through the implicit enumerator inside the foreach loop. (This website seems to confirm these assumptions: http://dotnetperls.com/excel-interop.) Be aware that range.get_Value does not belong to the official API specs published for the Range interface by MS. |
#6
Posted to microsoft.public.excel.programming
|
|||
|
|||
Excel performance through COM
Yes, as we said, getting the data out of excel into an array in as large a
block a possible is the way to go. But .Net is still very slow compared to VBA and VB6 and C++. Although you can speed up the .Net performance by going through the XLL/C API interface Charles ___________________________________ The Excel Calculation Site http://www.decisionmodels.com "Fabz" wrote in message ... What is very interesting is the comparison of the following two UDFs (written as Automation AddIn in C#), iterating through a cell range of 10'000 cells: public double LoopExample1(Excel.Range range) { DateTime before = DateTime.Now; foreach (Excel.Range r in range) { double d = (double)r.Value2 + 1; } DateTime after = DateTime.Now; TimeSpan duration = after - before; return duration.TotalMilliseconds; } public double LoopExample2(Excel.Range range) { DateTime before = DateTime.Now; Object[,] rng = range.get_Value(System.Reflection.Missing.Value) as Object[,]; foreach (Object obj in rng) { double d = (double)obj + 1; } DateTime after = DateTime.Now; TimeSpan duration = after - before; return duration.TotalMilliseconds; } public double LoopExample3(Excel.Range range) { DateTime before = DateTime.Now; foreach (Excel.Range r in range.Cells) { double d = (double)r.Value2 + 1; } DateTime after = DateTime.Now; TimeSpan duration = after - before; return duration.TotalMilliseconds; } On my machine, the first UDF (LoopExample1) requires around 375 milliseconds to finish, the (LoopExample2) around 16 milliseconds and the third (LoopExample3) around 359 milliseconds! Thus, it seems to be much faster to actually use a 2D-Object array instead of going through the implicit enumerator inside the foreach loop. (This website seems to confirm these assumptions: http://dotnetperls.com/excel-interop.) Be aware that range.get_Value does not belong to the official API specs published for the Range interface by MS. |
#7
Posted to microsoft.public.excel.programming
|
|||
|
|||
Excel performance through COM
I did some more tests, summing up 20'000 doubles:
public double SumExample1(Excel.Range range) { DateTime before = DateTime.Now; Excel.WorksheetFunction wsFunc = range.Application.WorksheetFunction; double d = wsFunc.Sum(range, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value, System.Reflection.Missing.Value); DateTime after = DateTime.Now; TimeSpan duration = after - before; return duration.TotalMilliseconds; } public double SumExample2(Excel.Range range) { DateTime before = DateTime.Now; Object[,] rng = range.get_Value(System.Reflection.Missing.Value) as Object[,]; double d = 0; foreach (Object obj in rng) { d += (double)obj; } DateTime after = DateTime.Now; TimeSpan duration = after - before; return duration.TotalMilliseconds; } The first eaample finished in 47 milliseconds, the second in 16 milliseconds. However, one should be careful with interpreting these results: Worksheetfunctions that compute complicated mathematical results might possibly end up being computed faster than computing the same results in C# code (because of various internal optimizations). However, I have not tried this myself. Fabz "Fabz" wrote in message ... What is very interesting is the comparison of the following two UDFs (written as Automation AddIn in C#), iterating through a cell range of 10'000 cells: public double LoopExample1(Excel.Range range) { DateTime before = DateTime.Now; foreach (Excel.Range r in range) { double d = (double)r.Value2 + 1; } DateTime after = DateTime.Now; TimeSpan duration = after - before; return duration.TotalMilliseconds; } public double LoopExample2(Excel.Range range) { DateTime before = DateTime.Now; Object[,] rng = range.get_Value(System.Reflection.Missing.Value) as Object[,]; foreach (Object obj in rng) { double d = (double)obj + 1; } DateTime after = DateTime.Now; TimeSpan duration = after - before; return duration.TotalMilliseconds; } public double LoopExample3(Excel.Range range) { DateTime before = DateTime.Now; foreach (Excel.Range r in range.Cells) { double d = (double)r.Value2 + 1; } DateTime after = DateTime.Now; TimeSpan duration = after - before; return duration.TotalMilliseconds; } On my machine, the first UDF (LoopExample1) requires around 375 milliseconds to finish, the (LoopExample2) around 16 milliseconds and the third (LoopExample3) around 359 milliseconds! Thus, it seems to be much faster to actually use a 2D-Object array instead of going through the implicit enumerator inside the foreach loop. (This website seems to confirm these assumptions: http://dotnetperls.com/excel-interop.) Be aware that range.get_Value does not belong to the official API specs published for the Range interface by MS. |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
excel sub-par performance | Excel Discussion (Misc queries) | |||
Excel Speed Performance help | Excel Discussion (Misc queries) | |||
Excel 2002 performance | Excel Discussion (Misc queries) | |||
Excel Performance | Excel Programming | |||
Excel Performance Problems | Excel Programming |