Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.programming
|
|||
|
|||
Automating Linear Interpolation
Hi there - thanks for looking, I have a large dataset with various columns of data for global climate (meteorological) stations, however the dataset is not complete - it has gaps for some years where data has not been collected/the equipment failed. I am hoping to be able to interpolate the data in order to fill these gaps... via a simple linear method. For example, if there are three cells as shown below, then the gap in cell B can be filled by taking the average of cells A and C to equal 50 etc etc. A = 40 B = [GAP] (=50 when interpolated/averaged) C = 60 There are also incidents where there are a series of gaps in succession, eg. as follows: A = 10 B = GAP C = GAP D = GAP E = 50 I understand that it might be difficult to fill the gaps whereby the values increment in turn (ie. = 20, 30, 40 respectively in the example above) - but would it be possible to design an automated solution whereby the missing data is filled by doing, say, (50+10)/2 = 30... and '30' is used to fill the gaps from B-D (there are not necessary 3 blanks in each case, this ranges considerably). The final piece of the jigsaw is in averaging the values correctly when the climate station changes and there are gaps for both stations in succession, as demonstrated in the simplified example below: Station 1: A = 10 Station 1: B = 15 Station 1: C = 20 Station 1: D = GAP (=25) Station 1: E = 30 Station 1: F = GAP (=35) Station 1: G = GAP (=40) Station 2: A = GAP (=4) Station 2: B = GAP (=5) Station 2: C = 6 Station 2: D = 7 Station 2: E = GAP (=8) Station 2: F = 9 Station 2: G = 10 In the case of Station 1: F-G and Station 2: A-B, an average is not required of the values above and below the gap present (ie. not taking the average of 30 and 6 and filling each of the gaps with 13), but is required only for the data in that particular station. So would there be a way of informing the to be wary of such changes in station reference? (Station reference number is in column A in my dataset) Is there a way of designing (a macro in Visual Basic?) to automate these procedures in Excel 2002? - as doing it manually would be unfeasible due to the nature of the dataset size. Incidently, data which requires interpolation is in columns F to M (inclusive), and is represented as a blank cell. Interpolation is required vertically down the columns, and not between rows. Thanks very much for looking at this - I would be very grateful if someone manages to manufacture a solution. Thanks once more, Steve Murray -- smurray444 ------------------------------------------------------------------------ smurray444's Profile: http://www.excelforum.com/member.php...o&userid=28956 View this thread: http://www.excelforum.com/showthread...hreadid=504517 |
#2
Posted to microsoft.public.excel.programming
|
|||
|
|||
Automating Linear Interpolation
Steve,
Assuming that your step in the Time dimension is always consistent, your data is sorted based on the identifier (station 1, etc.), and that your identifier is in column A - try the macro below, on a copy of your data. HTH, Bernie MS Excel MVP Sub FillInBlanks() Dim myCell As Range Dim i As Integer Dim myRange As Range Set myRange = Range("A:A") For i = 6 To 13 ' Interpolation First For Each myCell In Columns(i).SpecialCells(xlCellTypeBlanks) With myRange If (.Cells(myCell.Row).Value = _ .Cells(myCell.End(xlUp).Row).Value) And _ (.Cells(myCell.Row).Value = _ .Cells(myCell.End(xlDown).Row).Value) Then myCell.Value = myCell.End(xlDown).Value + _ (myCell.Row - myCell.End(xlDown).Row) _ * (myCell.End(xlUp).Value - myCell.End(xlDown).Value) / _ (myCell.End(xlUp).Row - myCell.End(xlDown).Row) End If End With Next myCell 'Now Extrapolation For Each myCell In Columns(i).SpecialCells(xlCellTypeBlanks) 'Interpolate if a cell above has been filled in With myRange If (.Cells(myCell.Row).Value = _ .Cells(myCell.End(xlUp).Row).Value) And _ (.Cells(myCell.Row).Value = _ .Cells(myCell.End(xlDown).Row).Value) Then myCell.Value = myCell.End(xlDown).Value + _ (myCell.Row - myCell.End(xlDown).Row) _ * (myCell.End(xlUp).Value - myCell.End(xlDown).Value) / _ (myCell.End(xlUp).Row - myCell.End(xlDown).Row) ElseIf (.Cells(myCell.Row).Value < _ .Cells(myCell.End(xlUp).Row).Value) Then myCell.Value = myCell.End(xlDown).Value + _ (myCell.Row - myCell.End(xlDown).Row) _ * (myCell.End(xlDown)(2).Value - myCell.End(xlDown).Value) / _ (myCell.End(xlDown)(2).Row - myCell.End(xlDown).Row) Else myCell.Value = myCell.End(xlUp).Value + _ (myCell.Row - myCell.End(xlUp).Row) _ * (myCell.End(xlUp)(0).Value - myCell.End(xlUp).Value) / _ (myCell.End(xlUp)(0).Row - myCell.End(xlUp).Row) End If End With Next myCell Next i End Sub "smurray444" wrote in message ... Hi there - thanks for looking, I have a large dataset with various columns of data for global climate (meteorological) stations, however the dataset is not complete - it has gaps for some years where data has not been collected/the equipment failed. I am hoping to be able to interpolate the data in order to fill these gaps... via a simple linear method. For example, if there are three cells as shown below, then the gap in cell B can be filled by taking the average of cells A and C to equal 50 etc etc. A = 40 B = [GAP] (=50 when interpolated/averaged) C = 60 There are also incidents where there are a series of gaps in succession, eg. as follows: A = 10 B = GAP C = GAP D = GAP E = 50 I understand that it might be difficult to fill the gaps whereby the values increment in turn (ie. = 20, 30, 40 respectively in the example above) - but would it be possible to design an automated solution whereby the missing data is filled by doing, say, (50+10)/2 = 30... and '30' is used to fill the gaps from B-D (there are not necessary 3 blanks in each case, this ranges considerably). The final piece of the jigsaw is in averaging the values correctly when the climate station changes and there are gaps for both stations in succession, as demonstrated in the simplified example below: Station 1: A = 10 Station 1: B = 15 Station 1: C = 20 Station 1: D = GAP (=25) Station 1: E = 30 Station 1: F = GAP (=35) Station 1: G = GAP (=40) Station 2: A = GAP (=4) Station 2: B = GAP (=5) Station 2: C = 6 Station 2: D = 7 Station 2: E = GAP (=8) Station 2: F = 9 Station 2: G = 10 In the case of Station 1: F-G and Station 2: A-B, an average is not required of the values above and below the gap present (ie. not taking the average of 30 and 6 and filling each of the gaps with 13), but is required only for the data in that particular station. So would there be a way of informing the to be wary of such changes in station reference? (Station reference number is in column A in my dataset) Is there a way of designing (a macro in Visual Basic?) to automate these procedures in Excel 2002? - as doing it manually would be unfeasible due to the nature of the dataset size. Incidently, data which requires interpolation is in columns F to M (inclusive), and is represented as a blank cell. Interpolation is required vertically down the columns, and not between rows. Thanks very much for looking at this - I would be very grateful if someone manages to manufacture a solution. Thanks once more, Steve Murray -- smurray444 ------------------------------------------------------------------------ smurray444's Profile: http://www.excelforum.com/member.php...o&userid=28956 View this thread: http://www.excelforum.com/showthread...hreadid=504517 |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
Linear Interpolation | Excel Worksheet Functions | |||
Linear Interpolation UDF | Excel Discussion (Misc queries) | |||
Automating Linear Interpolation | Excel Discussion (Misc queries) | |||
Linear Interpolation | Excel Programming | |||
linear interpolation | Excel Discussion (Misc queries) |