ExcelBanter

ExcelBanter (https://www.excelbanter.com/)
-   Excel Programming (https://www.excelbanter.com/excel-programming/)
-   -   Data Transformation (https://www.excelbanter.com/excel-programming/344196-data-transformation.html)

zaisaki[_3_]

Data Transformation
 

Hi All,
I have a data set with variables id, a, b, c, d and looks like:


1 2 3 5 1
1 5 6 5 3
1 2 4 2 4
2 1 3 4 5
2 5 6 2 3
2 4 5 2 1


I would like to summarize data for each id and create a new variable
representing each code occuring in variables a, b and c.


Essentially the new data set should have variables id, v1 - v6.
v1 represents the number of times code 1 occurs in variables a, b and

in the original data set, and so on for v2 - v6.


The data set would like;
1 1 3 2 2 3 1
2 2 2 2 2 1 1


Your help is much appreciated.


Thanks very much,
Zai Sak

--
zaisak
-----------------------------------------------------------------------
zaisaki's Profile: http://www.excelforum.com/member.php...fo&userid=1375
View this thread: http://www.excelforum.com/showthread.php?threadid=48039


Tom Ogilvy

Data Transformation
 
Sub AB()
Dim rng As Range
Dim cell As Range
Dim arr() As Long
Dim ub As Long
Set rng = Range("A1").CurrentRegion.Resize(, 4).Cells
ub = Application.Max(rng.Columns(1))
ReDim arr(1 To ub, 0 To 6)
For Each cell In rng
If cell.Column = 1 Then
Id = cell.Value
arr(Id, 0) = Id
Else
arr(Id, cell.Value) = arr(Id, cell.Value) + 1
End If
Next
Range("H1").Resize(ub, 7).Value = arr

End Sub

--
Regards,
Tom Ogilvy




"zaisaki" wrote in
message ...

Hi All,
I have a data set with variables id, a, b, c, d and looks like:


1 2 3 5 1
1 5 6 5 3
1 2 4 2 4
2 1 3 4 5
2 5 6 2 3
2 4 5 2 1


I would like to summarize data for each id and create a new variable
representing each code occuring in variables a, b and c.


Essentially the new data set should have variables id, v1 - v6.
v1 represents the number of times code 1 occurs in variables a, b and c

in the original data set, and so on for v2 - v6.


The data set would like;
1 1 3 2 2 3 1
2 2 2 2 2 1 1


Your help is much appreciated.


Thanks very much,
Zai Saki


--
zaisaki
------------------------------------------------------------------------
zaisaki's Profile:

http://www.excelforum.com/member.php...o&userid=13755
View this thread: http://www.excelforum.com/showthread...hreadid=480390




Harlan Grove

Data Transformation
 
"zaisaki" wrote...
I have a data set with variables id, a, b, c, d and looks like:

1 2 3 5 1
1 5 6 5 3
1 2 4 2 4
2 1 3 4 5
2 5 6 2 3
2 4 5 2 1

I would like to summarize data for each id and create a new variable
representing each code occuring in variables a, b and c.


Essentially the new data set should have variables id, v1 - v6.
v1 represents the number of times code 1 occurs in variables a, b and c

in the original data set, and so on for v2 - v6.

The data set would like;
1 1 3 2 2 3 1
2 2 2 2 2 1 1

....

Given your source data, neither your 1st or 2nd result rows are consistent
with counting only fields a-c. The 1st row is consistent with counting
fields a-d, the 2nd isn't consistent with any set of fields since the 2nd
through 7th columns total 10, which isn't a multiple of 3 (the number of
rows in your source data with ID=2). If you meant to count fields a-d, your
second row should be

2 2 2 2 2 3 1

You could do this entirely with formulas, or a pivot table if you add field
names. As for formulas, if your source data was in a range named TBL, and
the top-left result cell were G5, try

G5:
=INDEX(TBL,1,1)

G6 [array formula]:
=INDEX(TBL,MATCH(0,COUNTIF(G$5:G5,INDEX(TBL,0,1)), 0),1)

H5:M5 [array formula]:
=MMULT(COLUMN(INDIRECT("RC1:RC"&COUNTIF(INDEX(TBL, 0,1),$G5),0))^0,
COUNTIF(OFFSET(TBL,SMALL(IF(INDEX(TBL,0,1)=$G5,ROW (TBL)),
ROW(INDIRECT("1:"&COUNTIF(INDEX(TBL,0,1),$G5))))-ROW(INDEX(TBL,1,1)),1,1,
COLUMNS(TBL)-1),{1,2,3,4,5,6}))

Select H5:M5 and fill down into H6:M6. Then select G6:M6 and fill down as
needed.

An alternative VBA approach that ignores values in fields a-d that aren't
integers between 1 and 6 would be


Sub foo()
'requires reference to Microsoft Scripting Runtime

Dim dk As New Dictionary, dv As New Dictionary
Dim sv As Variant, rv As Variant
Dim i As Long, j As Long, k As Long, x As Variant

k = 0
'modify next line as needed
For Each x In Array(1, 2, 3, 4, 5, 6)
k = k + 1
dv.Add Key:=x, Item:=k
Next x

k = 16
ReDim rv(0 To dv.Count, 1 To k)

sv = Range("A1").CurrentRegion.Value

For i = 1 To UBound(sv, 1)
If Not dk.Exists(sv(i, 1)) Then
dk.Add Key:=sv(i, 1), Item:=dk.Count + 1
rv(0, dk.Count) = sv(i, 1)
End If

If dk.Count = k Then
k = 2 * k
ReDim Preserve rv(0 To dv.Count, 1 To k)
End If

For j = 2 To UBound(sv, 2)
If dv.Exists(sv(i, j)) Then _
rv(dv(sv(i, j)), dk(sv(i, 1))) = rv(dv(sv(i, j)), dk(sv(i, 1))) + 1
Next j
Next i

ReDim Preserve rv(0 To dv.Count, 1 To dk.Count)

ActiveCell.Resize(dk.Count, dv.Count + 1).Value = _
Application.WorksheetFunction.Transpose(rv)

End Sub




All times are GMT +1. The time now is 01:14 AM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
ExcelBanter.com