Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.misc
Joe Joe is offline
external usenet poster
 
Posts: 476
Default Elimination of Duplicates in an Array and its reduction.

I have two large columns arrays A1:A4000 and B1:B4000
A: contains words alphabetically sorted and B: their corresponding page
numbers.
I need to reduce all duplicates of a word to one row with all their number
occurences on adjacent column cells
of the same row.
For example
A B C D E F
word A 21 word A 21 25
word A 25 to word B 10 15 17
word B 10 word C 3
word B 15
word B 17
word C 3
How do I do this?

  #2   Report Post  
Posted to microsoft.public.excel.misc
Joe Joe is offline
external usenet poster
 
Posts: 476
Default Elimination of Duplicates in an Array and its reduction.

The formating in my example in this post got mangled!
I'll try again.
A B
WordA 21
WordA 25
WordB 10
WordB 15
WordB 17
WordC 3

Down to

C D E F
WordA 21 25
WordB 10 15 17
WordC 3



"Joe" wrote:

I have two large columns arrays A1:A4000 and B1:B4000
A: contains words alphabetically sorted and B: their corresponding page
numbers.
I need to reduce all duplicates of a word to one row with all their number
occurences on adjacent column cells
of the same row.
For example
A B C D E F
word A 21 word A 21 25
word A 25 to word B 10 15 17
word B 10 word C 3
word B 15
word B 17
word C 3
How do I do this?

  #3   Report Post  
Posted to microsoft.public.excel.misc
external usenet poster
 
Posts: 15,768
Default Elimination of Duplicates in an Array and its reduction.

Here's one way.

Use the Advanced filter to extract the uniques from column A to another
location:

Select cell A1
Goto the menu DataFilterAdvanced filter
Select: Copy to another location
List range: $A$1:$A$4000
Copy to: $F$1
Unique records only
OK

Enter this array** formula in G1:

=IF(COLUMNS($A:A)<=COUNTIF($A$1:$A$4000,$F1),SMALL (IF($A$1:$A$4000=$F1,$B$1:$B$4000),COLUMNS($A:A)), "")

You'll have to copy across to a number of cells that is equal to the maximum
count of any single word from column A. In your posted example WordB
appeared the most times, 3. So you'll have to copy the formula across to a
total of 3 columns. Then copy down to the end of the list in column F.

** array formulas need to be entered using the key combination of
CTRL,SHIFT,ENTER (not just ENTER)

Biff

"Joe" wrote in message
...
I have two large columns arrays A1:A4000 and B1:B4000
A: contains words alphabetically sorted and B: their corresponding page
numbers.
I need to reduce all duplicates of a word to one row with all their
number
occurences on adjacent column cells
of the same row.
For example
A B C D E F
word A 21 word A 21 25
word A 25 to word B 10 15 17
word B 10 word C 3
word B 15
word B 17
word C 3
How do I do this?



  #4   Report Post  
Posted to microsoft.public.excel.misc
Joe Joe is offline
external usenet poster
 
Posts: 476
Default Elimination of Duplicates in an Array and its reduction.

The equation works well.
I noticed that when using DataFilterAdvanced filter after selecting A1
then "Copy to another location" and "Unique records only" ticked
It would produce a duplicate of the first two rows if they were actually
duplicates as in my example. The list range was $A$1:$A$6
Is this a bug in excel or is it something to do with the array not having a
labels row?

"T. Valko" wrote:

Here's one way.

Use the Advanced filter to extract the uniques from column A to another
location:

Select cell A1
Goto the menu DataFilterAdvanced filter
Select: Copy to another location
List range: $A$1:$A$4000
Copy to: $F$1
Unique records only
OK

Enter this array** formula in G1:

=IF(COLUMNS($A:A)<=COUNTIF($A$1:$A$4000,$F1),SMALL (IF($A$1:$A$4000=$F1,$B$1:$B$4000),COLUMNS($A:A)), "")

You'll have to copy across to a number of cells that is equal to the maximum
count of any single word from column A. In your posted example WordB
appeared the most times, 3. So you'll have to copy the formula across to a
total of 3 columns. Then copy down to the end of the list in column F.

** array formulas need to be entered using the key combination of
CTRL,SHIFT,ENTER (not just ENTER)

Biff

"Joe" wrote in message
...
I have two large columns arrays A1:A4000 and B1:B4000
A: contains words alphabetically sorted and B: their corresponding page
numbers.
I need to reduce all duplicates of a word to one row with all their
number
occurences on adjacent column cells
of the same row.
For example
A B C D E F
word A 21 word A 21 25
word A 25 to word B 10 15 17
word B 10 word C 3
word B 15
word B 17
word C 3
How do I do this?




  #5   Report Post  
Posted to microsoft.public.excel.misc
external usenet poster
 
Posts: 15,768
Default Elimination of Duplicates in an Array and its reduction.

Is this a bug in excel or is it something to do with the
array not having a labels row?


Not a bug. The filter "assumes" the first entry is a column label (header)
and brings that entry with it. So, if you had:

WordA
WordA

The filter assumes the first WordA is the column header and the second WordA
is a unique entry. So it extracts both.

Usually when you don't have a "distinctive" header Excel will "complain" and
ask you if your data has a header and how you want to handle it.

Biff

"Joe" wrote in message
...
The equation works well.
I noticed that when using DataFilterAdvanced filter after selecting A1
then "Copy to another location" and "Unique records only" ticked
It would produce a duplicate of the first two rows if they were actually
duplicates as in my example. The list range was $A$1:$A$6
Is this a bug in excel or is it something to do with the array not having
a
labels row?

"T. Valko" wrote:

Here's one way.

Use the Advanced filter to extract the uniques from column A to another
location:

Select cell A1
Goto the menu DataFilterAdvanced filter
Select: Copy to another location
List range: $A$1:$A$4000
Copy to: $F$1
Unique records only
OK

Enter this array** formula in G1:

=IF(COLUMNS($A:A)<=COUNTIF($A$1:$A$4000,$F1),SMALL (IF($A$1:$A$4000=$F1,$B$1:$B$4000),COLUMNS($A:A)), "")

You'll have to copy across to a number of cells that is equal to the
maximum
count of any single word from column A. In your posted example WordB
appeared the most times, 3. So you'll have to copy the formula across to
a
total of 3 columns. Then copy down to the end of the list in column F.

** array formulas need to be entered using the key combination of
CTRL,SHIFT,ENTER (not just ENTER)

Biff

"Joe" wrote in message
...
I have two large columns arrays A1:A4000 and B1:B4000
A: contains words alphabetically sorted and B: their corresponding
page
numbers.
I need to reduce all duplicates of a word to one row with all their
number
occurences on adjacent column cells
of the same row.
For example
A B C D E F
word A 21 word A 21 25
word A 25 to word B 10 15 17
word B 10 word C 3
word B 15
word B 17
word C 3
How do I do this?








  #6   Report Post  
Posted to microsoft.public.excel.misc
Joe Joe is offline
external usenet poster
 
Posts: 476
Default Elimination of Duplicates in an Array and its reduction.

The other thing a forgot to mention is that when pasting your formula into
the large array (20X4000) it takes literally many minutes to load and
calculate and can hang the program if it runs out of resources (memory?). I
have 2GB of physical memory. There are no other applications running at the
same time apart from the usual background ones.

"T. Valko" wrote:

Is this a bug in excel or is it something to do with the
array not having a labels row?


Not a bug. The filter "assumes" the first entry is a column label (header)
and brings that entry with it. So, if you had:

WordA
WordA

The filter assumes the first WordA is the column header and the second WordA
is a unique entry. So it extracts both.

Usually when you don't have a "distinctive" header Excel will "complain" and
ask you if your data has a header and how you want to handle it.

Biff

"Joe" wrote in message
...
The equation works well.
I noticed that when using DataFilterAdvanced filter after selecting A1
then "Copy to another location" and "Unique records only" ticked
It would produce a duplicate of the first two rows if they were actually
duplicates as in my example. The list range was $A$1:$A$6
Is this a bug in excel or is it something to do with the array not having
a
labels row?

"T. Valko" wrote:

Here's one way.

Use the Advanced filter to extract the uniques from column A to another
location:

Select cell A1
Goto the menu DataFilterAdvanced filter
Select: Copy to another location
List range: $A$1:$A$4000
Copy to: $F$1
Unique records only
OK

Enter this array** formula in G1:

=IF(COLUMNS($A:A)<=COUNTIF($A$1:$A$4000,$F1),SMALL (IF($A$1:$A$4000=$F1,$B$1:$B$4000),COLUMNS($A:A)), "")

You'll have to copy across to a number of cells that is equal to the
maximum
count of any single word from column A. In your posted example WordB
appeared the most times, 3. So you'll have to copy the formula across to
a
total of 3 columns. Then copy down to the end of the list in column F.

** array formulas need to be entered using the key combination of
CTRL,SHIFT,ENTER (not just ENTER)

Biff

"Joe" wrote in message
...
I have two large columns arrays A1:A4000 and B1:B4000
A: contains words alphabetically sorted and B: their corresponding
page
numbers.
I need to reduce all duplicates of a word to one row with all their
number
occurences on adjacent column cells
of the same row.
For example
A B C D E F
word A 21 word A 21 25
word A 25 to word B 10 15 17
word B 10 word C 3
word B 15
word B 17
word C 3
How do I do this?







  #7   Report Post  
Posted to microsoft.public.excel.misc
external usenet poster
 
Posts: 15,768
Default Elimination of Duplicates in an Array and its reduction.

That's a pretty large range of array formulas.

I thought you had 4000 rows that included dupe words so there shouldn't be
4000 unique word entries. Or, is that possible?

Anyhow, you can try a couple of different things which are *slightly* better
than what you have now. I didn't test this on a dataset the same size as
yours so I can't say you'll get *significantly* better performance but these
formulas will calculate *slightly* faster.

Assume the uniques have been extracted to column F starting in F1.

Use column G as a helper column to get the counts of the words. This will
eliminate the 1000's of COUNTIFs from the array formula:

Enter in G1 and copy down as needed:

=COUNTIF(A$1:A$4000,F1)

Change the array formula (now entered in H1) to:

=IF(COLUMNS($A:A)<=$G1,INDEX($B$1:$B$4000,SMALL(IF ($A$1:$A$4000=$F1,ROW(B$1:B$4000)),COLUMNS($A:A))) ,"")

Now, this is assuming that your actual range of data is really where you
said it was, A1:B4000. If the actual range does not start on row 1 then we
have to calculate for the offset and doing that actually makes things worse.

If this is a one time process, after you get the data extracted you can
convert the formulas to constants.

Select the range of formulas
Goto EditCopy
Then EditPaste SpecialValuesOK

If this is a repetitive process a pivot table may be a better solution. I
can't really help with a pt because I hate them and rarely ever use them.
Pivot tables aren't dynamic either.

Biff

"Joe" wrote in message
...
The other thing a forgot to mention is that when pasting your formula into
the large array (20X4000) it takes literally many minutes to load and
calculate and can hang the program if it runs out of resources (memory?).
I
have 2GB of physical memory. There are no other applications running at
the
same time apart from the usual background ones.

"T. Valko" wrote:

Is this a bug in excel or is it something to do with the
array not having a labels row?


Not a bug. The filter "assumes" the first entry is a column label
(header)
and brings that entry with it. So, if you had:

WordA
WordA

The filter assumes the first WordA is the column header and the second
WordA
is a unique entry. So it extracts both.

Usually when you don't have a "distinctive" header Excel will "complain"
and
ask you if your data has a header and how you want to handle it.

Biff

"Joe" wrote in message
...
The equation works well.
I noticed that when using DataFilterAdvanced filter after selecting
A1
then "Copy to another location" and "Unique records only" ticked
It would produce a duplicate of the first two rows if they were
actually
duplicates as in my example. The list range was $A$1:$A$6
Is this a bug in excel or is it something to do with the array not
having
a
labels row?

"T. Valko" wrote:

Here's one way.

Use the Advanced filter to extract the uniques from column A to
another
location:

Select cell A1
Goto the menu DataFilterAdvanced filter
Select: Copy to another location
List range: $A$1:$A$4000
Copy to: $F$1
Unique records only
OK

Enter this array** formula in G1:

=IF(COLUMNS($A:A)<=COUNTIF($A$1:$A$4000,$F1),SMALL (IF($A$1:$A$4000=$F1,$B$1:$B$4000),COLUMNS($A:A)), "")

You'll have to copy across to a number of cells that is equal to the
maximum
count of any single word from column A. In your posted example WordB
appeared the most times, 3. So you'll have to copy the formula across
to
a
total of 3 columns. Then copy down to the end of the list in column F.

** array formulas need to be entered using the key combination of
CTRL,SHIFT,ENTER (not just ENTER)

Biff

"Joe" wrote in message
...
I have two large columns arrays A1:A4000 and B1:B4000
A: contains words alphabetically sorted and B: their corresponding
page
numbers.
I need to reduce all duplicates of a word to one row with all their
number
occurences on adjacent column cells
of the same row.
For example
A B C D E F
word A 21 word A 21 25
word A 25 to word B 10 15 17
word B 10 word C 3
word B 15
word B 17
word C 3
How do I do this?









  #8   Report Post  
Posted to microsoft.public.excel.misc
external usenet poster
 
Posts: 947
Default Elimination of Duplicates in an Array and its reduction.

the large array (20X4000) it takes literally many minutes to load...

Hi. Just something different.
In Excel 2007, you can probably make a Pivot table with all unique words
down, and all page numbers across.
(Most likely exceeding 256 Columns)
All a macro would have to do is shift all the page numbers over to the left
in place of all the blanks.
Not tested, but I don't see it taking more than 2-3 seconds.

--
HTH :)
Dana DeLouis
Windows XP & Office 2007


"Joe" wrote in message
...
The other thing a forgot to mention is that when pasting your formula into
the large array (20X4000) it takes literally many minutes to load and
calculate and can hang the program if it runs out of resources (memory?).
I
have 2GB of physical memory. There are no other applications running at
the
same time apart from the usual background ones.

"T. Valko" wrote:

Is this a bug in excel or is it something to do with the
array not having a labels row?


Not a bug. The filter "assumes" the first entry is a column label
(header)
and brings that entry with it. So, if you had:

WordA
WordA

The filter assumes the first WordA is the column header and the second
WordA
is a unique entry. So it extracts both.

Usually when you don't have a "distinctive" header Excel will "complain"
and
ask you if your data has a header and how you want to handle it.

Biff

"Joe" wrote in message
...
The equation works well.
I noticed that when using DataFilterAdvanced filter after selecting
A1
then "Copy to another location" and "Unique records only" ticked
It would produce a duplicate of the first two rows if they were
actually
duplicates as in my example. The list range was $A$1:$A$6
Is this a bug in excel or is it something to do with the array not
having
a
labels row?

"T. Valko" wrote:

Here's one way.

Use the Advanced filter to extract the uniques from column A to
another
location:

Select cell A1
Goto the menu DataFilterAdvanced filter
Select: Copy to another location
List range: $A$1:$A$4000
Copy to: $F$1
Unique records only
OK

Enter this array** formula in G1:

=IF(COLUMNS($A:A)<=COUNTIF($A$1:$A$4000,$F1),SMALL (IF($A$1:$A$4000=$F1,$B$1:$B$4000),COLUMNS($A:A)), "")

You'll have to copy across to a number of cells that is equal to the
maximum
count of any single word from column A. In your posted example WordB
appeared the most times, 3. So you'll have to copy the formula across
to
a
total of 3 columns. Then copy down to the end of the list in column F.

** array formulas need to be entered using the key combination of
CTRL,SHIFT,ENTER (not just ENTER)

Biff

"Joe" wrote in message
...
I have two large columns arrays A1:A4000 and B1:B4000
A: contains words alphabetically sorted and B: their corresponding
page
numbers.
I need to reduce all duplicates of a word to one row with all their
number
occurences on adjacent column cells
of the same row.
For example
A B C D E F
word A 21 word A 21 25
word A 25 to word B 10 15 17
word B 10 word C 3
word B 15
word B 17
word C 3
How do I do this?









Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
Array Offset() formula with height of 1 returns duplicates? Uhl Excel Worksheet Functions 9 December 13th 06 01:22 AM
Gauss elimination Anne Excel Worksheet Functions 2 September 11th 06 12:58 PM
array formula with duplicates JR573PUTT Excel Discussion (Misc queries) 0 February 16th 06 09:18 PM
Mathematical elimination da_big_k Excel Worksheet Functions 2 January 11th 06 06:22 PM
Gaussian Elimination cjrrussell Excel Worksheet Functions 11 December 23rd 05 08:59 AM


All times are GMT +1. The time now is 05:52 AM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
Copyright ©2004-2025 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"