Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 4
Default Duplicate Difficulties

Hello All,
I am working a list with over 40,000 rows...
I need to see if there is a way that can delete duplicate rows...
Now to complicate things... Have a list that has acct numbers and times...
but i want delete duplicates that are in the same 5 mins. Make sense???
Please Help!!!
--
Still Learning...
  #2   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 143
Default Duplicate Difficulties

Couple of questions:

How are your timestamps formated? Are they a traditional stamp, like
4/28/2009 10:00 PM, or are they something custom

When you that you need to delete entries in the 'same five minutes', is this
a static window of time? For example if the time is between 10:00:00 and
10:04:59, then are these the same five minutes? 10:05:00 to 10:09:59 would
be the next, etc.

Let me know.

--
Regards,

PJ
Please rate this post using the vote buttons if it was helpful.



":: Amy ::" wrote:

Hello All,
I am working a list with over 40,000 rows...
I need to see if there is a way that can delete duplicate rows...
Now to complicate things... Have a list that has acct numbers and times...
but i want delete duplicates that are in the same 5 mins. Make sense???
Please Help!!!
--
Still Learning...

  #3   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 915
Default Duplicate Difficulties

:: Amy :: wrote:
Hello All,
I am working a list with over 40,000 rows...
I need to see if there is a way that can delete duplicate rows...
Now to complicate things... Have a list that has acct numbers and times...
but i want delete duplicates that are in the same 5 mins. Make sense???
Please Help!!!


Hello,

Consider this example in A1:C15

Acct Time Dups
J 17 0
E 1 1
K 3 0
B 18 0
K 19 1
J 7 0
E 6 2
G 4 0
F 9 0
C 6 0
E 11 2
K 16 1
E 16 1
B 12 0

Columns A and B are random data. The formula in C2 and filled down is:
=SUMPRODUCT((A2=$A$2:$A$15)*(ABS(B2-$B$2:$B$15)<=5))-1

A "Dups" value greater than 0 means the Acct has at least one other row
matching Acct with a Time within 5 units.

(I am using whole "Time" numbers to simplify the explanation. Since you
are probably working with a date/time-type field you will need to adjust
the comparison
<=5
to something like
<=(5/24/60)
in order to convert to Excel's date/time numbering.)

But before you sort on Dups and delete 0 I would draw your attention
to the special case of Acct "E".

Acct "E" has times of 1,6,11 and 16; each is within 5 units of some
other. However, the total Time span of this Acct is 15. Do all these
count as duplicates?
  #4   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 4
Default Duplicate Difficulties

ok... here is an example of the info that i am working w/
Unfortunately when I export this report to excel the time stamp is like a
text... :(
I can fix that however, we are looking at about 40,000 rows :(

Employee Account # Date Time
John Doe 83773011 3/20/2009 4:46PM
John Doe 83773011 3/20/2009 4:27PM
John Doe 97868507 3/20/2009 4:26PM
John Doe 97868507 3/20/2009 4:25PM
John Doe 2433904 3/20/2009 4:19PM
John Doe 2433904 3/20/2009 4:18PM
John Doe 2433904 3/20/2009 4:18PM
John Doe 92396306 3/20/2009 4:15PM
John Doe 103451304 3/20/2009 4:08PM
John Doe 54459201 3/20/2009 3:56PM
John Doe 54459201 3/20/2009 3:52PM
John Doe 6101201 3/20/2009 3:39PM
John Doe 108010401 3/20/2009 3:24PM
John Doe 81673524 3/20/2009 3:15PM
Thank you sooooo much for your time... i appreciate any help that I can
get :)
--
Still Learning...


"smartin" wrote:

:: Amy :: wrote:
Hello All,
I am working a list with over 40,000 rows...
I need to see if there is a way that can delete duplicate rows...
Now to complicate things... Have a list that has acct numbers and times...
but i want delete duplicates that are in the same 5 mins. Make sense???
Please Help!!!


Hello,

Consider this example in A1:C15

Acct Time Dups
J 17 0
E 1 1
K 3 0
B 18 0
K 19 1
J 7 0
E 6 2
G 4 0
F 9 0
C 6 0
E 11 2
K 16 1
E 16 1
B 12 0

Columns A and B are random data. The formula in C2 and filled down is:
=SUMPRODUCT((A2=$A$2:$A$15)*(ABS(B2-$B$2:$B$15)<=5))-1

A "Dups" value greater than 0 means the Acct has at least one other row
matching Acct with a Time within 5 units.

(I am using whole "Time" numbers to simplify the explanation. Since you
are probably working with a date/time-type field you will need to adjust
the comparison
<=5
to something like
<=(5/24/60)
in order to convert to Excel's date/time numbering.)

But before you sort on Dups and delete 0 I would draw your attention
to the special case of Acct "E".

Acct "E" has times of 1,6,11 and 16; each is within 5 units of some
other. However, the total Time span of this Acct is 15. Do all these
count as duplicates?

  #5   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 1,180
Default Duplicate Difficulties

Excel 2007
Text to Columns, MRound, Remove Duplicates
http://www.mediafire.com/file/igjn2mwewtm/04_29_09.xlsx


  #6   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 915
Default Duplicate Difficulties

No worries. Divide and conquer.

Your sample appears nicely in A1:D15.

I added in E2:
=VALUE(LEFT(D2,LEN(D2)-2))+0.5*(RIGHT(D2,2)="PM")
(Removes the "AM/PM" tail and converts to a number representing the time
of day. Format this column as Time to see for real.)

And in F2:
=C2+E2
(Adds the date to the time of day. Now we have a proper date/time value.)

Finally in G2:
=SUMPRODUCT((B2=$B$2:$B$15)*(ABS(F2-$F$2:$F$15)<=5/24/60))-1

Maybe one worry... the SUMPRODUCT will probably be very slow on 40k rows.


:: Amy :: wrote:
ok... here is an example of the info that i am working w/
Unfortunately when I export this report to excel the time stamp is like a
text... :(
I can fix that however, we are looking at about 40,000 rows :(

Employee Account # Date Time
John Doe 83773011 3/20/2009 4:46PM
John Doe 83773011 3/20/2009 4:27PM
John Doe 97868507 3/20/2009 4:26PM
John Doe 97868507 3/20/2009 4:25PM
John Doe 2433904 3/20/2009 4:19PM
John Doe 2433904 3/20/2009 4:18PM
John Doe 2433904 3/20/2009 4:18PM
John Doe 92396306 3/20/2009 4:15PM
John Doe 103451304 3/20/2009 4:08PM
John Doe 54459201 3/20/2009 3:56PM
John Doe 54459201 3/20/2009 3:52PM
John Doe 6101201 3/20/2009 3:39PM
John Doe 108010401 3/20/2009 3:24PM
John Doe 81673524 3/20/2009 3:15PM
Thank you sooooo much for your time... i appreciate any help that I can
get :)

  #7   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 140
Default Duplicate Difficulties

Are the Account#'s and the Time values in separate columns?

A B C D
John Doe 83773011 3/20/2009 4:46PM

If they are, I would do the following
1. Add a column on the end(right side) that contains the formula
=if(B1=B2,"Duplicate Account") note: lets say this is in cell E1 (are the
account numbers duplicated?)
2. Add another column on the end(right side) that contains the formula
=if(E1="Duplicate Account",D1-D2) note: lets say this is in cell F1 (if the
account numbers duplicated, what is the time difference?)
3. Add another column on the end(right side) that contains the formula
=if(F1<0:05,"Less Than 5 Minutes") note: lets say this is in cell G1 (if
the account numbers duplicated, is the time difference less than 5 minutes?)

Then use a macro to look through the G column and start erasing the bad
line(duplicate records under 5 minutes).

Dennis



":: Amy ::" <:: Amy wrote in message
...
ok... here is an example of the info that i am working w/
Unfortunately when I export this report to excel the time stamp is like a
text... :(
I can fix that however, we are looking at about 40,000 rows :(

Employee Account # Date Time
John Doe 83773011 3/20/2009 4:46PM
John Doe 83773011 3/20/2009 4:27PM
John Doe 97868507 3/20/2009 4:26PM
John Doe 97868507 3/20/2009 4:25PM
John Doe 2433904 3/20/2009 4:19PM
John Doe 2433904 3/20/2009 4:18PM
John Doe 2433904 3/20/2009 4:18PM
John Doe 92396306 3/20/2009 4:15PM
John Doe 103451304 3/20/2009 4:08PM
John Doe 54459201 3/20/2009 3:56PM
John Doe 54459201 3/20/2009 3:52PM
John Doe 6101201 3/20/2009 3:39PM
John Doe 108010401 3/20/2009 3:24PM
John Doe 81673524 3/20/2009 3:15PM
Thank you sooooo much for your time... i appreciate any help that I can
get :)
--
Still Learning...


"smartin" wrote:

:: Amy :: wrote:
Hello All,
I am working a list with over 40,000 rows...
I need to see if there is a way that can delete duplicate rows...
Now to complicate things... Have a list that has acct numbers and
times...
but i want delete duplicates that are in the same 5 mins. Make
sense???
Please Help!!!


Hello,

Consider this example in A1:C15

Acct Time Dups
J 17 0
E 1 1
K 3 0
B 18 0
K 19 1
J 7 0
E 6 2
G 4 0
F 9 0
C 6 0
E 11 2
K 16 1
E 16 1
B 12 0

Columns A and B are random data. The formula in C2 and filled down is:
=SUMPRODUCT((A2=$A$2:$A$15)*(ABS(B2-$B$2:$B$15)<=5))-1

A "Dups" value greater than 0 means the Acct has at least one other row
matching Acct with a Time within 5 units.

(I am using whole "Time" numbers to simplify the explanation. Since you
are probably working with a date/time-type field you will need to adjust
the comparison
<=5
to something like
<=(5/24/60)
in order to convert to Excel's date/time numbering.)

But before you sort on Dups and delete 0 I would draw your attention
to the special case of Acct "E".

Acct "E" has times of 1,6,11 and 16; each is within 5 units of some
other. However, the total Time span of this Acct is 15. Do all these
count as duplicates?

  #8   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 140
Default Duplicate Difficulties

Are the duplicates next to each other(row to row)?

Dennis


":: Amy ::" <:: Amy wrote in message
...
Hello All,
I am working a list with over 40,000 rows...
I need to see if there is a way that can delete duplicate rows...
Now to complicate things... Have a list that has acct numbers and times...
but i want delete duplicates that are in the same 5 mins. Make sense???
Please Help!!!
--
Still Learning...


Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
if function difficulties [email protected] Excel Worksheet Functions 1 December 28th 07 07:15 PM
Keystroke Difficulties WOLLAM Excel Discussion (Misc queries) 3 July 22nd 06 07:10 AM
Difficulties with HYPERLINK mevetts Excel Discussion (Misc queries) 2 April 12th 06 08:13 PM
difficulties ernie Excel Discussion (Misc queries) 1 February 13th 06 08:39 PM
totaling difficulties alisonmacd Excel Worksheet Functions 1 June 22nd 05 10:24 AM


All times are GMT +1. The time now is 04:51 PM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
Copyright ©2004-2025 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"