ExcelBanter - find duplicates between rows, keep or delete entries based on ranked relevance

ExcelBanter

ExcelBanter (https://www.excelbanter.com/)

- Excel Programming (https://www.excelbanter.com/excel-programming/)

- - find duplicates between rows, keep or delete entries based on ranked relevance (https://www.excelbanter.com/excel-programming/365551-find-duplicates-between-rows-keep-delete-entries-based-ranked-relevance.html)

[email protected]

find duplicates between rows, keep or delete entries based on ranked relevance

Hello all,

I am a biologist, I use Excel often but am new to VBA. Essentially, I
am looking to create a macro which gleans certain data from one
worksheet and refines and deposits it into an adjacent worksheet. If
the answer to my problem is already out there, sorry, I searched and
did not find it--probably I don't know enough on the topic to search
efficiently.

I have data classifying behavior of marked organisms. For a given date,
we may have recorded behavioral data on a unique animal more than once.
On some days I have 6 rows of data for a particular animal (identified
with an alphanumeric code, such as T332), and on some days I have only
one observation of that individual. Some behaviors are more
"important" than others, so if animal #T332 was recorded 6 times on
a given day I only need to keep the one row with the behavior deemed
most useful to our research question. Ultimately, I want one row of
data per animal per day. There are many hundreds of uniquely identified
organisms.

I'm trying to create a macro that will:

1. Search the worksheet, finding occurrences (rows) where the same
animal was observed more than once on the same date. E.g. if the fields
under columns: 'Name', 'Month', 'Day' and 'Year' are =
between 2 or more rows, I want to go to step 2.

2. Consider the group of rows which met the above criteria, the row
having the most "important" behavior (column 'rbi') observed on
that day is retained. Other rows are discarded. rbi order of importance
(high priority to low) is: S, NP, WP, WF, OM, A, U.

3. The "winning" row is sent to an adjacent results worksheet.

When there aren't multiple rows per animal per date, the program
sends the lone entry to the results worksheet, and the program goes on
to search the entire worksheet.

I'm turning to this forum because I haven't found much discussion
or information (that I understand at least) online or in the books
which I could use to do this. Any ideas would be great, if possible,
please include as many translations from code language to English as
you can. Even just help on coding out the first step would be
incredibly helpful-i.e. how to find and select rows that have the
same values under certain columns.

Any Ideas? Thanks for taking the time to think about this!

Mark Driscol[_2_]

find duplicates between rows, keep or delete entries based on ranked relevance

Perhaps you could use an AutoFilter? Filter on a particular animal,
look for which behavior among the filtered rows is most important, and
copy that to another worksheet.

That solution doesn't use VBA or automate the process, but without
knowing more about your data and how it is organized in your worksheet
it might be difficult to craft a suitable reply.

Mark

wrote:
Hello all,

I am a biologist, I use Excel often but am new to VBA. Essentially, I
am looking to create a macro which gleans certain data from one
worksheet and refines and deposits it into an adjacent worksheet. If
the answer to my problem is already out there, sorry, I searched and
did not find it--probably I don't know enough on the topic to search
efficiently.

I have data classifying behavior of marked organisms. For a given date,
we may have recorded behavioral data on a unique animal more than once.
On some days I have 6 rows of data for a particular animal (identified
with an alphanumeric code, such as T332), and on some days I have only
one observation of that individual. Some behaviors are more
"important" than others, so if animal #T332 was recorded 6 times on
a given day I only need to keep the one row with the behavior deemed
most useful to our research question. Ultimately, I want one row of
data per animal per day. There are many hundreds of uniquely identified
organisms.

I'm trying to create a macro that will:

1. Search the worksheet, finding occurrences (rows) where the same
animal was observed more than once on the same date. E.g. if the fields
under columns: 'Name', 'Month', 'Day' and 'Year' are =
between 2 or more rows, I want to go to step 2.

2. Consider the group of rows which met the above criteria, the row
having the most "important" behavior (column 'rbi') observed on
that day is retained. Other rows are discarded. rbi order of importance
(high priority to low) is: S, NP, WP, WF, OM, A, U.

3. The "winning" row is sent to an adjacent results worksheet.

When there aren't multiple rows per animal per date, the program
sends the lone entry to the results worksheet, and the program goes on
to search the entire worksheet.

I'm turning to this forum because I haven't found much discussion
or information (that I understand at least) online or in the books
which I could use to do this. Any ideas would be great, if possible,
please include as many translations from code language to English as
you can. Even just help on coding out the first step would be
incredibly helpful-i.e. how to find and select rows that have the
same values under certain columns.

Any Ideas? Thanks for taking the time to think about this!

[email protected]

find duplicates between rows, keep or delete entries based on ranked relevance

Hi Mark,

Thanks for the note... it's a good idea (and essentially what we have
been doing), but I need to automate the process because there are many
hundreds of unique animals named, and many thousands of rows of
observations in the database. Do you know, is there a way to attach an
example of my file to the group? otherwise I could send an example
worksheet to anyone who is interested.

Thanks again,

Elias

Mark Driscol wrote:
Perhaps you could use an AutoFilter? Filter on a particular animal,
look for which behavior among the filtered rows is most important, and
copy that to another worksheet.

That solution doesn't use VBA or automate the process, but without
knowing more about your data and how it is organized in your worksheet
it might be difficult to craft a suitable reply.

Mark

wrote:
Hello all,

I am a biologist, I use Excel often but am new to VBA. Essentially, I
am looking to create a macro which gleans certain data from one
worksheet and refines and deposits it into an adjacent worksheet. If
the answer to my problem is already out there, sorry, I searched and
did not find it--probably I don't know enough on the topic to search
efficiently.

I have data classifying behavior of marked organisms. For a given date,
we may have recorded behavioral data on a unique animal more than once.
On some days I have 6 rows of data for a particular animal (identified
with an alphanumeric code, such as T332), and on some days I have only
one observation of that individual. Some behaviors are more
"important" than others, so if animal #T332 was recorded 6 times on
a given day I only need to keep the one row with the behavior deemed
most useful to our research question. Ultimately, I want one row of
data per animal per day. There are many hundreds of uniquely identified
organisms.

I'm trying to create a macro that will:

1. Search the worksheet, finding occurrences (rows) where the same
animal was observed more than once on the same date. E.g. if the fields
under columns: 'Name', 'Month', 'Day' and 'Year' are =
between 2 or more rows, I want to go to step 2.

2. Consider the group of rows which met the above criteria, the row
having the most "important" behavior (column 'rbi') observed on
that day is retained. Other rows are discarded. rbi order of importance
(high priority to low) is: S, NP, WP, WF, OM, A, U.

3. The "winning" row is sent to an adjacent results worksheet.

When there aren't multiple rows per animal per date, the program
sends the lone entry to the results worksheet, and the program goes on
to search the entire worksheet.

I'm turning to this forum because I haven't found much discussion
or information (that I understand at least) online or in the books
which I could use to do this. Any ideas would be great, if possible,
please include as many translations from code language to English as
you can. Even just help on coding out the first step would be
incredibly helpful-i.e. how to find and select rows that have the
same values under certain columns.

Any Ideas? Thanks for taking the time to think about this!

All times are GMT +1. The time now is 09:19 AM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
ExcelBanter.com