Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 11
Default Comparing contents - is it possible

I receive two files at the end of each month.

One called "Tracking" has a list of all changes to all invoices (some
invoices have actually been invoiced and others are still in
progress), the other just has the invoices called "Invoices" (just the
invoices that have been finalized). These are sent automatically at
the end of the month from an SQL server.

For some reason the invoices file is now coming with duplicates in it.
I have asked the people who send this to me to remove the duplicates
but they say they can't change the SQL on their end. (I thought SQL
is supposed to be flexible enough to make changes to queries?????). I
thought I could just delete the duplicate invoice numbers, but found
out that the values are not equal in some cases so I have to go to the
"Tracking" spreadsheet to find the most recent change and keep that
copy of the Invoice.

The question:
Is it possible to compare the most recent change in the "Tracking"
spreadsheet to the "Invoices" spreadsheet, then delete the duplicate
records in the "Invoices" spreadsheet.

Currently, I delete all the extra changes in "Tracking" so that I am
only seeing the most recent of Invoice changes. Then I am going
Invoice Number by Invoice Number and comparing values to the
"Invoices". Once I have no duplicates in the "Invoices" file, it gets
uploaded into Access for further manipulation. The only reason I
noticed this is because in Access, some of the Invoice values were
doubled because the invoices were "summed" during the upload phase in
Access.

Any suggestions would be great.

Thanks, Lynn
  #2   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 27,285
Default Comparing contents - is it possible

I am sure you can do that, but you haven't given enough clear information to
give any specific advice. You can check out Chip Pearson's pages on dealing
with duplicates:

http://www.cpearson.com/excel/duplicat.htm


more important, however, it the fact that you are getting flawed data. If
they can't change the SQL, then you need to find out who can. How do you
know the information you are getting is even correct? You should not have
to search very far to find someone who will both be interested in the
problem and have the power to get it corrected. You are trying to mask the
symptoms, but someone needs to treat the disease.

--
Regards,
Tom Ogilvy

"Lynn A." wrote in message
om...
I receive two files at the end of each month.

One called "Tracking" has a list of all changes to all invoices (some
invoices have actually been invoiced and others are still in
progress), the other just has the invoices called "Invoices" (just the
invoices that have been finalized). These are sent automatically at
the end of the month from an SQL server.

For some reason the invoices file is now coming with duplicates in it.
I have asked the people who send this to me to remove the duplicates
but they say they can't change the SQL on their end. (I thought SQL
is supposed to be flexible enough to make changes to queries?????). I
thought I could just delete the duplicate invoice numbers, but found
out that the values are not equal in some cases so I have to go to the
"Tracking" spreadsheet to find the most recent change and keep that
copy of the Invoice.

The question:
Is it possible to compare the most recent change in the "Tracking"
spreadsheet to the "Invoices" spreadsheet, then delete the duplicate
records in the "Invoices" spreadsheet.

Currently, I delete all the extra changes in "Tracking" so that I am
only seeing the most recent of Invoice changes. Then I am going
Invoice Number by Invoice Number and comparing values to the
"Invoices". Once I have no duplicates in the "Invoices" file, it gets
uploaded into Access for further manipulation. The only reason I
noticed this is because in Access, some of the Invoice values were
doubled because the invoices were "summed" during the upload phase in
Access.

Any suggestions would be great.

Thanks, Lynn



  #3   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 593
Default Comparing contents - is it possible

"Tom Ogilvy" wrote in ...

more important, however, it the fact that you are getting flawed data. If
they can't change the SQL, then you need to find out who can.


The OP saying, 'they can't change the SQL on their end' may mean,
'they lack the knowledge required to change the SQL on their end'.
I've been looking at some code this week where sql was used to get the
whole (ordered) data set then VBA was used to loop through the rows to
copy the required data to an array. My approach would be to write sql
to fetch the data set as required in one hit but it would need a quite
complex query. I'm up for the challenge (when time allows <g) but
perhaps the OP's situation is the sql guy couldn't come with the goods
so said, 'Here *all* the data, filter it in Excel.'

The question:
Is it possible to compare the most recent change in the "Tracking"
spreadsheet to the "Invoices" spreadsheet, then delete the duplicate
records in the "Invoices" spreadsheet.


No doubt. Using sql on the Excel data, one can't DELETE rows but one
can create a new data set (e.g. a new worksheet) by making an INNER
JOIN between the two sheets using the common key and using either a
subquery or a GROUP BY (using MAX(<<date_column) as the criteria) to
get the most recent rows. As Tom said, you need to post more details.

Jamie.

--
  #4   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 11
Default Comparing contents - is it possible

Thanks for your quick responses. I hope this is enough detail to my
issue.

I get two files from the server e-mailed monthly.

I delete all but the most recent transaction record for each invoice
in the "Tracking" workbook. This leaves me with the final changes
that occured for each Invoice. This does not have any billing amounts
on it but it does have the last date of changes on it.

I sort it by invoice number.

Next I split my screen and look at each line in the "Invoices"
workbook. For any invoice that has duplicates, I look at the
"Tracking" workbook and I match the line that has the same invoice
status as the most recent change in the "Tracking" workbook. All the
billing information is in the "Invoices" workbook. Once clean, the
"Invoices" workbook is uploaded into Access to do budgeting analysis
in a third party product.

Unfortunately, all the columns aren't the same between the two
workbooks so I can't use the Compare workbooks that I found referenced
several times in the archives.

I have a little macro that identifies the duplicates in the "Invoices"
workbook so I'm only having to look through about 150 records of 5000.

Could I run a loop or something?

For each duplicated record in the "Invoices" workbook, goto the
"Tracking" workbook, lookup the invoice number, find the invoice
status value, match the invoice status value, delete the other
occurences of that invoice number.

Is this descriptive enough? The person who will be taking this
responsibility over, doesn't have much excel experience and I can see
them getting really frustrated with the process. I don't mind it, in
takes me about an hour or so to clean the data depending on how many
duplicates are there, it would be nice to click a button and have all
of the steps done and import the data immediately.

I have given up trying to get the source to fix the problem. Not very
friendly.

Anything you can suggest is greatly appreciated. I have about a week
to work this through before I turn it over to the other person.

Thanks, Lynn


(Jamie Collins) wrote in message . com...
"Tom Ogilvy" wrote in ...

more important, however, it the fact that you are getting flawed data. If
they can't change the SQL, then you need to find out who can.


The OP saying, 'they can't change the SQL on their end' may mean,
'they lack the knowledge required to change the SQL on their end'.
I've been looking at some code this week where sql was used to get the
whole (ordered) data set then VBA was used to loop through the rows to
copy the required data to an array. My approach would be to write sql
to fetch the data set as required in one hit but it would need a quite
complex query. I'm up for the challenge (when time allows <g) but
perhaps the OP's situation is the sql guy couldn't come with the goods
so said, 'Here *all* the data, filter it in Excel.'

The question:
Is it possible to compare the most recent change in the "Tracking"
spreadsheet to the "Invoices" spreadsheet, then delete the duplicate
records in the "Invoices" spreadsheet.


No doubt. Using sql on the Excel data, one can't DELETE rows but one
can create a new data set (e.g. a new worksheet) by making an INNER
JOIN between the two sheets using the common key and using either a
subquery or a GROUP BY (using MAX(<<date_column) as the criteria) to
get the most recent rows. As Tom said, you need to post more details.

Jamie.

--

  #5   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 593
Default Comparing contents - is it possible

(Lynn A.) wrote ...

Could I run a loop or something?


The set-based sql language was invented so we can avoid using
cursors/loops to perform simple queries on data. You mention using a
database...

I hope this is enough detail to my issue.


....but you have not posted the column names, sample data or expected
results, so we are still guessing.

I get two files from the server e-mailed monthly.
I delete all but the most recent transaction record for each invoice
in the "Tracking" workbook.
For any invoice that has duplicates, I look at the
"Tracking" workbook and I match the line that has the same invoice
status as the most recent change in the "Tracking" workbook.
I sort it by invoice number.


Presumably you have columns such as transaction_date and
invoice_number that allow you to do this. You could write a query to
SELECT just the most recent transactions, using the invoice_number and
invoice_status common to both tables (you suggest they may have
different names)

e.g. using GROUP BY syntax:

SELECT IV.invoice_number, MAX(TR.transaction_date)
FROM Invoices IV INNER JOIN Tracking TR
ON IV.invoice_number = TR.invoice_number
AND IV.invoice_status = TR.invoice_status
GROUP BY IV.invoice_number
ORDER BY IV.invoice_number
;

e.g. using a sub query:

SELECT IV.invoice_number
FROM Invoices IV INNER JOIN Tracking TR
ON IV.invoice_number = TR.invoice_number
AND IV.invoice_status = TR.invoice_status
WHERE TR.transaction_date =
(
SELECT MAX(TR2.transaction_date)
FROM Tracking TR2
WHERE TR2.invoice_number = TR.invoice_number
)
ORDER BY IV.invoice_number
;

This new data set could be SELECTed in to a new workbook or, better
still, directly into to the target database because, like Excel, it is
a Jet data source e.g.

SELECT
<<column list
INTO
[Database=\\MyServer\MyFolder\MyJetDB.mdb;].[MyTable]
FROM
[Excel 8.0;Database=C:\MyWorkbook;].[Invoices$] AS IV
<<JOIN clauses etc

I have given up trying to get the source to fix the problem. Not very
friendly.


And not doing their job properly, by the sound of it. The above
queries may give them some ideas. Post back with some more details if
you would like to implement some sql queries yourself.

Jamie.

--


  #6   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 11
Default Comparing contents - is it possible

Thank you Jamie, if changing the original SQL query was an option,
your suggestions would be great. Unfortunately, I have to work with
what is given to me. Two spreadsheets, both with duplicates (I know
they shouldn't be there but I cannot change this at the source).

Here is more detail.

Fields in the "Tracking" spreadsheet that are applicable (35 in total)

CREATEDATE - the date that the change occurred on the invoice
INVOICENUMBER
OLDCODE - each phase of the invoice has a code
NEWCODE - indicates the next phase of the invoice

Several changes are made throughout the month and are listed. When I
receive this file, I delete any duplicate invoice numbers based on the
most recent date so that this only contains one row per invoice
number.

Fields in the "Invoicing" spreadsheet that are applicable (27 fields
in total)

INVOICENUMBER
CODE - the New Code from the Tracking spreadsheet

I currently format the duplicated Invoice Numbers in the Invoicing
spreadsheet (with a little code), then I manually loop through each
invoice number that had duplicates. I manually delete the line(s)
that don't match the "newcode" from the Tracking spreadsheet is the
same as the "code" from the Invoicing spreadsheet. There are
sometimes up to 100 duplicates

I was hoping that I might be able to loop through any identified
duplicates in the Invoicing spreadsheet and compare the values in the
tracking spreadsheet and then if
"invoicing"!"code"<"tracking"!"newcode", then delete that row.

I really appreciate your time. In my mind, I think it can be done,
but I can't get my head around comparing the values in one spreadsheet
to the other then deleting the row in the invoices spreadsheet if the
invoice number from the tracking spreadsheet matches but the newcode
in the tracking spreadsheet doesn't match the "code" in the invoices
table.

If there are changes to an invoice throughout the month, there will be
an invoice number listed in the invoicing spreadsheet.


Thanks again for your help.

Lynn


(Jamie Collins) wrote in message om...
(Lynn A.) wrote ...

Could I run a loop or something?


The set-based sql language was invented so we can avoid using
cursors/loops to perform simple queries on data. You mention using a
database...

I hope this is enough detail to my issue.


...but you have not posted the column names, sample data or expected
results, so we are still guessing.

I get two files from the server e-mailed monthly.
I delete all but the most recent transaction record for each invoice
in the "Tracking" workbook.
For any invoice that has duplicates, I look at the
"Tracking" workbook and I match the line that has the same invoice
status as the most recent change in the "Tracking" workbook.
I sort it by invoice number.


Presumably you have columns such as transaction_date and
invoice_number that allow you to do this. You could write a query to
SELECT just the most recent transactions, using the invoice_number and
invoice_status common to both tables (you suggest they may have
different names)

e.g. using GROUP BY syntax:

SELECT IV.invoice_number, MAX(TR.transaction_date)
FROM Invoices IV INNER JOIN Tracking TR
ON IV.invoice_number = TR.invoice_number
AND IV.invoice_status = TR.invoice_status
GROUP BY IV.invoice_number
ORDER BY IV.invoice_number
;

e.g. using a sub query:

SELECT IV.invoice_number
FROM Invoices IV INNER JOIN Tracking TR
ON IV.invoice_number = TR.invoice_number
AND IV.invoice_status = TR.invoice_status
WHERE TR.transaction_date =
(
SELECT MAX(TR2.transaction_date)
FROM Tracking TR2
WHERE TR2.invoice_number = TR.invoice_number
)
ORDER BY IV.invoice_number
;

This new data set could be SELECTed in to a new workbook or, better
still, directly into to the target database because, like Excel, it is
a Jet data source e.g.

SELECT
<<column list
INTO
[Database=\\MyServer\MyFolder\MyJetDB.mdb;].[MyTable]
FROM
[Excel 8.0;Database=C:\MyWorkbook;].[Invoices$] AS IV
<<JOIN clauses etc

I have given up trying to get the source to fix the problem. Not very
friendly.


And not doing their job properly, by the sound of it. The above
queries may give them some ideas. Post back with some more details if
you would like to implement some sql queries yourself.

Jamie.

--

Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
Comparing cell contents via VB RajenRajput1 Excel Discussion (Misc queries) 5 June 29th 07 05:37 PM
Comparing Cell Contents in 2 or More Columns ConfusedNHouston Excel Discussion (Misc queries) 2 June 18th 07 03:08 PM
Comparing Field Contents Wayne Taylor Excel Discussion (Misc queries) 2 February 26th 07 10:40 AM
Comparing cell contents with different reference cells Martin B Excel Worksheet Functions 3 November 22nd 06 07:10 PM
Comparing Workbook contents SMC Excel Discussion (Misc queries) 1 January 5th 05 09:48 PM


All times are GMT +1. The time now is 06:27 AM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"