PowerShell help

I'm trying to write an algorithm that reads two CSV files into tables, loops through each entry in one looking for an identical value in the second, and either updates some data in the table or writes it all to a new table. When the loop is complete, I'll write it all back out to another CSV file.


The reading and writing of the files I'm ok with, having learnt the syntax for a previous iteration of this project.

I think I'd be using a nested foreach loop, with the outer loop pulling the reference value and the inner loop scanning the second table for a match. If a match is found, update the data I'm looking for; if not, write a null value or something.

Is this even close?

Accepting help on getting correct algorithm or pseudocode, but will also accept code snippets if you're good with PowerShell.
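
For reference, the nested foreach idea described above could look roughly like this. It is only a sketch: the column names (Name, Dept) and the inline sample data are made up for illustration, and the real tables would come from Import-Csv. Note that -eq on strings ignores case in PowerShell.

```powershell
# Hypothetical sample data standing in for the two CSV files;
# in practice these would come from Import-Csv.
$primary = @'
Name,Dept
alice,
bob,
carol,
'@ | ConvertFrom-Csv

$lookup = @'
Name,Dept
bob,Finance
alice,IT
'@ | ConvertFrom-Csv

# Outer loop pulls the reference value, inner loop scans for a match.
$result = foreach ($row in $primary) {
    $dept = $null                        # default when no match is found
    foreach ($candidate in $lookup) {
        if ($candidate.Name -eq $row.Name) {
            $dept = $candidate.Dept
            break                        # stop scanning after the first match
        }
    }
    [pscustomobject]@{ Name = $row.Name; Dept = $dept }
}

# To write the new table back out (path is a placeholder):
# $result | Export-Csv -Path .\merged.csv -NoTypeInformation
```

This does one pass over the second table per row of the first, so the work grows with the product of the two row counts.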

 


If both files were pre-sorted by the value of the field in question, the number of iterations could be slashed drastically.
But of course the sort operation is itself a bunch of work that has to be done sometime.

And you don't necessarily want your data reordered in such a fashion - though you could add a field containing original record # then just sort by that in the final stage.
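
A rough sketch of the pre-sorted approach, including the "original record #" trick, might look like this. The field names (Id, Name, Dept) and the sample rows are hypothetical, and it assumes plain ASCII keys so that Sort-Object and [string]::Compare agree on ordering.

```powershell
# Hypothetical data; Id stands in for the "original record #" field.
$left = @(
    [pscustomobject]@{ Id = 1; Name = 'carol' }
    [pscustomobject]@{ Id = 2; Name = 'alice' }
    [pscustomobject]@{ Id = 3; Name = 'bob' }
)
$right = @(
    [pscustomobject]@{ Name = 'bob';   Dept = 'Finance' }
    [pscustomobject]@{ Name = 'alice'; Dept = 'IT' }
)

# Sort both tables on the key field.
$a = @($left  | Sort-Object Name)
$b = @($right | Sort-Object Name)

$j = 0
$merged = foreach ($row in $a) {
    # Skip entries in the second table whose key sorts before the current key.
    while ($j -lt $b.Count -and [string]::Compare($b[$j].Name, $row.Name, $true) -lt 0) {
        $j++
    }
    $dept = $null                        # default when no match is found
    if ($j -lt $b.Count -and [string]::Compare($b[$j].Name, $row.Name, $true) -eq 0) {
        $dept = $b[$j].Dept
    }
    [pscustomobject]@{ Id = $row.Id; Name = $row.Name; Dept = $dept }
}

# Final stage: restore the original record order.
$merged = @($merged | Sort-Object Id)
```

Because the index into the second table only ever moves forward, each table is walked once after sorting instead of rescanning the second table for every row.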


The datasets are alphabetically ordered - or can be, quite easily.

Unsure if I just spent too many hours playing with spreadsheets, but I guess I could look at the first letter of the value being parsed, and only match against values from the other dataset starting with the same character.

I remember the overheads involved in e.g. bubble sorting vs other types of sorting, but I'm not sure if the pattern matching in PowerShell is smart enough to check the first character, and only move on to the second if it's the same, and so on, rather than matching against every single entry in the array/table.


For string comparisons you may as well just straight up compare them. The comparison will quit at the first mismatched character, which should be more efficient than pre-parsing in that way.
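
As a small illustration of comparing directly, here are three ways to test two strings for equality in PowerShell. The sample values are made up; which variant fits depends on whether case should matter in the data.

```powershell
$a = 'Anderson'
$b = 'anderson'

$caseInsensitive = $a -eq $b       # -eq ignores case for strings
$caseSensitive   = $a -ceq $b      # -ceq is the case-sensitive variant
# Ordinal comparison: character-by-character, case-sensitive, culture-independent.
$ordinal = [string]::Equals($a, $b, [System.StringComparison]::Ordinal)
```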

For the sorting - for sure there must be some free, easy to call packages out there that you could just use rather than having to reinvent the wheel.


For what it’s worth, Sort-Object’s expression system with a regex will indeed stop on the first letter/token that will make it not match (modulo back references and other bullshit)
