Advice please-best way to approach this, searching for names in a sea of files
Mr Dump
post Mar 18 2012, 10:11 AM
Post #1
Atomican
Charge




I was wondering what would be the best way to automate a task I do regularly. Zzozzach's scripting question looks along similar lines, but ... well, I'm new to this.
Anyway, here's what I do:

get a text list of variable names

search for instances of that name in files in particular directories; the files of interest are *.c, *.h, *.doc and *.pdf (I just cut and paste each line, one at a time, into the Windows Explorer search bar; the actual search is surprisingly quick, I believe because it is an indexed search)

make note of the hits

repeat.

I sometimes have 60 or so variable names, and it is a very slow process which I'd love to automate.
The directories hold about 4000 .c and .h files, and there are about 200 procedure documents (.pdf, .doc).
It would be so good to have a text output that I can sort through, so as to match up which variables are set by which procedure and which C modules I need to grab to look in.

Is PowerShell scripting a suitable way to do this?
Is there a way to get a variable passed to the Explorer search?

The laptop runs Windows 7 and I have admin rights, btw.

Any pointers appreciated!
Girvo
post Mar 19 2012, 09:19 AM
Post #2
Hero
Titan




If we were talking Bash, I could help :P

As it stands, this might point you in the right direction: http://subjunctive.us/2008/04/01/powershell-file-search/


SledgY
post Mar 19 2012, 12:06 PM
Post #3
Atomican
Master




Good old Windows batch files can handle the text files:

Batch file (find_vars.bat):
CODE
@ECHO OFF
REM For each name in vars.txt, search the .c, .h and .py files in this directory and below.
FOR /F %%G IN (vars.txt) DO FINDSTR /S /N /C:"%%G" *.c *.h *.py


And of course vars.txt looks like:
CODE
variable_1
variable_2


DOC files probably won't work. PDF technically should (most PostScript is just text), but I can't guarantee it.

/edit I had *.py in there for my testing across a Python project.
Also, have you looked into search and replace tools?

This post has been edited by SledgY: Mar 19 2012, 12:07 PM


--------------------
poweredbypenguins.org - SledgY lives in the cloud...
Mr Dump
post Mar 19 2012, 09:08 PM
Post #4
Atomican
Charge




Thanks guys. I had just found the Select-String cmdlet for PowerShell after a day of reading the manual, and then the DOS findstr popped up, ha ha! I am more comfortable with batch files, so I'll try to follow that path first. Thanks SledgY, I'll muck around with this, get my head around the parameters and see how I go! (Will report in a day or three, depending on work madness!)
Stoked! Thanks again!
SledgY
post Mar 20 2012, 08:46 AM
Post #5
Atomican
Master




I can help with an explanation:
CODE
FOR /F %%G IN (vars.txt) DO FINDSTR /S /N /C:"%%G" *.c *.h *.py


FOR
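/F - loop over a text file and populate the variable %%G with each line (by default only the first space- or tab-delimited token of the line, which is the whole line for names without spaces)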

FINDSTR
/S - search through sub directories
/N - prints the line number prior to each match (added as it's kind of useful information!)
/C - tells findstr to do a literal search using the supplied value, in this case the value stored in the %%G variable
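
For anyone who would rather see the same loop spelled out in a script, here is a minimal Python sketch of what the batch one-liner does. It is not part of the batch approach above; the function names find_names and read_names are made up for illustration, and the default patterns (*.c, *.h) are an assumption based on the files mentioned in the first post. Unlike the batch loop, it reads each file once and checks every name against each line.

```python
import fnmatch
import os


def read_names(list_path):
    """Read one name per line, skipping blank lines."""
    with open(list_path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def find_names(root, names, patterns=("*.c", "*.h")):
    """Return (name, path, line_number, line_text) for every literal match."""
    hits = []
    # os.walk descends into subdirectories, similar to FINDSTR /S.
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not any(fnmatch.fnmatch(filename.lower(), p) for p in patterns):
                continue
            path = os.path.join(dirpath, filename)
            # errors="replace" keeps the scan going on files with odd encodings.
            with open(path, encoding="utf-8", errors="replace") as handle:
                # enumerate from 1 gives line numbers, similar to /N.
                for line_number, line in enumerate(handle, start=1):
                    for name in names:
                        # Plain substring test: a literal search, similar to /C:.
                        if name in line:
                            hits.append((name, path, line_number, line.rstrip()))
    return hits
```

Calling find_names(".", read_names("vars.txt")) from the top of the source tree would return one tuple per matching line. The match is case-sensitive, as FINDSTR is when /I is not given.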


Mr Dump
post Mar 25 2012, 06:51 AM
Post #6
Atomican
Charge




Hi there, the findstr works pretty well. However, I am having issues getting a redirect to a file working (e.g. >list.txt), so as a workaround I use the select all / copy text option of the cmd window. I have also not had success with parsing *.pdf yet, but that has been an issue in the Explorer search bar too, so I am wondering if it stems from the Windows indexing function...
Anyway, I'm getting output, so I'm stoked! Thanks.
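
If the goal is a sortable text file rather than console output, one possible route is to collect the hits in a script and write them out directly, which sidesteps the redirect altogether. The sketch below is only an illustration: it assumes the hits are already available as (name, file, line, text) tuples, and the function names files_per_name and write_report are hypothetical.

```python
import csv
from collections import defaultdict


def files_per_name(hits):
    """Map each variable name to the sorted list of files that mention it."""
    grouped = defaultdict(set)
    for name, path, _line_number, _text in hits:
        grouped[name].add(path)
    return {name: sorted(paths) for name, paths in sorted(grouped.items())}


def write_report(hits, report_path):
    """Write one tab-separated row per hit, sorted by name, file, then line."""
    with open(report_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["name", "file", "line", "text"])
        writer.writerows(sorted(hits))
```

The tab-separated file opens cleanly in a spreadsheet for sorting, and files_per_name gives a quick view of which modules to pull for each variable.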
SledgY
post Mar 26 2012, 06:33 PM
Post #7
Atomican
Master




There are tools that will let you search inside of a PDF.

One option is to use a tool like pdftohtml (from the poppler tool suite, there is a windows version) to extract text and from that do a search.

Alternatively, PyPDF is a Python library that can be used to manipulate and extract text from a PDF file.
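
Once the text is out of the PDFs, the search step itself is simple. The sketch below assumes each PDF has already been converted to a plain-text file with the same base name in one directory (the conversion itself is not shown, and that layout is an assumption, not something any particular tool guarantees). The function name search_extracted_text is hypothetical.

```python
import os


def search_extracted_text(text_dir, names):
    """Search *.txt files assumed to hold text already extracted from PDFs."""
    hits = []
    for filename in sorted(os.listdir(text_dir)):
        base, extension = os.path.splitext(filename)
        if extension.lower() != ".txt":
            continue
        text_path = os.path.join(text_dir, filename)
        with open(text_path, encoding="utf-8", errors="replace") as handle:
            for line_number, line in enumerate(handle, start=1):
                for name in names:
                    if name in line:
                        # Report the hit against the PDF the text came from.
                        hits.append((name, base + ".pdf", line_number))
    return hits
```

The line numbers refer to lines in the extracted text, not to pages in the PDF, so they are only a rough guide to where the name appears in the procedure.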

