Saturday, April 07, 2007

Utility programs 2

Rather than retyping the code from the previous post, I'm just going back to the line with argv and proceeding from there with some actual working code. This time I want to focus on getting the filenames that will be processed. Remember that the test of argv was just to determine whether any arguments were passed. Since this program doesn't take any, anything else on the command line indicates a lack of understanding, so we printed some usage instructions. Now let's assume the user knows that typing "python opp2txt.py" will process every .opp file in the directory. But how do we build a list of file names? Today we'll just build the list and print the filenames to show that it worked. The magic we want is contained in the glob module. No, I don't know why it is named that, but isn't that just about the coolest name around? Here is the code that could be inserted into opp2txt.py to grab the filenames.

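import glob, os, sys    # needed at the top of opp2txt.py

if len(sys.argv) == 1:
    # no arguments given: collect every .opp file in the current directory
    Filelist = glob.glob('*.opp')
    if Filelist == []:
        print "There don't appear to be any .opp files in this directory"
        sys.exit(2)
    for File in Filelist:
        print os.path.splitext(File)
else:
    # something extra was typed on the command line: show the usage instructions
    usage()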


First let's look at glob. The module itself is fairly simple and is described in the Python module docs. The short story is that it returns a list of strings holding the filenames that match a pattern in a specified directory. Really, the best way to illustrate is to try it. In your favorite Python editor do the following.

import glob
glob.glob(r'\vr\hostdir\*.par')

This should return a list which contains the names of every .par file in that directory. Something like...
['\\vr\\hostdir\\drivefile.par', '\\vr\\hostdir\\editline.par', ...]
The lowercase r before the path name just tells Python to treat the string that follows as a raw string and not to see the backslashes as special characters. You could also use forward slashes without the 'r', since that is a valid path separator in Python, but the returned list comes back as an odd mix of forward and backward slashes.
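To see what the raw string actually protects against, here is a small sketch. It assumes nothing about the vr directory itself and only looks at the strings. It uses ntpath, which is the Windows flavor of os.path, so the backslash handling is the same no matter where you run it, and print is written with parentheses so it also runs on newer Python versions. The normpath() call at the end is one possible way to tidy up a mixed-slash name, not something the program above needs.

```python
import ntpath  # the Windows flavor of os.path, importable on any platform

# Without the r prefix, "\v" is read as a single character (a vertical tab).
plain = '\vr'
raw = r'\vr'
print(len(plain))  # 2
print(len(raw))    # 3

# A mixed-slash name of the kind described above, tidied with normpath().
mixed = '/vr/hostdir\\drivefile.par'
tidy = ntpath.normpath(mixed)
print(tidy)        # \vr\hostdir\drivefile.par
```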

We store the returned list in a variable called Filelist. If there are no .opp files in the directory, the list will be empty, in which case we print a message to that effect and exit. I just picked an arbitrary exit code of 2 to indicate that no files were found.
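If you want to try the list-building step without touching real data, it can be wrapped in a small function and pointed at a scratch directory. This is only a sketch: the name find_files and its arguments are made up for illustration, the results are sorted because glob does not promise any particular order, and print is written with parentheses so the snippet also runs on newer Python versions.

```python
import glob
import os
import tempfile

def find_files(directory, pattern='*.opp'):
    """Return the sorted names in directory that match pattern."""
    return sorted(glob.glob(os.path.join(directory, pattern)))

# Build a throwaway directory holding two .opp files and one other file.
scratch = tempfile.mkdtemp()
for fname in ('1.opp', '2.opp', 'notes.txt'):
    open(os.path.join(scratch, fname), 'w').close()

found = [os.path.basename(p) for p in find_files(scratch)]
print(found)        # ['1.opp', '2.opp']

# A directory with no matches gives an empty list, the case we exit on.
empty = find_files(tempfile.mkdtemp())
print(empty == [])  # True
```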

Next the for loop will step through the list and process each name string in the list. The variable which represents the name string will be "File".

Meaningful processing will come next time. For now we could just use the code "print File" to display each filename, but let's take this opportunity to explore the extremely useful os.path functions. These are most helpful for separating full filenames into paths, extensions, base filenames, etc. Our program will be run in the directory that holds the .opp files, so the names will not contain a path component and the list would look something like...
['1.opp', '2.opp']
and the splitext() will just return a name and an extension similar to...
('1', '.opp')
('2', '.opp')
which could be accessed individually using indexes
>>> print ('2', '.opp')[0]
2
>>> print ('2', '.opp')[1]
.opp
>>>
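Since splitext() hands back a two-item tuple, the pieces can also be unpacked straight into two names instead of using indexes. A minimal sketch, with print written with parentheses so it also runs on newer Python versions:

```python
import os.path

# splitext() returns a 2-tuple, so it can be unpacked into two names at once.
base, ext = os.path.splitext('2.opp')
print(base)  # 2
print(ext)   # .opp

# A name without a dot gives an empty extension.
no_ext = os.path.splitext('README')
print(no_ext)  # ('README', '')
```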
However, to really have some fun, let's go back to the previous example in the editor and try a few things with the name list returned from the vr hostdir directory. Here are some truncated examples of what would be returned. Take time to play with the os.path functions. There is some real usefulness there.

>>> import os
>>> for name in glob.glob(r'\vr\hostdir\*.par'):
...     print os.path.basename(name)
...
drivefile.par
editline.par

or if we run it through splitext() after it has been stripped to basename.

>>> for name in glob.glob(r'\vr\hostdir\*.par'):
...     print os.path.splitext(os.path.basename(name))
...
('drivefile', '.par')
('editline', '.par')

or how about just running splitext to get the filename so that you can create another file with the same name, but with a different extension.

>>> for name in glob.glob(r'\vr\hostdir\*.par'):
...     print os.path.splitext(name)
...
('\\vr\\hostdir\\drivefile', '.par')
('\\vr\\hostdir\\editline', '.par')
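Building the new name from that result is just string addition. In this sketch the '.txt' extension is only an example, and ntpath (the Windows flavor of os.path) is used so the backslash paths behave the same on any machine. On Windows, os.path gives the same answer.

```python
import ntpath  # Windows-style os.path, importable on any platform

name = '\\vr\\hostdir\\drivefile.par'

# Keep everything before the extension and attach a different one.
root, ext = ntpath.splitext(name)
new_name = root + '.txt'   # '.txt' is only an example extension
print(new_name)            # \vr\hostdir\drivefile.txt
```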

or what about if you want to get the path name so you can create a file in the same directory.

>>> for name in glob.glob(r'\vr\hostdir\*.par'):
...     print os.path.split(name)
...
('\\vr\\hostdir', 'drivefile.par')
('\\vr\\hostdir', 'editline.par')
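Once the directory part is in hand, join() puts it back together with whatever file name you like. A small sketch, where summary.txt is a made-up file name and ntpath (the Windows flavor of os.path) is used so the result is the same on any machine:

```python
import ntpath  # Windows-style os.path, importable on any platform

name = '\\vr\\hostdir\\editline.par'

# split() separates the directory from the file name...
folder, filename = ntpath.split(name)

# ...and join() glues the directory onto a new file name.
sibling = ntpath.join(folder, 'summary.txt')  # hypothetical file name
print(sibling)  # \vr\hostdir\summary.txt
```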

Or just test whether or not the file exists. Of course all of these do, because we're just passing in existing file names, but you get the idea.

>>> for name in glob.glob(r'\vr\hostdir\*.par'):
...     print os.path.exists(name)
...
True
True
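The test gets more interesting when the name might not be there yet, for example an output file you are about to create. A minimal sketch using a scratch directory, where the file name 1.txt is only an example:

```python
import os
import tempfile

scratch = tempfile.mkdtemp()
target = os.path.join(scratch, '1.txt')   # example output name

before = os.path.exists(target)   # nothing has been written yet
open(target, 'w').close()         # create an empty file
after = os.path.exists(target)    # now the file is there

print(before)  # False
print(after)   # True
```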

Cool stuff, and we haven't even done anything useful yet. Hopefully next time.



