JDT

 

John Dixon Technology Limited

 

Document Conversion and File Processing Projects


Overview

The following samples are typical of the scripts we write and use every week. Some take only a few minutes to write yet save an hour or two of manual work; others take a day or two to write but save several weeks of manual effort.

The idea behind all the scripts is the same: take an HTML or text file as input, change it in some way (for example, automatically rewrite some of the HTML code), and then produce an output file that contains the changes.
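
As a rough illustration of that read, change, write pattern, here is a minimal sketch. It is written in Python rather than Perl and is not one of our scripts; the function names and the tag-lowercasing rule are assumptions chosen only to make the example concrete.

```python
from pathlib import Path


def transform(text: str) -> str:
    """Hypothetical change: upper-case <B>/<I> tags become lower-case ones."""
    for tag in ("B", "I"):
        text = text.replace(f"<{tag}>", f"<{tag.lower()}>")
        text = text.replace(f"</{tag}>", f"</{tag.lower()}>")
    return text


def convert(src: Path, dst: Path) -> None:
    """Read src, apply transform(), and write the result to dst."""
    dst.write_text(transform(src.read_text(encoding="utf-8")), encoding="utf-8")
```

Calling convert(Path("in.html"), Path("out.html")) would leave the input file untouched and put the changed text in the output file. Both filenames are placeholders.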


ASCII file conversion

This simple Perl script reformats the contents of a text file. A file can be worked on in a simple format (click here to see an example) and then reformatted when it is ready for publication (click here to see an example), so that it can be integrated into a web site designed to display such files.

Click here to view the script. To keep things simple, we have hard-coded the filenames directly into the script. In practice we would not do this, because it would mean editing the script for every file you wanted to process. The script makes a copy of before.txt prior to reformatting it, in case something goes wrong during the conversion.

Click here to download a version of the script that converts all the .txt files in the current folder. This is a more realistic version, as no filenames are hard-coded into it.
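
The two ideas above (take a safety copy first, and process every .txt file in a folder rather than a named file) can be sketched as follows. This is a Python stand-in, not the Perl script itself; the .bak suffix, the function names, and the reformatting rule (stripping trailing whitespace) are all assumptions made for illustration.

```python
import shutil
from pathlib import Path


def reformat(text: str) -> str:
    """Hypothetical reformatting rule: strip trailing whitespace from each line."""
    return "\n".join(line.rstrip() for line in text.splitlines()) + "\n"


def convert_folder(folder: Path) -> list[Path]:
    """Back up and reformat every .txt file in folder; return the files changed."""
    done = []
    # sorted() lists the files up front, so the backups created below are not revisited
    for path in sorted(folder.glob("*.txt")):
        backup = path.with_name(path.name + ".bak")  # e.g. before.txt.bak
        shutil.copy2(path, backup)  # safety copy before touching the file
        path.write_text(reformat(path.read_text(encoding="utf-8")), encoding="utf-8")
        done.append(path)
    return done
```

Running convert_folder(Path.cwd()) would process the current folder. Because the backups end in .bak, a second run does not pick them up as input.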


Cleaning up HTML files

Click here to download a Perl script for cleaning up HTML files.

This Perl script, which is more complicated than the one described above, cleans up HTML files that have been generated by Adobe FrameMaker/Quadralay WebWorks Publisher.
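
The kind of clean-up involved can be sketched as a list of search-and-replace rules applied in order. The sketch below is in Python rather than Perl, and every rule in it is hypothetical: this page does not list what the real script removes, so the rules shown are only examples of typical generated-HTML clutter.

```python
import re

# Hypothetical clean-up rules; the real script's rules are not listed on this page.
RULES = [
    (re.compile(r"</?font\b[^>]*>", re.IGNORECASE), ""),  # drop <font> wrappers
    (re.compile(r"<p\b[^>]*>(?:\s|&nbsp;)*</p>", re.IGNORECASE), ""),  # drop empty paragraphs
    (re.compile(r"[ \t]+\n"), "\n"),  # trim trailing whitespace
    (re.compile(r"\n{3,}"), "\n\n"),  # collapse runs of blank lines
]


def clean_html(markup: str) -> str:
    """Apply each rule in turn and return the cleaned markup."""
    for pattern, replacement in RULES:
        markup = pattern.sub(replacement, markup)
    return markup
```

Keeping the rules in one table makes it easy to add or remove a rule. Regular expressions are adequate for predictable, machine-generated markup like this, but they are not a general HTML parser.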


Creating a "last updated" report for files on a web site

Click here to download a Perl script for generating a "last updated" report for a web site.

This Perl script scans through the HTML files in a web site looking for the date each file was created and last updated. This information is then presented in a report, which itself is an HTML file.
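
A minimal sketch of such a report, in Python rather than Perl, is shown below. It assumes the dates come from the file system rather than from the page contents, which may not be how the real script works. Note also that a true creation time is not available on every platform, so the "created" column falls back to st_ctime where st_birthtime is missing. The function names are our own.

```python
import html
import time
from pathlib import Path


def fmt(timestamp: float) -> str:
    """Format a POSIX timestamp as local date and time."""
    return time.strftime("%Y-%m-%d %H:%M", time.localtime(timestamp))


def build_report(site_root: Path) -> str:
    """Return an HTML table listing every .html file under site_root."""
    rows = []
    for path in sorted(site_root.rglob("*.html")):
        info = path.stat()
        # Assumption: st_birthtime where the platform has it, else st_ctime
        created = getattr(info, "st_birthtime", info.st_ctime)
        name = html.escape(path.relative_to(site_root).as_posix())
        rows.append(
            f"<tr><td>{name}</td><td>{fmt(created)}</td>"
            f"<td>{fmt(info.st_mtime)}</td></tr>"
        )
    header = "<tr><th>File</th><th>Created</th><th>Last updated</th></tr>"
    return "<html><body><table>\n" + "\n".join([header] + rows) + "\n</table></body></html>\n"
```

If the report is saved inside the site folder with an .html extension, it will list itself on the next run, so it is simplest to write it elsewhere.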


Finding hidden characters in text files

Click here to download a Perl script for finding hidden characters in text files.

This Perl script scans all the text (.txt) files in the current directory, finding and displaying hidden characters such as tabs (\t), line endings (\n), and so on.
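
The scan can be sketched as below, in Python rather than Perl. Exactly which characters count as "hidden" is an assumption here: the sketch reports ASCII control characters (tab, line feed, carriage return, and other codes below 32, plus 127) with their line and column. The function names are our own.

```python
from pathlib import Path

# Readable labels for the most common hidden characters
NAMES = {"\t": "\\t", "\n": "\\n", "\r": "\\r"}


def find_hidden(text: str) -> list[tuple[int, int, str]]:
    """Return (line, column, label) for each ASCII control character in text."""
    hits = []
    line, col = 1, 1
    for ch in text:
        if ord(ch) < 32 or ord(ch) == 127:
            hits.append((line, col, NAMES.get(ch, f"\\x{ord(ch):02x}")))
        if ch == "\n":
            line, col = line + 1, 1
        else:
            col += 1
    return hits


def scan_folder(folder: Path) -> dict[str, list[tuple[int, int, str]]]:
    """Scan every .txt file in folder and map each filename to its hits."""
    result = {}
    for path in sorted(folder.glob("*.txt")):
        # newline="" keeps \r\n pairs intact instead of translating them to \n
        with path.open(encoding="utf-8", errors="replace", newline="") as fh:
            result[path.name] = find_hidden(fh.read())
    return result
```

Opening the files with newline="" matters: without it, Python would silently convert Windows line endings, and the carriage returns would never be reported.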


© 2007-2008 - John Dixon Technology Ltd
