So recently I had the unenviable task of getting a load of files from a site. Not in the mood to do this by hand, I thought a simple scripted way would exist... and after a bit of faffing about, and someone giving me an idea, I ended up with a bloody simple solution!
cat htmlpage.html | grep -o 'http://[^"]*' > urlsinthisfile.txt
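I've added spurious file extensions so Windows users can follow along. The -o flag makes grep print only the part of each line that matches, so every http:// link (up to its closing double quote) lands on its own line in the output file.
It's elegant and it works!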
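If you want to take it a step further, here is a rough sketch of the same idea wrapped in a function. It is only my guess at a useful extension, not part of the original one-liner: the function name extract_urls is made up, it also matches https:// links, it stops at single quotes, angle brackets and whitespace as well as double quotes, and it drops duplicates.

```shell
#!/bin/sh
# extract_urls is a hypothetical helper name, not a standard command.
# Usage: extract_urls FILE
# Prints each distinct http:// or https:// URL found in FILE, one per line.
extract_urls() {
    # -o prints only the matched text; -E turns on extended regex so "https?" works.
    # The bracket expression ends the match at a quote, angle bracket, or whitespace.
    # LC_ALL=C keeps the sort order the same regardless of locale; -u removes duplicates.
    grep -oE "https?://[^\"'<>[:space:]]+" "$1" | LC_ALL=C sort -u
}
```

Something like `extract_urls htmlpage.html > urlsinthisfile.txt` would then give you the list, and, assuming you have wget installed, `wget -i urlsinthisfile.txt` should fetch everything in it, since -i tells wget to read its URLs from a file.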