HyperY2K
March 5th, 2008, 02:39 PM
I'm trying to parse the results of a directory search (http://www.kvno.de/buerger/arztsuche/index.html).
I can already get a list of all search results via a bash script. For example:
http://www.kvno.de/buerger/arztsuche/detail1.php?id=162894190
http://www.kvno.de/buerger/arztsuche/detail1.php?id=162893864
http://www.kvno.de/buerger/arztsuche/detail1.php?id=162888239
http://www.kvno.de/buerger/arztsuche/detail1.php?id=162863689
http://www.kvno.de/buerger/arztsuche/detail1.php?id=162835264
I've found a thread "Parsing HTML (http://ubuntuforums.org/showthread.php?t=649379)", but I need a fast solution. I would prefer a solution in bash (with sed and/or awk).
In the end, I need a result list that retrieves the following data from each detail page (see above):
name of doctor (Name des Arztes)
area of expertise (Tätigkeitsbereiche/Fachgebiete)
address (Straße - Praxis, PLZ - Praxis, Ort Praxis)
The perfect solution would be to get an Excel spreadsheet directly.
Any help is wanted :)
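
A minimal sketch of the kind of script described above, assuming each detail page holds its data as label/value table cells on a single line (for example `<td>Name des Arztes</td><td>...</td>`). That layout is a guess, not something checked against the real pages, so the sed pattern will probably need adjusting. The function names (extract_field, csv_quote, page_to_csv) are made up for illustration. The output is semicolon-separated CSV, which a German-locale Excel typically opens directly.

```shell
#!/usr/bin/env bash
# Sketch only: the label/value table layout assumed here is hypothetical.

# extract_field LABEL FILE
# Prints the text of the <td> that directly follows a <td> containing LABEL,
# assuming both cells are on the same line. Only the first match is used.
extract_field() {
    local label="$1" file="$2"
    sed -n "s|.*<td[^>]*>[^<]*${label}[^<]*</td>[[:space:]]*<td[^>]*>\([^<]*\)</td>.*|\1|p" "$file" | head -n 1
}

# csv_quote VALUE
# Wraps VALUE in double quotes and doubles any embedded double quotes.
csv_quote() {
    printf '"%s"' "$(printf '%s' "$1" | sed 's/"/""/g')"
}

# page_to_csv FILE
# Prints one semicolon-separated CSV line for a saved detail page.
page_to_csv() {
    local file="$1" label first=1
    for label in "Name des Arztes" "Tätigkeitsbereiche/Fachgebiete" \
                 "Straße - Praxis" "PLZ - Praxis" "Ort Praxis"; do
        [ "$first" -eq 1 ] || printf ';'
        first=0
        csv_quote "$(extract_field "$label" "$file")"
    done
    printf '\n'
}

# Main: if a file of detail-page URLs (one per line) is given as $1,
# download each page with wget and print its CSV line.
if [ "$#" -ge 1 ]; then
    tmp=$(mktemp)
    while IFS= read -r url; do
        [ -n "$url" ] || continue
        wget -q -O "$tmp" "$url" && page_to_csv "$tmp"
    done < "$1"
    rm -f "$tmp"
fi
```

If this were saved as arztsuche.sh (a hypothetical name) and the URL list as urls.txt, it could be run as `bash arztsuche.sh urls.txt > aerzte.csv`. Fields whose label is not found come out as empty quoted strings, which at least makes a wrong layout assumption easy to spot.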