wget login pages

May 15th, 2008 by webstersprodigy

How do you scrape a page that you have to log in to reach?

Well, one way is to POST the login form with --post-data and save the cookies it returns, though whether this works depends on how the site tracks the session.

$ wget http://site/login/index.php --post-data "username=user&password=pass" --save-cookies=cookies.txt --keep-session-cookies

Then, to grab other pages, load the saved cookies:

$ wget --load-cookies=cookies.txt http://site/someotherpage/index.php
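
If you do this often, the two steps can be wrapped in a pair of shell functions. This is only a rough sketch: the function names login and fetch, the COOKIE_JAR variable, and the URLs and form fields in the usage comment are placeholders I made up, and it assumes the site uses a plain cookie-based session with no hidden form fields or tokens.

```shell
#!/bin/sh
# Sketch: log in once with wget, then reuse the saved cookies for later pages.
# COOKIE_JAR, login, and fetch are hypothetical names, not part of wget.

COOKIE_JAR="${COOKIE_JAR:-cookies.txt}"

# login LOGIN_URL POST_BODY
# POSTs the form body and writes the cookies (including session cookies,
# which wget skips unless --keep-session-cookies is given) to $COOKIE_JAR.
login() {
    wget --quiet --output-document=/dev/null \
         --save-cookies="$COOKIE_JAR" --keep-session-cookies \
         --post-data="$2" "$1"
}

# fetch URL OUTPUT_FILE
# Downloads URL to OUTPUT_FILE, sending the cookies saved by login.
fetch() {
    wget --quiet --load-cookies="$COOKIE_JAR" --output-document="$2" "$1"
}

# Example usage (placeholder URLs and credentials):
# login 'http://site/login/index.php' 'username=user&password=pass'
# fetch 'http://site/someotherpage/index.php' page.html
```

Keep in mind that the cookie file is as good as a password while the session lasts, so it is worth restricting its permissions or deleting it when you are done.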
