Tag: webcrawler

Sketch of a recursive link-following Python webcrawler

As an evolution of the previous Python snippet, this script searches for a string inside a URL's content and recursively follows the links it finds.

import urllib2
import sys
import re

def getlinksfromurl(url):
    linkset = set()
    try:
        usock = urllib2.urlopen(url)
        data = usock.read()
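The excerpt is cut off and targets Python 2's urllib2, so here is a minimal Python 3 sketch of the same idea rather than the original script. The names fetch, crawl, needle, max_depth and fetcher are assumptions made for illustration, and the regular expression is a deliberately naive way to find href attributes.

```python
import re
from urllib.parse import urljoin
from urllib.request import urlopen

# Naive href matcher (an assumption of this sketch); an HTML parser is more robust.
HREF_RE = re.compile(r'href=["\'](.*?)["\']', re.IGNORECASE)


def fetch(url):
    """Return the page body as text, or None if the request fails."""
    try:
        with urlopen(url, timeout=10) as response:
            return response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        # URLError/HTTPError are OSError subclasses; ValueError covers malformed URLs.
        return None


def crawl(url, needle, max_depth=2, fetcher=fetch, visited=None, matches=None):
    """Follow links recursively and collect the URLs whose content contains needle."""
    if visited is None:
        visited = set()
    if matches is None:
        matches = set()
    # Stop when the depth budget is spent or the page was already seen.
    if max_depth < 0 or url in visited:
        return matches
    visited.add(url)
    html = fetcher(url)
    if html is None:
        return matches
    if needle in html:
        matches.add(url)
    for href in HREF_RE.findall(html):
        link = urljoin(url, href)  # resolve relative links against the current page
        if link.startswith(("http://", "https://")):
            crawl(link, needle, max_depth - 1, fetcher, visited, matches)
    return matches
```

Calling crawl("http://example.com/", "python") would return the set of visited URLs whose content contains the string. The visited set keeps the recursion from looping on pages that link back to each other, and max_depth bounds how far it wanders.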

Posted in Internet, Programming, Python

First stone of a Python webcrawler

I’m learning Python, so I’m trying to create a Python webcrawler. The first method returns a collection of unique URL links found at a given base URL. I’m amazed by Python’s simplicity; this is the first stone of
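The excerpt stops before showing the method, so the following is only a rough Python 3 sketch of "unique links from a base URL", not the post's own code. The names extract_links and get_links_from_url are hypothetical, and the href regular expression is a simplification that will miss unusual markup.

```python
import re
from urllib.parse import urljoin
from urllib.request import urlopen

# Naive href matcher (an assumption of this sketch); an HTML parser is more robust.
HREF_RE = re.compile(r'href=["\'](.*?)["\']', re.IGNORECASE)


def extract_links(html, base_url):
    """Return the set of unique absolute http(s) links found in html."""
    links = set()  # a set drops duplicate links automatically
    for href in HREF_RE.findall(html):
        absolute = urljoin(base_url, href)  # resolve relative links
        if absolute.startswith(("http://", "https://")):
            links.add(absolute)
    return links


def get_links_from_url(url):
    """Fetch url and return its unique links, or an empty set on failure."""
    try:
        with urlopen(url, timeout=10) as response:
            html = response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        # Network/HTTP errors are OSError subclasses; ValueError covers malformed URLs.
        return set()
    return extract_links(html, url)
```

Keeping the parsing in extract_links, separate from the download, makes it possible to try the link extraction on a plain string without touching the network.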

Posted in In progress, Internet, Programming, Python