bold italics
Character | Example | Definition |
---|---|---|
`*` | `a*b*` | Matches the previous character 0 or more times |
`+` | `a+b+` | Matches the previous character 1 or more times |
`[ ]` | `[a-z]` | Matches any character from a to z |
`[^ ]` | `[^a-z]` | Matches any character that is not in the range a to z |
`()` | `(ab)` | A grouped subexpression; groups are evaluated first |
`\|` | `(foo\|foot)s` | Or: matches one expression or the other |
`{m,n}` | `a{2,3}` | Matches the preceding character m to n times |
`.` | `b.d` | Matches any single character (by default, anything except a newline) |
`^` | `^a` | Anchors the expression at the beginning of the string |
`\` | `\^` | The escape character; here it matches a literal ^ |
`$` | `[A-Z]*$` | Usually placed at the end of the expression; matches the end of the string |
`?!` | `^((?![A-Z]).)*$` | Matches a string that does not contain something, here an uppercase letter: the negative lookahead is checked before every character |
`?` | `(swimming )?pool` | Makes the previous expression optional |
`??` | `(swimming )??pool` | Lazy version of `?`: the previous expression is optional and is skipped when possible |
`(?=)` | `A(?=B)` | Positive lookahead: matches an A that is followed by a B, as in AB or ABC |
`(?!)` | `A(?!B)` | Negative lookahead: matches an A that is not followed by a B |
`(?<=)` | `(?<=B)A` | Positive lookbehind: matches an A that is preceded by a B |
`(?<!)` | `(?<!B)A` | Negative lookbehind: matches an A that is not preceded by a B |
`(?>)` | `(?>foo\|foot)s` | Atomic group: once the group has matched, the remaining alternatives are thrown away and the engine cannot backtrack into it, so `foots` does not match here because `foo` is tried first |
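
A few rows of the table can be checked with Python's built-in `re` module. This is only a sketch: the test strings and variable names are made up for illustration, and other regex engines may behave differently (for example, atomic groups are only accepted by `re` from Python 3.11 on, so that check is guarded).

```python
import re
import sys

# Quantifiers: * accepts zero repetitions, + needs at least one.
star_matches_empty = re.fullmatch(r"a*b*", "") is not None   # True
plus_matches_b = re.fullmatch(r"a+b+", "b") is not None      # False, no a

# Character class and its negation.
lowercase = re.findall(r"[a-z]", "aB3c")
not_lowercase = re.findall(r"[^a-z]", "aB3c")

# Alternation in a normal group: foo is tried first, then the engine
# backtracks to foot so that the final s can match.
foots_matches = re.fullmatch(r"(foo|foot)s", "foots") is not None

# Optional group: greedy ? takes the group, lazy ?? skips it when it can.
greedy = re.match(r"(swimming )?", "swimming pool").group(0)
lazy = re.match(r"(swimming )??", "swimming pool").group(0)

# Lookahead and lookbehind: only the A itself is part of the match.
ahead = [m.start() for m in re.finditer(r"A(?=B)", "AB AC")]
not_ahead = [m.start() for m in re.finditer(r"A(?!B)", "AB AC")]
behind = [m.start() for m in re.finditer(r"(?<=B)A", "BA CA")]
not_behind = [m.start() for m in re.finditer(r"(?<!B)A", "BA CA")]

# "Does not contain": the lookahead rejects any position before an uppercase letter.
no_upper = re.compile(r"^((?![A-Z]).)*$")
all_lower_ok = no_upper.match("hello world") is not None
has_upper_ok = no_upper.match("hello World") is not None

# Atomic group (Python 3.11+ only): after foo matches, foot is never tried.
if sys.version_info >= (3, 11):
    atomic_foots = re.fullmatch(r"(?>foo|foot)s", "foots")
```

Comparing `foots_matches` with `atomic_foots` shows the difference between a normal group and an atomic one on the same input.
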
### BeautifulSoup4
It is a Python library used for scraping websites.
It may need to be installed first. I used pip-3.6 install beautifulsoup4
The BeautifulSoup library creates a data structure out of the HTML document, enabling the user to manipulate HTML tags as data objects. This is very useful if one is looking to traverse links.
One can create a BeautifulSoup object by passing the HTML document and the name of a parser:
from bs4 import BeautifulSoup
soup = BeautifulSoup(html_doc, 'html.parser')
One can print the HTML page, nicely indented, with:
print(soup.prettify())
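
Putting the steps above together, a minimal sketch could look like the following. The HTML document and the variable names are invented for this example, and it assumes the `beautifulsoup4` package is installed; `html.parser` is the parser that ships with Python.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML document, made up for this example.
html_doc = """
<html><head><title>Example page</title></head>
<body>
<p class="intro">Some <b>links</b>:</p>
<a href="http://example.com/one" id="link1">One</a>
<a href="http://example.com/two" id="link2">Two</a>
</body></html>
"""

# Build the data structure from the document and a parser name.
soup = BeautifulSoup(html_doc, "html.parser")

# Tags behave like objects: attribute access returns the first matching tag.
title = soup.title.string

# Traverse the links: find_all returns every <a> tag in document order.
links = [a.get("href") for a in soup.find_all("a")]

# Tag attributes can be read like dictionary entries.
first_id = soup.find("a")["id"]

print(title)
print(links)
print(first_id)
```

Using `a.get("href")` instead of `a["href"]` returns `None` for an anchor without an `href` rather than raising an error, which is convenient when walking the links of a real page.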