= WWW::Mechanize examples

== Google

  require 'rubygems'
  require 'mechanize'

  a = WWW::Mechanize.new { |agent|
    agent.user_agent_alias = 'Mac Safari'
  }

  a.get('http://google.com/') do |page|
    search_result = page.form_with(:name => 'f') do |search|
      search.q = 'Hello world'
    end.submit

    search_result.links.each do |link|
      puts link.text
    end
  end

== Rubyforge

  require 'rubygems'
  require 'mechanize'

  a = WWW::Mechanize.new
  a.get('http://rubyforge.org/') do |page|
    # Click the login link
    login_page = a.click(page.links.text(/Log In/))

    # Submit the login form, taking the credentials from the command line
    my_page = login_page.form_with(:action => '/account/login.php') do |f|
      f.form_loginname = ARGV[0]
      f.form_pw        = ARGV[1]
    end.click_button

    # Print the text of every non-empty link on the resulting page
    my_page.links.each do |link|
      text = link.text.strip
      next unless text.length > 0
      puts text
    end
  end

== File Upload

Upload a file to Flickr.

  require 'rubygems'
  require 'mechanize'

  a = WWW::Mechanize.new { |agent|
    # Flickr refreshes after login
    agent.follow_meta_refresh = true
  }

  a.get('http://flickr.com/') do |home_page|
    signin_page = a.click(home_page.links.text(/Sign In/))

    # Log in with the username and password given on the command line
    my_page = signin_page.form_with(:name => 'login_form') do |form|
      form.login  = ARGV[0]
      form.passwd = ARGV[1]
    end.submit

    # Click the upload link
    upload_page = a.click(my_page.links.text(/Upload/))

    # We want the basic upload page.
    upload_page = a.click(upload_page.links.text(/basic Uploader/))

    # Upload the file named by the third command-line argument
    upload_page.form_with(:method => 'POST') do |upload_form|
      upload_form.file_uploads.first.file_name = ARGV[2]
    end.submit
  end

== Pluggable Parsers

Let's say you want HTML pages to automatically be parsed with Rubyful Soup.
This example shows you how:

  require 'rubygems'
  require 'mechanize'
  require 'rubyful_soup'

  class SoupParser < WWW::Mechanize::Page
    attr_reader :soup
    def initialize(uri = nil, response = nil, body = nil, code = nil)
      @soup = BeautifulSoup.new(body)
      super(uri, response, body, code)
    end
  end

  agent = WWW::Mechanize.new
  agent.pluggable_parser.html = SoupParser

Now every HTML page is parsed with the SoupParser class, and each page object
exposes a method called 'soup' that returns the Beautiful Soup for that page.

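The subclassing pattern above can be tried without a network connection or the
rubyful_soup gem. The following is only a minimal sketch under stated
assumptions: FakePage, TitleParser, and ParserRegistry are hypothetical
stand-ins invented for this illustration and are not part of Mechanize. They
imitate the shape of the idea: a page subclass derives an extra representation
of the body, and a registry picks the class by content type.

```ruby
# Stand-in for a page class; NOT WWW::Mechanize::Page.
class FakePage
  attr_reader :uri, :body, :code

  def initialize(uri = nil, response = nil, body = nil, code = nil)
    @uri, @response, @body, @code = uri, response, body, code
  end
end

# Same shape as SoupParser: derive something extra from the body,
# then defer to the parent constructor.
class TitleParser < FakePage
  attr_reader :title

  def initialize(uri = nil, response = nil, body = nil, code = nil)
    @title = body.to_s[%r{<title>(.*?)</title>}mi, 1]
    super(uri, response, body, code)
  end
end

# Hypothetical registry mapping a content type to the class that wraps the body.
class ParserRegistry
  def initialize(default)
    @default = default
    @parsers = {}
  end

  def register(content_type, klass)
    @parsers[content_type] = klass
  end

  # Build a page with the registered class, or the default if none matches.
  def parse(uri, content_type, body)
    klass = @parsers.fetch(content_type, @default)
    klass.new(uri, nil, body, '200')
  end
end

registry = ParserRegistry.new(FakePage)
registry.register('text/html', TitleParser)

html_page  = registry.parse('http://example.com/', 'text/html',
                            '<html><title>Hello</title></html>')
plain_page = registry.parse('http://example.com/a.txt', 'text/plain', 'just text')

puts html_page.title   # Hello
puts plain_page.class  # FakePage
```

Only the registered content type gets the custom class; everything else falls
back to the default, which is why the plain-text page has no 'title' method.
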
== Using a proxy

  require 'rubygems'
  require 'mechanize'

  agent = WWW::Mechanize.new
  agent.set_proxy('localhost', '8000')
  page = agent.get(ARGV[0])
  puts page.body

== The transact method

transact runs the given block and then resets the page history. I.e. after the
block has been executed, you're back at the original page; no need to count how
many times to call the back method at the end of a loop (while accounting for
possible exceptions).

This example also demonstrates subclassing Mechanize.

  require 'rubygems'
  require 'mechanize'

  class TestMech < WWW::Mechanize
    def process
      get 'http://rubyforge.org/'
      search_form = page.forms.first
      search_form.words = 'WWW'
      submit search_form

      page.links_with(:href => %r{/projects/} ).each do |link|
        next if link.href =~ %r{/projects/support/}

        puts 'Loading %-30s %s' % [link.href, link.text]
        begin
          transact do
            click link
            # Do stuff, maybe click more links.
          end
          # Now we're back at the original page.

        rescue => e
          $stderr.puts "#{e.class}: #{e.message}"
        end
      end
    end
  end

  TestMech.new.process

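The history-reset behaviour described in this section can be modelled without
any network access. The sketch below is a toy, not Mechanize's own code:
ToyAgent and its methods are hypothetical names used only to show the idea of
saving the history, running the block, and restoring the history even when the
block raises.

```ruby
# Toy agent with a page history; NOT the Mechanize implementation.
class ToyAgent
  attr_reader :history

  def initialize
    @history = []
  end

  # Pretend to fetch a page by recording its URL in the history.
  def get(url)
    @history.push(url)
    url
  end

  # The current page is the most recent history entry.
  def page
    @history.last
  end

  # Run the block, then restore the history to what it was beforehand,
  # even if the block raises.
  def transact
    saved = @history.dup
    begin
      yield self
    ensure
      @history = saved
    end
  end
end

agent = ToyAgent.new
agent.get('http://example.com/index')

agent.transact do |a|
  a.get('http://example.com/one')
  a.get('http://example.com/two')
end
puts agent.page  # http://example.com/index

begin
  agent.transact do |a|
    a.get('http://example.com/broken')
    raise 'simulated failure'
  end
rescue RuntimeError => e
  puts "rescued: #{e.message}"
end
puts agent.page  # http://example.com/index
```

Because the restore happens in an ensure clause, the caller ends up on the
original page whether the block finishes normally or fails partway through.
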
== Client Certificate Authentication (Mutual Auth)

A client certificate is typically issued as an additional layer of security for
certain websites. This was initially tested while automating the download of
archived images from a bank's (Wachovia) lockbox system. Once the certificate is
installed in your browser, export it and split the certificate and private key
into separate files. Exported files are usually in .p12 format (IE 7 & Firefox
2.0), which stands for PKCS #12. You can convert them from p12 to PEM format
with the following commands:

  openssl.exe pkcs12 -in input_file.p12 -clcerts -out example.key -nocerts -nodes
  openssl.exe pkcs12 -in input_file.p12 -clcerts -out example.cer -nokeys

  require 'rubygems'
  require 'mechanize'

  # create Mechanize instance
  agent = WWW::Mechanize.new

  # set the path of the certificate file
  agent.cert = 'example.cer'

  # set the path of the private key file
  agent.key = 'example.key'

  # get the login form & fill it out with the username/password
  login_form = agent.get("http://example.com/login_page").form('Login')
  login_form.Userid = 'TestUser'
  login_form.Password = 'TestPassword'

  # submit login form
  agent.submit(login_form, login_form.buttons.first)

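The openssl commands in this section can also be approximated from Ruby with
the standard openssl library. This is a hedged sketch rather than part of the
original recipe: split_p12 is a hypothetical helper name, and the throwaway
self-signed certificate, the 'demo-pass' passphrase, and the 'demo-client' name
exist only so that the example runs on its own.

```ruby
require 'openssl'

# Split a PKCS #12 bundle (DER bytes) into PEM strings:
# [certificate, private key]. The key is returned unencrypted.
def split_p12(der, passphrase)
  p12 = OpenSSL::PKCS12.new(der, passphrase)
  [p12.certificate.to_pem, p12.key.to_pem]
end

# Build a throwaway self-signed certificate so the sketch is self-contained.
key  = OpenSSL::PKey::RSA.new(2048)
name = OpenSSL::X509::Name.parse('/CN=demo-client')
cert = OpenSSL::X509::Certificate.new
cert.version    = 2
cert.serial     = 1
cert.subject    = name
cert.issuer     = name
cert.public_key = key.public_key
cert.not_before = Time.now
cert.not_after  = Time.now + 3600
cert.sign(key, OpenSSL::Digest.new('SHA256'))

# Pack key and certificate into an in-memory PKCS #12 bundle.
bundle = OpenSSL::PKCS12.create('demo-pass', 'demo', key, cert).to_der

cert_pem, key_pem = split_p12(bundle, 'demo-pass')
puts cert_pem.lines.first
puts key_pem.lines.first
```

With a real export you would read the bundle using File.binread on the .p12
file, then write cert_pem and key_pem to example.cer and example.key. Like the
-nodes option, this leaves the private key unencrypted on disk, so protect the
file accordingly.
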