Ferrum - Headless Chrome Ruby API
Ferrum - fearless Ruby Chrome driver
As simple as Puppeteer, though even simpler.
It is a clean, high-level Ruby API to Chrome. It runs headless by default, but you can configure it to run in non-headless mode. All you need is Ruby and Chrome/Chromium. Ferrum connects to the browser via the DevTools Protocol.
Cuprite used to contain this code in one form or another, but you don't need Capybara if you are going to crawl sites: you crawl, not test. Besides, a clean, lightweight API to the browser is what Ruby was missing, so here it comes.
Vessel is a high-level web crawling framework based on Ferrum.
Web design by Evrone, what else we build with Ruby on Rails, what else we do at Evrone.
If you like this project, please consider becoming a backer on Patreon.
Index
- Customization
- Navigation
- Finders
- Screenshots
- Network
- Mouse
- Keyboard
- Cookies
- Headers
- JavaScript
- Frames
- Dialog
Install
There is no official Chrome or Chromium package for Linux, so don't install it from a package: it will be either outdated or unofficial, and both are bad. Download it from the official source. The Chrome binary should be in the PATH or BROWSER_PATH, or you can pass its location to the browser instance as the :browser_path option.
Add this to your Gemfile:
gem "ferrum"
Navigate to a website and save a screenshot:
browser = Ferrum::Browser.new
browser.goto("https://google.com")
browser.screenshot(path: "google.png")
browser.quit
Interact with a page:
browser = Ferrum::Browser.new
browser.goto("https://google.com")
input = browser.at_xpath("//div[@id='searchform']/form//input[@type='text']")
input.focus.type("Ruby headless driver for Chrome", :Enter)
browser.at_css("a > h3").text # => "route/ferrum: Ruby Chrome/Chromium driver - GitHub"
browser.quit
Evaluate some JavaScript and get full width/height:
browser = Ferrum::Browser.new
browser.goto("https://www.google.com/search?q=Ruby+headless+driver+for+Capybara")
width, height = browser.evaluate <<~JS
[document.documentElement.offsetWidth,
document.documentElement.offsetHeight]
JS
# => [1024, 1931]
browser.quit
Do any mouse movements you like:
# Trace a 100x100 square
browser = Ferrum::Browser.new
browser.goto("https://google.com")
browser.mouse
.move(x: 0, y: 0)
.down
.move(x: 0, y: 100)
.move(x: 100, y: 100)
.move(x: 100, y: 0)
.move(x: 0, y: 0)
.up
browser.quit
Customization
You can customize options with the following code in your test setup:
Ferrum::Browser.new(options)
- options (Hash)
  - :headless (Boolean) - Set browser as headless or not, true by default.
  - :window_size (Array) - The dimensions of the browser window in which to test, expressed as a 2-element array, e.g. [1024, 768]. Default: [1024, 768]
  - :extensions (Array[String | Hash]) - An array of paths to files or JS source code to be preloaded into the browser, e.g. ["/path/to/script.js", { source: "window.secret = 'top'" }]
  - :logger (Object responding to puts) - When present, debug output is written to this object.
  - :slowmo (Integer | Float) - Set a delay to wait before sending a command. A useful companion to the headless option, so that you have time to see changes.
  - :timeout (Numeric) - The number of seconds we'll wait for a response when communicating with the browser. Default is 5.
  - :js_errors (Boolean) - When true, JavaScript errors get re-raised in Ruby.
  - :browser_name (Symbol) - :chrome by default, only experimental support for :firefox for now.
  - :browser_path (String) - Path to the Chrome binary. You can also set it through an ENV variable, as in BROWSER_PATH=some/path/chrome bundle exec rspec.
  - :browser_options (Hash) - Additional command line options, e.g. { "ignore-certificate-errors" => nil }
  - :port (Integer) - Remote debugging port for headless Chrome.
  - :host (String) - Remote debugging address for headless Chrome.
  - :url (String) - URL for a running instance of Chrome. If this is set, a browser process will not be spawned.
  - :process_timeout (Integer) - How long to wait for the Chrome process to respond on startup.
  - :ws_max_receive_size (Integer) - Maximum size, in bytes, of messages accepted from Chrome over the web socket. Defaults to 64MB. Incoming messages larger than this will cause a Ferrum::DeadBrowserError.
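As a rough illustration of how these options combine, the sketch below builds an options hash for a visible, slowed-down debugging session. The helper name debug_browser_options and the chosen values are hypothetical; only the option keys come from the list above.

```ruby
# Hypothetical helper: the keys are documented above, the values are examples.
def debug_browser_options(log: $stderr)
  {
    headless: false,          # show the browser window
    slowmo: 0.5,              # delay before each command so changes are visible
    window_size: [1366, 768],
    timeout: 10,              # seconds to wait for a browser response
    logger: log,              # any object responding to #puts
    browser_options: { "ignore-certificate-errors" => nil }
  }
end

options = debug_browser_options
puts options.keys.inspect

# With the gem installed, the hash would be passed straight through:
#   require "ferrum"
#   browser = Ferrum::Browser.new(options)
```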
The API below is for the master branch and is subject to change before 1.0.
Navigation
goto(url) : String
Navigate the page to the given URL.
- url (String) - The URL should include the scheme unless you set base_url when configuring the driver.
browser.goto("https://github.com/")
back
Navigate to the previous page in history.
browser.goto("https://github.com/")
browser.at_xpath("//a").click
browser.back
forward
Navigate to the next page in history.
browser.goto("https://github.com/")
browser.at_xpath("//a").click
browser.back
browser.forward
refresh
Reload current page.
browser.goto("https://github.com/")
browser.refresh
stop
Stop all navigations and the loading of pending resources on the page.
browser.goto("https://github.com/")
browser.stop
Finders
at_css(selector, **options) : Node | nil
Find a node by selector. Runs document.querySelector within the document or the provided node.
- selector (String)
- options (Hash)
  - :within (Node | nil)
browser.goto("https://github.com/")
browser.at_css("a[aria-label='Issues you created']") # => Node
css(selector, **options) : Array<Node> | []
Find nodes by selector. Runs document.querySelectorAll within the document or the provided node.
- selector (String)
- options (Hash)
  - :within (Node | nil)
browser.goto("https://github.com/")
browser.css("a[aria-label='Issues you created']") # => [Node]
at_xpath(selector, **options) : Node | nil
Find a node by XPath.
- selector (String)
- options (Hash)
  - :within (Node | nil)
browser.goto("https://github.com/")
browser.at_xpath("//a[@aria-label='Issues you created']") # => Node
xpath(selector, **options) : Array<Node> | []
Find nodes by XPath.
- selector (String)
- options (Hash)
  - :within (Node | nil)
browser.goto("https://github.com/")
browser.xpath("//a[@aria-label='Issues you created']") # => [Node]
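The :within option scopes a search to a node found earlier. A minimal sketch, assuming browser is a Ferrum::Browser and that nodes respond to text; the helper name texts_within is hypothetical:

```ruby
# Hypothetical helper: collect the text of every element matching
# `selector` inside the first node matching `container_selector`.
def texts_within(browser, container_selector, selector)
  container = browser.at_css(container_selector)
  return [] if container.nil? # at_css returns nil when nothing matches

  browser.css(selector, within: container).map(&:text)
end

# With the gem installed:
#   require "ferrum"
#   browser = Ferrum::Browser.new
#   browser.goto("https://github.com/")
#   texts_within(browser, "footer", "a")
#   browser.quit
```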
current_url : String
Returns the current top window location href.
browser.goto("https://google.com/")
browser.current_url # => "https://www.google.com/"
current_title : String
Returns the current top window title.
browser.goto("https://google.com/")
browser.current_title # => "Google"
body : String
Returns the current page's HTML.
browser.goto("https://google.com/")
browser.body # => '<html itemscope="" itemtype="http://schema.org/WebPage" lang="ru"><head>...
Screenshots
screenshot(**options) : String | Integer
Saves a screenshot to disk or returns it as Base64.
- options (Hash)
  - :path (String) - Path to save the screenshot on disk. :encoding will be set to :binary automatically.
  - :encoding (Symbol) - :base64 | :binary. Set it to :base64 to return the image as Base64.
  - :format (String) - "jpeg" | "png"
  - :quality (Integer) - 0-100, works for jpeg only.
  - :full (Boolean) - Whether you need a full-page screenshot or only the viewport.
  - :selector (String) - CSS selector for the given element.
  - :scale (Float) - Zoom in/out.
browser.goto("https://google.com/")
# Save on the disk in PNG
browser.screenshot(path: "google.png") # => 134660
# Save on the disk in JPG
browser.screenshot(path: "google.jpg") # => 30902
# Return the whole page (not only the viewport) as Base64, with reduced quality
browser.screenshot(full: true, quality: 60) # => "iVBORw0KGgoAAAANSUhEUgAABAAAAAMACAYAAAC6uhUNAAAAAXNSR0IArs4c6Q...
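When the screenshot is returned as Base64 rather than written to a file, it usually needs decoding before further use. A small sketch, assuming browser responds to screenshot as documented above; the helper name screenshot_bytes is hypothetical:

```ruby
# Hypothetical helper: request the screenshot as Base64 and return the
# raw image bytes. String#unpack1("m") decodes Base64 without extra requires.
def screenshot_bytes(browser, **options)
  encoded = browser.screenshot(encoding: :base64, **options)
  encoded.unpack1("m")
end

# With the gem installed:
#   require "ferrum"
#   browser = Ferrum::Browser.new
#   browser.goto("https://google.com/")
#   File.binwrite("google.png", screenshot_bytes(browser, full: true))
#   browser.quit
```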
pdf(**options) : String | Integer
Saves a PDF to disk or returns it as Base64.
- options (Hash)
  - :path (String) - Path to save the PDF on disk. :encoding will be set to :binary automatically.
  - :encoding (Symbol) - :base64 | :binary. Set it to :base64 to return the PDF as Base64.
  - :landscape (Boolean) - Paper orientation. Defaults to false.
  - :scale (Float) - Zoom in/out.
  - :format (Symbol) - Standard paper sizes: :letter, :legal, :tabloid, :ledger, :A0, :A1, :A2, :A3, :A4, :A5, :A6
  - :paper_width (Float) - Set paper width.
  - :paper_height (Float) - Set paper height.
  - Other native options can also be passed.
browser.goto("https://google.com/")
# Save to disk as a PDF
browser.pdf(path: "google.pdf", paper_width: 1.0, paper_height: 1.0) # => 14983
Network
browser.network
traffic : Array<Network::Exchange>
Returns all information about network traffic as Network::Exchange instances, each of which is in general a wrapper around request, response and error.
browser.goto("https://github.com/")
browser.network.traffic # => [#<Ferrum::Network::Exchange, ...]
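Because each exchange wraps a request and a response, the traffic array can be filtered like any other Ruby collection. The sketch below lists failed requests; it assumes that a request exposes url and a response exposes status, and the helper name failed_exchanges is hypothetical.

```ruby
# Hypothetical helper: [url, status] pairs for exchanges whose response
# status is 400 or above. Exchanges without a response are skipped.
def failed_exchanges(traffic)
  traffic.filter_map do |exchange|
    response = exchange.response
    next if response.nil? || response.status < 400

    [exchange.request.url, response.status]
  end
end

# With the gem installed:
#   require "ferrum"
#   browser = Ferrum::Browser.new
#   browser.goto("https://github.com/")
#   failed_exchanges(browser.network.traffic).each { |url, status| puts "#{status} #{url}" }
#   browser.quit
```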
request : Network::Request
Page request of the main frame.
browser.goto("https://github.com/")
browser.network.request # => #<Ferrum::Network::Request...
response : Network::Response
Page response of the main frame.
browser.goto("https://github.com/")
browser.network.response # => #<Ferrum::Network::Response...
status : Integer
Contains the status code of the main page response (e.g., 200 for a success). This is just a shortcut for response.status.
browser.goto("https://github.com/")
browser.network.status # => 200
wait_for_idle(**options)
Waits for the network to become idle or raises Ferrum::TimeoutError.
- options (Hash)
  - :connections (Integer) - How many connections are allowed for the network to count as idle, 0 by default.
  - :duration (Float) - Sleep for the given amount of time and check again, 0.05 by default.
  - :timeout (Float) - How long to keep checking for idle, browser.timeout by default.
browser.goto("https://example.com/")
browser.at_xpath("//a[text() = 'No UI changes button']").click
browser.network.wait_for_idle
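To make the three options concrete, here is a plain-Ruby polling loop with the same knobs. It is an illustration of the semantics only, not Ferrum's implementation: the pending callable, the IdleTimeout class and the method name wait_until_idle are all hypothetical.

```ruby
# Illustration only. `pending` is a hypothetical callable returning the
# number of in-flight connections.
class IdleTimeout < StandardError; end

def wait_until_idle(pending, connections: 0, duration: 0.05, timeout: 5)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  until pending.call <= connections
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline
      raise IdleTimeout, "network still busy after #{timeout}s"
    end
    sleep(duration) # wait, then check again
  end
  true
end

# Simulated connection counts: busy, busy, idle.
counts = [3, 2, 0]
wait_until_idle(-> { counts.shift || 0 }, duration: 0.01)
```

Raising :connections makes the check more tolerant of long-lived background requests, while :duration trades responsiveness for fewer checks.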
clear(type)
Clear the browser's cache or the collected traffic.
- type (Symbol) - Either :traffic or :cache.
traffic = browser.network.traffic # => []
browser.goto("https://github.com/")
traffic.size # => 51
browser.network.clear(:traffic)
traffic.size # => 0
intercept(**options)
Set request interception for the given options. This method only sets up request interception; you should use the on callback to catch requests and abort or continue them.
- options (Hash)
  - :pattern (String) - * by default.
  - :resource_type (Symbol) - One of the resource types.
browser = Ferrum::Browser.new
browser.network.intercept
browser.on(:request) do |request|
if request.match?(/bla-bla/)
request.abort
elsif request.match?(/lorem/)
request.respond(body: "Lorem ipsum")
else
request.continue
end
end
browser.goto("https://google.com")
authorize(**options)
If the site uses authorization, you can provide credentials using this method.
- options (Hash)
  - :type (Symbol) - :server | :proxy, site or proxy authorization.
  - :user (String)
  - :password (String)
browser.network.authorize(user: "login", password: "pass")
browser.goto("http://example.com/authenticated")
puts browser.network.status # => 200
puts browser.body # => Welcome, authenticated client
Mouse
browser.mouse
scroll_to(x, y)
Scroll the page to the given x, y.
- x (Integer) - The pixel along the horizontal axis of the document that you want displayed in the upper left.
- y (Integer) - The pixel along the vertical axis of the document that you want displayed in the upper left.
browser.goto("https://www.google.com/search?q=Ruby+headless+driver+for+Capybara")
browser.mouse.scroll_to(0, 400)
click(**options) : Mouse
Click at the given coordinates; fires mouse move, down and up events.
- options (Hash)
  - :x (Integer)
  - :y (Integer)
  - :delay (Float) - Defaults to 0. Delay between mouse down and mouse up events.
  - :button (Symbol) - :left | :right, defaults to :left.
  - :count (Integer) - Defaults to 1.
  - :modifiers (Integer) - Bitfield for key modifiers. See keyboard.modifiers.
down(**options) : Mouse
Mouse down for the given coordinates.
- options (Hash)
  - :button (Symbol) - :left | :right, defaults to :left.
  - :count (Integer) - Defaults to 1.
  - :modifiers (Integer) - Bitfield for key modifiers. See keyboard.modifiers.
up(**options) : Mouse
Mouse up for the given coordinates.
- options (Hash)
  - :button (Symbol) - :left | :right, defaults to :left.
  - :count (Integer) - Defaults to 1.
  - :modifiers (Integer) - Bitfield for key modifiers. See keyboard.modifiers.
move(x:, y:, steps: 1) : Mouse
Move the mouse to the given x and y.
- options (Hash)
  - :x (Integer)
  - :y (Integer)
  - :steps (Integer) - Defaults to 1. Sends intermediate mousemove events.
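To picture what intermediate mousemove events mean for :steps, the plain-Ruby sketch below computes the points a pointer would pass through, assuming straight-line (linear) interpolation between the start and end positions. That assumption and the helper name intermediate_points are illustrative and not taken from Ferrum's source.

```ruby
# Illustration only: points visited when moving from `from` to `to`
# in `steps` equal steps, assuming linear interpolation.
def intermediate_points(from:, to:, steps: 1)
  (1..steps).map do |i|
    ratio = i / steps.to_f
    [from[0] + (to[0] - from[0]) * ratio,
     from[1] + (to[1] - from[1]) * ratio]
  end
end

p intermediate_points(from: [0, 0], to: [100, 50], steps: 4)
# => [[25.0, 12.5], [50.0, 25.0], [75.0, 37.5], [100.0, 50.0]]
```

With steps: 1 only the final position is produced, which matches a single mousemove event.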
Keyboard
browser.keyboard
down(key) : Keyboard
Dispatches a keydown event.
- key (String | Symbol) - Name of the key, such as "a", :enter, :backspace.
up(key) : Keyboard
Dispatches a keyup event.
- key (String | Symbol) - Name of the key, such as "b", :enter, :backspace.
type(*keys) : Keyboard
Sends a keydown, keypress/input, and keyup event for each character in the text.
- text (String | Array<String> | Array<Symbol>) - Text to type into a focused element, e.g. [:Shift, "s"], "tring"
modifiers(keys) : Integer
Returns a bitfield for the given keys.
- keys (Array<Symbol>) - :alt | :ctrl | :command | :shift
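The resulting bitfield is what the :modifiers option of the mouse methods expects. The sketch below shows how such a bitfield is composed, using the modifier values defined by the Chrome DevTools Protocol (Alt=1, Ctrl=2, Meta/Command=4, Shift=8). The constant and method names are hypothetical and this is not Ferrum's own code.

```ruby
# Modifier bits as defined by the Chrome DevTools Protocol Input domain.
MODIFIER_BITS = { alt: 1, ctrl: 2, command: 4, shift: 8 }.freeze

# Hypothetical helper: OR together the bits for the given modifier keys.
def modifier_bitfield(keys)
  keys.map { |key| MODIFIER_BITS.fetch(key) }.reduce(0, :|)
end

p modifier_bitfield(%i[ctrl shift]) # => 10

# With the gem installed, the real method would be used instead:
#   mods = browser.keyboard.modifiers(%i[ctrl shift])
#   browser.mouse.click(x: 10, y: 10, modifiers: mods)
```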
Cookies
browser.cookies
all : Hash<String, Cookie>
Returns a hash of cookies.
browser.cookies.all # => {"NID"=>#<Ferrum::Cookies::Cookie:0x0000558624b37a40 @attributes={"name"=>"NID", "value"=>"...", "domain"=>".google.com", "path"=>"/", "expires"=>1583211046.575681, "size"=>178, "httpOnly"=>true, "secure"=>false, "session"=>false}>}
[](value) : Cookie
Returns the cookie with the given name.
- value (String)
browser.cookies["NID"] # => <Ferrum::Cookies::Cookie:0x0000558624b67a88 @attributes={"name"=>"NID", "value"=>"...", "domain"=>".google.com", "path"=>"/", "expires"=>1583211046.575681, "size"=>178, "httpOnly"=>true, "secure"=>false, "session"=>false}>
set(**options) : Boolean
Sets the given values as a cookie.
- options (Hash)
  - :name (String)
  - :value (String)
  - :domain (String)
  - :expires (Integer)
  - :samesite (String)
  - :httponly (Boolean)
browser.cookies.set(name: "stealth", value: "omg", domain: "google.com") # => true
remove(**options) : Boolean
Removes the given cookie.
- options (Hash)
  - :name (String)
  - :domain (String)
  - :url (String)
browser.cookies.remove(name: "stealth", domain: "google.com") # => true
clear : Boolean
Removes all cookies for the current page.
browser.cookies.clear # => true
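The all and set methods can be combined to carry cookies from one session to another. A minimal sketch, assuming each cookie object exposes name, value and domain readers; the helper names export_cookies and import_cookies are hypothetical:

```ruby
# Hypothetical helper: reduce the cookies hash to plain attribute hashes
# that can be serialized or replayed later.
def export_cookies(cookies)
  cookies.all.values.map do |cookie|
    { name: cookie.name, value: cookie.value, domain: cookie.domain }
  end
end

# Hypothetical helper: replay previously exported cookies via set.
def import_cookies(cookies, saved)
  saved.each { |attrs| cookies.set(**attrs) }
end

# With the gem installed:
#   saved = export_cookies(browser.cookies)
#   import_cookies(other_browser.cookies, saved)
```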
Headers
browser.headers
get : Hash
Get all headers.
set(headers) : Boolean
Set the given headers. Clears all existing headers and sets the given ones.
- headers (Hash) - Key-value pairs, for example "User-Agent" => "Browser"
add(headers) : Boolean
Adds the given headers to those already set.
- headers (Hash) - Key-value pairs, for example "Referer" => "http://example.com"
clear : Boolean
Clear all headers.
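Since set replaces the headers and add merges into them, the two can be combined to apply headers temporarily. A sketch under the assumption that browser.headers behaves as described above; the helper name with_headers is hypothetical:

```ruby
# Hypothetical helper: run a block with extra request headers, then
# restore whatever headers were set before.
def with_headers(browser, extra)
  # dup so that a later in-place change cannot alter our saved copy
  original = browser.headers.get.dup
  browser.headers.add(extra)
  yield
ensure
  browser.headers.set(original) if original
end

# With the gem installed:
#   with_headers(browser, "Referer" => "http://example.com") do
#     browser.goto("https://example.com/")
#   end
```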
JavaScript
evaluate(expression, *args)
Evaluate the given JS expression and return the result.
- expression (String) - Should be valid JavaScript.
- args (Object) - Arguments to pass; each should be a valid Node or a simple value.
browser.evaluate("[window.scrollX, window.scrollY]")
evaluate_async(expression, wait_time, *args)
Evaluate an asynchronous expression and return the result.
- expression (String) - Should be valid JavaScript.
- wait_time - How long we should wait for the Promise to resolve or reject.
- args (Object) - Arguments to pass; each should be a valid Node or a simple value.
browser.evaluate_async(%(arguments[0]({foo: "bar"})), 5) # => { "foo" => "bar" }
execute(expression, *args)
Execute the expression. Doesn't return the result.
- expression (String) - Should be valid JavaScript.
- args (Object) - Arguments to pass; each should be a valid Node or a simple value.
browser.execute(%(1 + 1)) # => true
add_script_tag(**options) : Boolean
- options (Hash)
  - :url (String)
  - :path (String)
  - :content (String)
  - :type (String) - text/javascript by default.
browser.add_script_tag(url: "http://example.com/script.js") # => true
add_style_tag(**options) : Boolean
- options (Hash)
  - :url (String)
  - :path (String)
  - :content (String)
browser.add_style_tag(content: "h1 { font-size: 40px; }") # => true
bypass_csp(enabled) : Boolean
- enabled (Boolean) - true by default.
browser.bypass_csp # => true
browser.goto("https://github.com/ruby-concurrency/concurrent-ruby/blob/master/docs-source/promises.in.md")
browser.refresh
browser.add_script_tag(content: "window.__injected = 42")
browser.evaluate("window.__injected") # => 42
Frames
frames
main_frame
frame_by
Play around inside given frame
browser.goto("https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe")
frame = browser.frames[1]
puts frame.title # => HTML Demo: <iframe>
puts frame.url # => https://interactive-examples.mdn.mozilla.net/pages/tabbed/iframe.html
Dialog
accept(text)
Accept the dialog with the given text, or the default prompt if applicable.
- text (String)
dismiss
Dismiss the dialog.
browser = Ferrum::Browser.new
browser.on(:dialog) do |dialog|
if dialog.match?(/bla-bla/)
dialog.accept
else
dialog.dismiss
end
end
browser.goto("https://google.com")
Thread safety
Ferrum is fully thread-safe. You can create one browser or several, as you wish, and start playing around with threads. The example below shows how to create a few pages which share the same context. A context is similar to an incognito profile, but you can have more than one; think of it as an independent browser session:
browser = Ferrum::Browser.new
context = browser.contexts.create
t1 = Thread.new(context) do |c|
page = c.create_page
page.goto("https://www.google.com/search?q=Ruby+headless+driver+for+Capybara")
page.screenshot(path: "t1.png")
end
t2 = Thread.new(context) do |c|
page = c.create_page
page.goto("https://www.google.com/search?q=Ruby+static+typing")
page.screenshot(path: "t2.png")
end
t1.join
t2.join
context.dispose
browser.quit
or you can create two independent contexts:
browser = Ferrum::Browser.new
t1 = Thread.new(browser) do |b|
context = b.contexts.create
page = context.create_page
page.goto("https://www.google.com/search?q=Ruby+headless+driver+for+Capybara")
page.screenshot(path: "t1.png")
context.dispose
end
t2 = Thread.new(browser) do |b|
context = b.contexts.create
page = context.create_page
page.goto("https://www.google.com/search?q=Ruby+static+typing")
page.screenshot(path: "t2.png")
context.dispose
end
t1.join
t2.join
browser.quit