A multi-strategy approach to find the absolutely cleanest and most likely canonical URL of any given URL.

armchairtheorist/true_url

TrueURL

TrueURL helps normalize, clean and derive a canonical URL for any given URL. Unlike other similar projects, TrueURL uses a configurable multi-strategy approach, including tailored strategies for specific sites (e.g. YouTube, DailyMotion, Twitter, etc.) as well as general strategies (e.g. rel="canonical", etc.).
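To make the multi-strategy idea concrete, here is a minimal sketch using only Ruby's standard library. It is not TrueURL's actual implementation: the names `YOUTUBE_SHORT`, `GENERIC`, `STRATEGIES` and `sketch_canonical` are hypothetical, and the real gem's strategies are presumably more thorough. The sketch only shows the general shape: try site-specific strategies first, then fall back to a general one.

```ruby
require 'uri'

# Hypothetical site-specific strategy (not TrueURL internals): takes a URI and
# returns a canonical URL string, or nil when the strategy does not apply.
YOUTUBE_SHORT = lambda do |uri|
  next nil unless uri.host == 'youtu.be'
  video_id = uri.path.sub(%r{\A/}, '')
  video_id.empty? ? nil : "https://www.youtube.com/watch?v=#{video_id}"
end

# Hypothetical general fallback: lowercase the host and drop the fragment.
GENERIC = lambda do |uri|
  cleaned = uri.dup
  cleaned.host = cleaned.host.downcase if cleaned.host
  cleaned.fragment = nil
  cleaned.to_s
end

# Strategies are tried in order; the first non-nil result wins.
STRATEGIES = [YOUTUBE_SHORT, GENERIC].freeze

def sketch_canonical(url)
  uri = URI.parse(url)
  STRATEGIES.each do |strategy|
    result = strategy.call(uri)
    return result if result
  end
  url
end
```

Keeping the strategies in an ordered list is one simple way to make the approach configurable: strategies can be added, removed or reordered without touching the lookup loop.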

Installation

Install the gem from RubyGems:

gem install true_url

If you use Bundler, add it to your Gemfile and then run bundle install:

gem 'true_url'

I have only tested this gem on Ruby 2.3.0, but it should also work on earlier Ruby versions. TrueURL requires only the Addressable gem as a dependency. If page fetching is required, the HTTP and Nokogiri gems are also required.

Usage

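require 'true_url'

# Short YouTube link: resolves to the standard watch URL, dropping the playlist parameter.
x = TrueURL.new("https://youtu.be/RDocnbkHjhI?list=PLs4hTtftqnlAkiQNdWn6bbKUr-P1wuSm0")
puts x.canonical # => https://www.youtube.com/watch?v=RDocnbkHjhI

# Embedded Nico Nico player URL: resolves to the regular watch page.
x = TrueURL.new("http://embed.nicovideo.jp/watch/sm25956031/script?w=490&h=307&redirect=1")
puts x.canonical # => http://www.nicovideo.jp/watch/sm25956031

# Shortened t.co link: resolves to the destination article.
x = TrueURL.new("http://t.co/fvaGuRa5Za")
puts x.canonical # => http://www.prdaily.com/Main/Articles/3_essential_skills_for_todays_PR_pro__18404.aspx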
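
For URLs without a tailored strategy, a general strategy such as rel="canonical" relies on the page's own markup. The sketch below shows roughly what that lookup involves, using only Ruby's standard library and an inline HTML string. It is an illustration, not the gem's code: `sketch_rel_canonical` is a hypothetical helper, and the regular expressions stand in for a real HTML parser such as Nokogiri, which would be more robust.

```ruby
require 'uri'

# Hypothetical helper (not part of TrueURL): returns the absolute
# rel="canonical" URL found in an HTML string, or nil if there is none.
# Regex-based for brevity only; a real parser handles far more edge cases.
def sketch_rel_canonical(html, base_url)
  html.scan(/<link\b[^>]*>/i).each do |tag|
    next unless tag =~ /\brel\s*=\s*["']?canonical["']?/i
    href = tag[/\bhref\s*=\s*["']([^"']+)["']/i, 1]
    # Resolve relative hrefs against the URL the page was fetched from.
    return URI.join(base_url, href).to_s if href
  end
  nil
end

html = <<~HTML
  <html><head>
    <link rel="stylesheet" href="/style.css">
    <link rel="canonical" href="/articles/42">
  </head></html>
HTML

puts sketch_rel_canonical(html, 'http://example.com/articles/42?utm_source=feed')
# => http://example.com/articles/42
```

Returning nil when no canonical link is present lets a caller fall through to another strategy.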

Other URL Canonicalization Projects (for Ruby)
