User Agent Parser Comparison

This application was put together to assist us in making a decision on which UserAgent parsing library to use, and then modified some to be a bit more user friendly to encourage use among the different parsing library maintainers. It's still a bit work in progress, but in its current state it's very useful at benchmarking the different parsing libraries (across different languages) and comparing the results of the parsers against "known" useragents (right now these come from the parser libraries' unit tests, but a known list could be built and tested against).

This is loosely based on a benchmark utility that has been around for a while (but not currently maintained): https://github.com/kenjis/user-agent-parser-benchmarks

Right now this application is entirely a CLI application. Most of the data generated by the application is stored as JSON files, so it's certainly possible that a Web UI could be built around the data files which might make analysis easier.

The CLI is written in PHP (using the Symfony Console component), but the parsers (and tests) can be written in any language. Currently, most of the parsers are written in PHP, but the structure of the application (and how it calls the parsers) was intentionally built to support parsers from any language (PRs welcome!).

Latest "Accuracy" Results (2018-06-08)

Parser	Version	Browser Results	Platform Results	Device Results	Accuracy Score
3rd-eden-useragent-js-2	2.3.0	66025/87529 75.43%	44378/57961 76.57%	18918/89948 21.03%	129321/235438 54.93%
browscap-js-1	1.11.6000028	71865/87529 82.1%	54019/57961 93.2%	101811/149420 68.14%	227695/294910 77.21%
browscap-js-2	2.2.6000028	71865/87529 82.1%	54019/57961 93.2%	101811/149420 68.14%	227695/294910 77.21%
browscap-php-2	2.1.1-6000029	72107/87529 82.38%	54093/57961 93.33%	102103/149420 68.33%	228303/294910 77.41%
browscap-php-3-full	3.1.0-6000029	71993/87529 82.25%	54059/57961 93.27%	102128/149420 68.35%	228180/294910 77.37%
browscap-php-3-lite	3.1.0-6000029	31856/87529 36.39%	24905/57961 42.97%	41083/59472 69.08%	97844/204962 47.74%
browscap-php-3-standard	3.1.0-6000029	66827/87529 76.35%	31478/57961 54.31%	51418/59472 86.46%	149723/204962 73.05%
browscap-php-4-full	4.1.0-6000029	71993/87529 82.25%	54059/57961 93.27%	102128/149420 68.35%	228180/294910 77.37%
crossjoin-1	v1.0.6-6000029	71990/87529 82.25%	54060/57961 93.27%	102127/149420 68.35%	228177/294910 77.37%
crossjoin-2	v2.0.5-6000029	71776/87529 82%	54067/57961 93.28%	101705/149420 68.07%	227548/294910 77.16%
crossjoin-3	v3.0.5-6000029	71776/87529 82%	54067/57961 93.28%	101705/149420 68.07%	227548/294910 77.16%
device-detector-node-js-0	0.1.38	63203/87541 72.2%	48193/57973 83.13%	91570/149432 61.28%	202966/294946 68.81%
donatj-0	v0.9.0	52365/87529 59.83%	19322/29537 65.42%	0/0 0.0%	71687/117066 61.24%
endorphin-3	3.0.1	38358/87529 43.82%	36046/57961 62.19%	32979/104316 31.61%	107383/249806 42.99%
jenssegers-agent-2	v2.6.0	41855/87529 47.82%	48871/57961 84.32%	58944/104316 56.51%	149670/249806 59.91%
php-get-browser	7.2.4-6000029	72123/87529 82.4%	54093/57961 93.33%	102109/149420 68.34%	228325/294910 77.42%
piwik-device-detector-3	3.10.2	64775/87529 74%	49654/57961 85.67%	119636/149420 80.07%	234065/294910 79.37%
sinergi-6	6.1.2	42353/87529 48.39%	46034/57961 79.42%	30924/74417 41.56%	119311/219907 54.26%
ua-parser-php-3	v3.5.0-ff9a4ef	66453/87529 75.92%	29412/57961 50.74%	62414/89948 69.39%	158279/235438 67.23%
uaparser-js-0	0.7.18	53226/87529 60.81%	45505/57961 78.51%	87067/149420 58.27%	185798/294910 63%
whichbrowser-js-0	0.2.9	67142/87529 76.71%	53153/57961 91.7%	120824/149420 80.86%	241119/294910 81.76%
whichbrowser-php-2	v2.0.32	67140/87529 76.71%	53157/57961 91.71%	120860/149420 80.89%	241157/294910 81.77%
woothee-js-1	1.7.0	41103/87529 46.96%	48388/57961 83.48%	18853/29899 63.06%	108344/175389 61.77%
woothee-php-1	v1.7.0	41009/87529 46.85%	48463/57961 83.61%	18910/29899 63.25%	108382/175389 61.8%
zsxsoft-php-1	1.4	61496/87529 70.26%	46515/57961 80.25%	39867/89948 44.32%	147878/235438 62.81%

View Full Results

Latest Benchmark Results

Performed with the following command:

./bin/console benchmark ./files/top-1000-201806.txt --iterations=10

VMWare ESX Virtual Server

2.33 GHz Intel Xeon E5410, 4 GB RAM, Magnetic Disk Drive (RAID), PHP 7.0.22 (2018-01-02)

Parser	Average Init Time	Average Parse Time	Average Extra Time	Average Memory Used
browscap-js-1	0.066s	58.705s	0.406s	108.52 M
browscap-php-2	1.03s	72.901s	0.225s	141.64 M
browscap-php-3-full	0.043s	53.877s	0.166s	4.55 M
browscap-php-3-lite	0.032s	2.07s	0.13s	1.37 M
browscap-php-3-standard	0.036s	20.131s	0.139s	2.68 M
crossjoin-1	0.019s	48.111s	0.149s	869.92 K
crossjoin-2	0.026s	108.172s	0.149s	1.04 M
crossjoin-3	0.035s	107.96s	0.148s	1.01 M
php-get-browser	0.029s	118.647s	1.93s	391.2 K
piwik-device-detector-3	0.518s	10.882s	0.164s	3.02 M
ua-parser-php-3	0.027s	3.584s	0.149s	1.56 M
whichbrowser-2	0.039s	4.478s	0.157s	16.53 M
woothee-php-1	0.029s	0.079s	0.127s	822.23 K
zsxsoft-php-1	0.012s	1.418s	0.132s	982.43 K
donatj-0	0.003s	0.205s	0.13s	513.16 K
endorphin-3	0.017s	1.201s	0.131s	551.91 K
jenssegers-agent-2	0.012s	43.464s	0.133s	797.51 K
sinergi-6	0.007s	0.02s	0.122s	509.16 K
uaparser-js-0	0.025s	0.097s	0.276s	5.83 M

MacBook Pro

2.9 GHz Intel Core i7, 16 GB RAM, SSD Drive, PHP 7.2.4 (2018-06-10)

Parser	Average Init Time	Average Parse Time	Average Extra Time	Average Memory Used
3rd-eden-useragent-js-2	0.157s	0.228s	0.4s	16.02 M
browscap-js-1	0.024s	35.534s	0.127s	61.57 M
browscap-js-2	0.029s	36.626s	0.122s	59.71 M
browscap-php-2	0.686s	6.907s	0.083s	145.39 M
browscap-php-3-full	0.012s	8.288s	0.072s	4.65 M
browscap-php-3-lite	0.012s	0.474s	0.054s	1.23 M
browscap-php-3-standard	0.011s	3.52s	0.062s	2.67 M
browscap-php-4-full	0.026s	14.444s	0.095s	4.64 M
crossjoin-1	0.008s	9.096s	0.087s	748.77 K
crossjoin-2	0.013s	28.353s	0.06s	1023.48 K
crossjoin-3	0.016s	29.778s	0.07s	1002.4 K
donatj-0	0.002s	0.01s	0.051s	534.52 K
endorphin-3	0.006s	0.326s	0.051s	557.28 K
jenssegers-agent-2	0.014s	0.892s	0.052s	871.7 K
php-get-browser	0.005s	108.675s	1.638s	428.3 K
piwik-device-detector-3	0.08s	0.892s	0.057s	3.31 M
sinergi-6	0.002s	0.001s	0.049s	529.38 K
ua-parser-php-3	0.027s	0.388s	0.051s	1.65 M
uaparser-js-0	0.011s	0.042s	0.1s	7.64 M
whichbrowser-js-0	0.233s	0.567s	0.1s	30.06 M
whichbrowser-php-2	0.025s	0.403s	0.057s	17.69 M
woothee-js-1	0.007s	0.009s	0.092s	6.27 M
woothee-php-1	0.008s	0.007s	0.05s	804.58 K
zsxsoft-php-1	0.005s	0.066s	0.051s	970.41 K

Windows 10 system requirements

Enabling sqlite php extension.

Enabling wget command.

How To Use

Clone this repo, then install composer dependencies (if composer isn't installed on your system, reference http://getcomposer.org):

composer install

Prepare the parsers and test suites:

./bin/prepare

This will initialize each of the parsers and test suites. This may require your system to have certain commands accessible on the path, like composer or npm. This process will need to be improved to be more portable.

Run a comparison:

./bin/console compare

This will prompt you to select which Test Suite to use, as well as which parser you want to test against (multiples of both can be selected). After the tests are performed, a normalization step takes place where the values of the different properties will be normalized to make comparison more relevant (special characters and spaces replaced, lower-cased, various mappings performed). After that, the results can be analyzed.

Perform a benchmark:

./bin/console benchmark path/to/file --iterations=5

This will prompt you to select which parsers to benchmark, then run the parsers against the specified file (needs to contain the useragents to parse, separated by new lines). After the benchmark is performed, a table like the one above will be output showing the time taken for each parser, as well as the memory consumed (reported by the parser script itself).

Parse some useragents:

./bin/console parse path/to/file

Parses the user-agents in the specified file and outputs a table showing the agent and the parsed properties. Can dump this output to CSV format.

How Parsers are Compared

Accuracy

Every parser extracts different pieces of data and formats/names the data differently, so comparing the output of one parser to another isn't an easy thing to do. Obviously you should pick a parser that extracts the data that you need, not the one that scores the best here. That said, we did want a way to compare the relative accuracy of a parser when it's run against useragents for which the properties are known. In order to do this across different parsers of varying "complete-ness" we selected just a handful of properties that are shared across most parsers we've looked at so far (Browser Name, Browser Version, Platform/OS Name, Platform/OS Version, Device Name, Device Brand, Device Type, Device "Is Mobile?" Flag). This list is subject to change, and perhaps even something that we could make configurable for any given comparison.

Each of the property values are then normalized, mainly by lowercasing the value and stripping out special characters and spaces ("google-bot" and "Google Bot" would become just "googlebot"). Unfortunately that's not enough, as parsers will name some of the common values differently, so there's a mapping stage of normalization. This handles cases like "IE" and "Internet Explorer" referencing the same value. This is done at the "data source" level, meaning, where the parser gets its useragent data from (some parsers use a general data source, like the data from the Browscap project, others use their own). While this handles a lot of data mismatches, it doesn't deal with all of it, especially when we look at Device Name parsing.

For this reason, the report you see above was broken into different sections, showing how well a parser interpreted the Browser, Platform/OS and Device values. Typically the Device section is the lowest accuracy for a parser (except against its own test suite) due to the inconsistencies in naming devices and because it contains four properties (that must match) instead of just two.

The "Accuracy Score"

To account for parsers (and test suites) that don't contain certain pieces of data, we award a "point" to a parser when it correctly identifies a property, but don't penalize the parser if it (as a whole) doesn't provide a property that the test suite it is run against does. For example, the Woothee parser doesn't contain any device data except for the "type", so for the Device section it's only scored on whether or not it parsed the "type" property correctly. On the flip side of that, parsers aren't penalized for providing data that a test suite doesn't test for.

Benchmarking

The benchmark command doesn't do anything special other than run a parser against a list of useragents a specified number of times. The timing information is collected by the script that runs the parser and the benchmark script, just in case there is a lot of time spent in application startup that can't be measured by the parser script itself (an example of this is PHP's native get_browser function. When PHP is configured to load a browscap.ini file, it reads the entire file at PHP startup, which can't be measured from within a PHP script). We've broken out the times so it's easy to tell where time is spent for any given parser. This includes an "initialization time" cost and the actual "parse time" for each useragent parsed. Any other costs that the runner script imposes that aren't measured as part of the other two times are measured in the "extra time" column.

The Parsers

The parsers can be any useragent parser from any language (web API based parsers are possible, though rate limits and network traffic should be considered for some of the larger test suites). All this application needs from a parser is an executable script named "parse" that accepts a filename as the first parameter (which contains the useragents to parse, one on each line of the file) and returns the parsed data in the following JSON format:

{
    "results": [
        {
            "useragent": "Mozilla/5.0 (iPad; U; CPU OS 3_2_1 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Mobile/7B405",
            "parsed": {
                "browser": {
                    "name": "Mobile Safari UIWebView",
                    "version": "0.0"
                },
                "platform": {
                    "name": "iOS",
                    "version": "unknown"
                },
                "device": {
                    "name": "iPad",
                    "brand": "Apple Inc",
                    "type": "Tablet",
                    "ismobile": "true"
                }
            },
            "time": 0.003
        }
    ],
    "parse_time": 0.105,
    "init_time": 0.003,
    "memory_used": 3456745
}

Currently this is the list of parsers included:

Javascript

3rd-Eden Useragent 2.x (https://github.com/3rd-Eden/useragent)
Browscap JS 1.x (https://github.com/mimmi20/browscap-js/tree/v1)
Browscap JS 2.x (https://github.com/mimmi20/browscap-js)
Device Detector for Nodejs 0.x (https://github.com/MiGatoSeneca/device-detector-node)
UAParser.js (https://github.com/faisalman/ua-parser-js)
WhichBrowser JS 0.x (https://github.com/WhichBrowser/Parser-JavaScript)
Woothee JS 1.x (https://github.com/woothee/woothee-js)

PHP

Browscap PHP 2.x (https://github.com/browscap/browscap-php/tree/2.x)
Browscap PHP 3.x (Full, Standard and Lite) (https://github.com/browscap/browscap-php/tree/3.1.0)
Browscap PHP 4.x (Full) (https://github.com/browscap/browscap-php)
Crossjoin Browscap 1.x (https://github.com/crossjoin/Browscap/tree/1.x)
Crossjoin Browscap 2.x (https://github.com/crossjoin/Browscap/tree/2.x)
Crossjoin Browscap 3.x (PHP7 only) (https://github.com/crossjoin/Browscap/tree/3.x)
BrowserDetector 5.x (https://github.com/mimmi20/BrowserDetector)
Donatj UserAgent Parser 0.x (https://github.com/donatj/PhpUserAgent)
Endorphin Browser Detector 3.x (https://github.com/endorphin-studio/browser-detector)
Jenssegers Agent 2.x (https://github.com/jenssegers/agent)
PHP's Native get_browser (https://secure.php.net/get_browser) - It is strongly recommended that this parser isn't used for large useragent lists unless you're on PHP 7.0.15/7.1.1 or later. It is much too slow otherwise.
Piwik Device Detector 3.x (https://github.com/piwik/device-detector)
Sinergi Browser Detector 6.x (https://github.com/sinergi/php-browser-detector)
UA Parser PHP 3.x (https://github.com/ua-parser/uap-php)
WhichBrowser 2.x (https://github.com/WhichBrowser/Parser)
Wolfcast BrowserDetection 2.x (https://github.com/Wolfcast/BrowserDetection)
Woothee PHP 1.x (https://github.com/woothee/woothee-php)
Yzalis UA-Parser 0.x (https://github.com/yzalis/UAParser)
ZSXSoft PHP-UserAgent 1.x (https://github.com/zsxsoft/php-useragent)

The Test Suites

Like the parsers, there is no language requirement for a test suite to be included here. The only requirement is that an executable script named build can be called and returns JSON in the following format:

{
    "Mozilla/5.0 (iPad; U; CPU OS 3_2_1 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Mobile/7B405": {
        "browser": {
            "name": "Mobile Safari UIWebView",
            "version": "0.0"
        },
        "platform": {
            "name": "iOS",
            "version": "unknown"
        },
        "device": {
            "name": "iPad",
            "brand": "Apple Inc",
            "type": "Tablet",
            "ismobile": "true"
        }
    }
}

Due to the nature of this format, it is a requirement that any given unique useragent has only one set of parsed fields. If a test suite contains different sets of values for a single useragent (across different files, or because other headers are expected to be present), then these either need to be excluded or merged together.

These are the test suites that are currently included:

Browscap (https://github.com/browscap/browscap)
Donatj (https://github.com/donatj/PhpUserAgent)
Endorphin (https://github.com/endorphin-studio/browser-detector)
Piwik Device Detector (https://github.com/piwik/device-detector)
Sinergi (https://github.com/sinergi/php-browser-detector)
UA Parser (https://github.com/ua-parser/uap-core)
UAParser.js (https://github.com/faisalman/ua-parser-js)
WhichBrowser (https://github.com/WhichBrowser/Parser)
Woothee (https://github.com/woothee/woothee)
ZSXSoft (https://github.com/zsxsoft/php-useragent)

Name		Name	Last commit message	Last commit date
Latest commit History 188 Commits
bin		bin
data/test-runs		data/test-runs
files		files
mappings		mappings
parsers		parsers
src/Command		src/Command
tests		tests
.gitattributes		.gitattributes
.gitignore		.gitignore
.php_cs		.php_cs
.prettierignore		.prettierignore
.prettierrc.json		.prettierrc.json
.travis.yml		.travis.yml
LICENSE		LICENSE
README.md		README.md
composer.json		composer.json
composer.lock		composer.lock
package.json		package.json
phpstan.neon		phpstan.neon
psalm-baseline.xml		psalm-baseline.xml
psalm.xml		psalm.xml
yarn.lock		yarn.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

User Agent Parser Comparison

Latest "Accuracy" Results (2018-06-08)

Latest Benchmark Results

VMWare ESX Virtual Server

MacBook Pro

Windows 10 system requirements

How To Use

Run a comparison:

Perform a benchmark:

Parse some useragents:

How Parsers are Compared

Accuracy

The "Accuracy Score"

Benchmarking

The Parsers

Javascript

PHP

The Test Suites

About

Releases

Packages

Contributors 5

Languages

License

diablomedia/useragent-parser-comparison

Folders and files

Latest commit

History

Repository files navigation

User Agent Parser Comparison

Latest "Accuracy" Results (2018-06-08)

Latest Benchmark Results

VMWare ESX Virtual Server

MacBook Pro

Windows 10 system requirements

How To Use

Run a comparison:

Perform a benchmark:

Parse some useragents:

How Parsers are Compared

Accuracy

The "Accuracy Score"

Benchmarking

The Parsers

Javascript

PHP

The Test Suites

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 5

Languages

Packages