With all the recent talk about fake SSL certs issued by root-level Certificate Authorities at Comodo and DigiNotar and so forth, I thought it’d be interesting to run a little experiment. One thing that these compromises have highlighted is the huge number of root certificate authorities in modern operating systems and browsers. But how many of those are actually in use? How many sites that I visit are certified by each of the roots?

On my OS X Lion install, the Keychain app tells me that I have 175 different roots installed. Some of them are pretty obvious (Verisign, etc.), but some are, well, probably NEVER going to be used by me, ever. Some have suggested that people simply delete root certs for various foreign entities (like CNNIC). But blindly deleting roots could mean that many sites you regularly visit suddenly stop validating.

So I wrote a simple Python script. (I briefly considered writing it in Perl, just to annoy an office troll, but sanity prevailed.)

What this does is go through my Google Chrome “History” file. That’s a SQLite3 file, located in ~/Library/Application\ Support/Google/Chrome/Default/. It finds all URLs that start with “https://” and calls the command-line OpenSSL utility to show the certificate chain retrieved when contacting that site. Then it sorts the results and outputs the hosts grouped by CA.
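
For the curious, here is a minimal sketch of those steps. It is not the actual findroots.py, just an illustration under a few assumptions: that the History database exposes a "urls" table with a "url" column, that the openssl binary is on the PATH, and that the last "i:" line in the "Certificate chain" section of the s_client output names the topmost issuer. The function names are made up for this example.

```python
import sqlite3
import subprocess
from urllib.parse import urlsplit


def https_hosts_from_chrome(history_path):
    """Return the unique hostnames of https:// URLs in a Chrome History file."""
    conn = sqlite3.connect(history_path)
    try:
        # Assumption: the history lives in a table named "urls", column "url".
        rows = conn.execute(
            "SELECT url FROM urls WHERE url LIKE 'https://%'"
        ).fetchall()
    finally:
        conn.close()
    hosts = []
    for (url,) in rows:
        host = urlsplit(url).hostname
        if host and host not in hosts:
            hosts.append(host)
    return hosts


def fetch_chain_text(host, port=443, timeout=10):
    """Run `openssl s_client` against host and return whatever it printed."""
    result = subprocess.run(
        ["openssl", "s_client", "-connect", f"{host}:{port}",
         "-servername", host],
        input="",            # empty stdin so s_client exits after the handshake
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout


def top_issuer(s_client_output):
    """Return the issuer of the last certificate in the printed chain."""
    issuer = None
    in_chain = False
    for line in s_client_output.splitlines():
        stripped = line.strip()
        if stripped == "Certificate chain":
            in_chain = True
        elif in_chain and stripped == "---":
            break            # end of the chain section
        elif in_chain and stripped.startswith("i:"):
            issuer = stripped[2:].strip()   # keep overwriting; last one wins
    return issuer
```

Calling top_issuer(fetch_chain_text(host)) for each host returned by https_hosts_from_chrome() gives one (host, issuer) pair per site. Two caveats: the History file may be locked while Chrome is running, so working on a copy is safer, and a server is not required to send its full chain, so the last issuer printed is only an approximation of the root the browser actually trusts.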

Here’s a snippet of the output, run against my history:

Root CAs Used by Hosts in Cache File
/C=US/O=Entrust.net/OU=www.entrust.net/CPS incorp. by ref. (limits liab.)/OU=(c) 1999 Entrust.net Limited/CN=Entrust.net Secure Server Certification Authority
    tools.usps.com
    www.facebook.com
    www.icloud.com
    donotcall.gov
    complaints.donotcall.gov
/C=US/O=Equifax/OU=Equifax Secure Certificate Authority
    docs.google.com
    accounts.youtube.com
    plus.google.com
    picasaweb.google.com
    encrypted.google.com
    chrome.google.com
    adwords.google.com
    bitbucket.org
    gitorious.org
    sites.google.com
/C=US/O=GTE Corporation/OU=GTE CyberTrust Solutions, Inc./CN=GTE CyberTrust Global Root
    discussions.apple.com
    www.usps.com
    help.apple.com
    login.yahoo.com
    fbcdn-sphotos-a.akamaihd.net

It’s pretty simple. And the script itself is a pretty ugly hack. But the output of the script, coupled with some simple grep and wc work, tells me that the 93 different SSL sites I’ve visited in the past (however long Chrome keeps history for) were represented by only 20 different root CAs. Another consultant reported 24 CAs over 292 different domains. I also ran the list against the first 250 reachable hosts in Alexa’s “Top Internet Sites” list, and found only 25 different root CAs in use.
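
The grouping and the counting can also be done in Python rather than with grep and wc. The sketch below is only an illustration, assuming the results are already available as (host, issuer) pairs; group_by_root, summarize and format_report are hypothetical names, not functions from the real script.

```python
from collections import defaultdict


def group_by_root(host_issuer_pairs):
    """Map each issuer string to the list of hosts that chained up to it."""
    groups = defaultdict(list)
    for host, issuer in host_issuer_pairs:
        if host not in groups[issuer]:      # skip hosts seen more than once
            groups[issuer].append(host)
    return dict(groups)


def summarize(groups):
    """Return (number of distinct hosts listed, number of distinct issuers)."""
    host_count = sum(len(hosts) for hosts in groups.values())
    return host_count, len(groups)


def format_report(groups):
    """Render the groups as an issuer line followed by indented hosts."""
    lines = ["Root CAs Used by Hosts in Cache File"]
    for issuer in sorted(groups):
        lines.append(issuer)
        for host in groups[issuer]:
            lines.append("    " + host)
    return "\n".join(lines)
```

With something like this, summarize() would return the two numbers quoted above (sites and root CAs) directly, without a second pass over the text output.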

Using the script is simple:

python findroots.py [options]
Options:
  --chrome <filename>
  --firefox <filename>
  --list <filename>

Use --chrome to read a History file from Google Chrome, --firefox to read a places.sqlite file from Mozilla Firefox, and --list to read a plain text file listing of hosts, one host per line.
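
For illustration, the three options could be wired up roughly as follows. This is a sketch rather than the script's real code: the helper names are hypothetical, and the Firefox reader assumes that places.sqlite keeps its URLs in a "moz_places" table with a "url" column.

```python
import argparse
import sqlite3
from urllib.parse import urlsplit


def build_parser():
    """Build a parser for the three input options described above."""
    parser = argparse.ArgumentParser(prog="findroots.py")
    parser.add_argument("--chrome", metavar="<filename>",
                        help="Google Chrome History file")
    parser.add_argument("--firefox", metavar="<filename>",
                        help="Mozilla Firefox places.sqlite file")
    parser.add_argument("--list", dest="hostlist", metavar="<filename>",
                        help="plain text file, one host per line")
    return parser


def hosts_from_list(path):
    """Read hostnames from a text file, skipping blank lines."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def hosts_from_firefox(path):
    """Return sorted unique hostnames of https:// URLs in places.sqlite."""
    conn = sqlite3.connect(path)
    try:
        # Assumption: history URLs are stored in moz_places.url.
        rows = conn.execute(
            "SELECT url FROM moz_places WHERE url LIKE 'https://%'"
        ).fetchall()
    finally:
        conn.close()
    hosts = set()
    for (url,) in rows:
        host = urlsplit(url).hostname
        if host:
            hosts.add(host)
    return sorted(hosts)
```

Storing --list under dest="hostlist" simply avoids an attribute named after the built-in list; the command-line flag itself stays the same.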

Click here to download the script.