Author Archives: Troy Hunt

Authentication and the Have I Been Pwned API

The very first feature I added to Have I Been Pwned after I launched it back in December 2013 was the public API. My thinking at the time was that it would make the data more easily accessible to more people to go and do awesome things; build mobile clients, integrate into security tools and surface more information to more people to enable them to do positive and constructive things with the data. I highlighted 3 really important attributes at the time of launch:

There is no authentication.

There is no rate limiting.

There is no cost.

One of those changed nearly 3 years ago now - I had to add a rate limit. The other 2 are changing today and I want to clearly explain why.

Identifying Abusive API Usage

Let me start with a graph:

[Image: graph of V2 API executions over a 24 hour period]

This is executions of the V2 API that enables you to search an individual email address. There's 1.06M requests in that 24 hour period with 491k of them in the last 4 hours. Even with the rate limit of 1 request every 1,500ms per IP address enforced, that graph shows a very clear influx of requests peaking at 14k per minute. How? Well let's pull the logs from Cloudflare and see:

[Image: log analyser output breaking requests down by ASN]

This is the output of a little log analyser I wrote that breaks requests down by ASN (and other metrics) over the past hour. There were 15,573 requests from AS23969 across 82 unique IP addresses. Have a look at where those IP addresses came from:

[Image: locations of the AS23969 IP addresses]

There is no conceivable way that this is legitimate, organic usage of the API from Thailand. The ASN is owned by TOT Public Company Limited, a local Thai telco that has somehow ended up with a truckload of IP addresses hitting HIBP at just the right rate to not trigger the rate limit. The next top ASN is Biznet Networks in Indonesia. Then Claro in Brazil. After that there's Digital Ocean and then another Indonesian telco, Telkomnet. It makes for a geographical spread that's entirely inconsistent with legitimate usage of genuine consumers (no, HIBP isn't actually big in Iran!):
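To see why a per-IP rate limit doesn't catch this, it helps to run the numbers quoted above. This is only a back-of-the-envelope sketch: the variable names are illustrative and the figures are the ones from the graph and the log output.

```python
# Figures quoted in the post.
RATE_LIMIT_INTERVAL_MS = 1_500      # 1 request every 1,500ms per IP address
asn_requests_last_hour = 15_573     # requests from AS23969 over the past hour
asn_unique_ips = 82                 # unique IP addresses seen in that ASN
peak_requests_per_minute = 14_000   # peak of the graph

# The most a single IP can send without tripping the limit.
max_per_ip_per_minute = 60_000 // RATE_LIMIT_INTERVAL_MS    # 40
max_per_ip_per_hour = max_per_ip_per_minute * 60            # 2,400

# What each IP in that ASN averaged over the hour.
avg_per_ip_per_hour = asn_requests_last_hour / asn_unique_ips
avg_seconds_between_requests = 3_600 / avg_per_ip_per_hour

# Fewest limit-abiding IPs needed to reach the peak (ceiling division).
min_ips_for_peak = -(-peak_requests_per_minute // max_per_ip_per_minute)

print(f"Per-IP ceiling: {max_per_ip_per_hour} requests per hour")
print(f"Average per IP: {avg_per_ip_per_hour:.0f} requests per hour "
      f"(one every {avg_seconds_between_requests:.0f} seconds)")
print(f"IPs needed for the peak: at least {min_ips_for_peak}")
```

Averaged across the hour, each address stays comfortably inside the limit, and a few hundred of them are enough to produce the peak in the graph.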

[Image: geographical spread of API requests by country]

Late last year after seeing a similar pattern with a well-known hosting provider, I reached out to them to try and better understand what was going on. I provided a bunch of IP addresses which they promptly investigated and reported back to me on:

1- All those servers were compromised. They were either running standalone VPSs or cpanel installations.

2- Most of them were running WordPress or Drupal (I think only 2 were not running any of the two).

3- They all had a malicious cron.php running

This helped me understand the source of the problem, but it didn't get me any closer to actually blocking the abusive behaviour. For the sake of transparency, let me talk about how I tried to tackle this because that will help everyone understand why I've arrived at a very different model to what I started with.

Combating Abuse with Firewall Rules

Firewall rules on Cloudflare are amazingly awesome. It takes just a few seconds to have a rule like this in place:

[Image: Cloudflare firewall rule rate limiting API requests]

Make more than 40 requests in a minute and you're in the naughty corner for a day. Only thing is, that's IP-based and per the earlier section on abusive patterns, actors with large numbers of IP addresses can largely circumvent this approach. It's still a fantastic turn-key solution that seriously raises the bar for anyone wanting to get around it, but someone determined enough will find a way.
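The logic of that rule can be modelled in a few lines of Python. This is a conceptual sketch only, not how Cloudflare implements it: the `NaughtyCorner` class, its parameter names and the injectable `clock` are all hypothetical, and only the thresholds (40 requests, one minute, one day) come from the rule described above.

```python
import time
from collections import defaultdict, deque


class NaughtyCorner:
    """Block a client for a fixed period once it exceeds a per-minute budget."""

    def __init__(self, max_requests=40, window_seconds=60,
                 block_seconds=24 * 60 * 60, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.block_seconds = block_seconds
        self._clock = clock
        self._recent = defaultdict(deque)   # client id -> recent request times
        self._blocked_until = {}            # client id -> time the block ends

    def allow(self, client_id):
        now = self._clock()
        if self._blocked_until.get(client_id, float("-inf")) > now:
            return False

        # Forget requests that have fallen out of the window.
        recent = self._recent[client_id]
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()

        recent.append(now)
        if len(recent) > self.max_requests:
            # More than the budget inside one window: block for a day.
            self._blocked_until[client_id] = now + self.block_seconds
            recent.clear()
            return False
        return True


corner = NaughtyCorner()
results = [corner.allow("203.0.113.7") for _ in range(41)]
print(results.count(True), "allowed,", results.count(False), "blocked")
print("Different IP still allowed:", corner.allow("198.51.100.9"))
```

Because the state is keyed on the client's IP address, the last line is the weakness in a nutshell: a second address starts with a clean slate.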

No problems, I'll just take abusive ASNs like the Thai one above and give them the boot. I scripted a lot of them based on patterns in the log files and created a firewall rule like this:

[Image: Cloudflare firewall rule blocking abusive ASNs]

That works pretty quickly and is very effective, except for the fact that there's an awful lot of ASNs out there being abused. Plus, it has side-effects I'll come back to shortly too.

So how about looking at user agent strings instead? I mean I could always just block the ones bad actors are using, except that was never going to work particularly well for obvious reasons (you can always define whatever one you like). That said, there were a heap of browser UAs which clearly were (almost) never legitimate for a client making API calls. So I blocked these as well:

[Image: Cloudflare firewall rule blocking browser user agents]

That shouldn't have come as a surprise to anyone as the API docs were actually quite clear about this:

The user agent should accurately describe the nature of the API consumer such that it can be clearly identified in the request. Not doing so may result in the request being blocked.

Problem is, people don't read docs and I ended up with a heap of default user agents (such as curl's) which were summarily blocked. And, of course, the user agent requirement was easily circumvented as I expected it would be and I simply started seeing randomised strings in the UA.

Another approach I toyed with (very transiently) was blocking entire countries from accessing the API. I was always really hesitant to do this, but when 90% of the API traffic was suddenly coming from a country in West Africa, for example, that was a pretty quick win.

I'm only writing about this here now because as the new model comes into place, all of this will be redundant. Plus, I wanted to shed some light on the API behaviour some people may have previously seen which they couldn't quite work out, and that brings me to the next section.

The Impact on Legitimate Usage

The attempts described above to block abuse of the API also blocked a lot of good requests. I feel bad about that because it made something I'd always intended to be easily accessible difficult for some people to use. I hope that by explaining the background here, people will understand why the approaches above were taken and indeed, why the changes I'm going to talk about soon were necessary.

I got way too many emails from people about API requests being blocked to respond to. Often this was due to simply not meeting the API requirements, for example providing a descriptive UA string. Other times it was because they were on the same network as abusive users. There were also those who simply smashed through the rate limit too quickly and got themselves banned for a day. Other times, there were genuine API users in that West African country who found themselves unable to use the service. I was constantly balancing the desire to make the API easily accessible whilst simultaneously trying to ensure it wasn't taken advantage of. In the end, the path forward was clear - the API would need to be authenticated.

The New Model: Authenticated Requests

I held back on this for a long time because adding auth to the API adds a barrier to entry. It also adds coding effort on my end as well as management overhead. However, by earlier this year it became clear that this was the only way forward: requests would have to be auth'd. Doing this solves a heap of problems in one fell swoop:

  1. The rate limit could be applied to an API key thus solving the problem of abusive actors with multiple IP addresses
  2. Abuse associated to an IP, ASN, user agent string or country no longer has to impact other requests matching the same pattern
  3. The rate limit can be just that - a limit rather than also dishing out punishment via the 24 hour block

Making an authenticated call is a piece of cake, you just add an hibp-api-key header as follows:

hibp-api-key: [your key]
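As a rough illustration, here's what building such a request might look like in Python using only the standard library. The endpoint path, the `build_breach_request` helper and the example user agent are assumptions for the sketch rather than anything specified in this post, so check the API page for the authoritative details.

```python
import urllib.parse
import urllib.request

# Assumed V3 base path (the version is expressed in the URL).
API_ROOT = "https://haveibeenpwned.com/api/v3"


def build_breach_request(account, api_key, user_agent):
    """Build (but don't send) an authenticated breach search for one account."""
    # The account goes in the path, so it needs to be URL encoded.
    url = f"{API_ROOT}/breachedaccount/{urllib.parse.quote(account, safe='')}"
    return urllib.request.Request(
        url,
        headers={
            "hibp-api-key": api_key,     # the new authentication header
            "user-agent": user_agent,    # should describe the API consumer
        },
    )


request = build_breach_request("test@example.com", "[your key]", "my-breach-checker")
print(request.full_url)
```

Sending it is then a matter of passing `request` to `urllib.request.urlopen` and reading the JSON response.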

However, this wasn't going to completely solve the problem, rather it moved the challenge to the way in which API keys were provisioned. It's no good putting controls around the key itself if a bad actor could just come along and register a heap of them. Anti-automation on the form where a key can be requested is one thing, stopping someone from manually registering, say, 20 of them with different email addresses and massively amplifying their request rate is quite another. I had to raise the bar just high enough to dissuade people from doing this, which brings me to the financial side of things.

There's a US$3.50 per Month Fee to Use the API

Clearly not everyone will be happy with this so let me spend a bit of time here explaining the rationale. This fee is first and foremost to stop abuse of the API. The actors I've seen taking advantage of it are highly unlikely to front up with a credit card and provide what amounts to personally identifiable data (i.e. make a credit card payment) in order to mass enumerate the API.

In choosing the $3.50 figure, I wanted to ensure it was a number that was inconsequential to a legitimate user of the service. That's about what a latte costs at my local coffee shop so spending a few bucks a month to search through billions of records seems like a pretty damn good deal, especially when that rate limit enables 57.6k requests per day.

One thing I want to be crystal clear about here is that the $3.50 fee is in no way an attempt to monetise something I always wanted to provide for free. I hope the explanation above helps people understand that, and also the fact the API has run the last 5 and a half years without any auth whatsoever clearly demonstrates that financial gain has never been the intention. Plus, the service I'm using to implement auth and rate limits comes with a direct cost to me:

[Image: Azure API Management pricing]

This is from the Azure API Management pricing page which is the service I'm using to provision keys and control rate limits (I'll write a more detailed post on this later on - it's kinda awesome). I chose the $3.50 figure because it represents someone making one million calls. Some people will make much less, some much more - that rate limit represents a possible 1.785 million calls per month. Plus, there's still the costs of function executions, storage queries and egress bandwidth to consider, not to mention the slice of the $3.50 that Stripe takes for processing the payment (all charges are routed through them). The point is that the $3.50 number is pretty much bang on the mark for the cost of providing the service.
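The figures above hang together if the per-key rate limit is one request every 1,500ms and a month is taken as 31 days. Both of those are assumptions made for this sketch (the post only quotes the resulting totals), as are the variable names.

```python
RATE_LIMIT_INTERVAL_MS = 1_500        # assumed: 1 request every 1,500ms per key
MS_PER_DAY = 24 * 60 * 60 * 1_000
DAYS_IN_MONTH = 31                    # assumed: longest possible month
FEE_USD = 3.50
CALLS_COVERED_BY_FEE = 1_000_000      # "$3.50 ... represents someone making one million calls"

requests_per_day = MS_PER_DAY // RATE_LIMIT_INTERVAL_MS
requests_per_month = requests_per_day * DAYS_IN_MONTH

# Cost of a key running flat out all month, at $3.50 per million calls.
cost_at_full_rate = requests_per_month / CALLS_COVERED_BY_FEE * FEE_USD

print(f"{requests_per_day:,} requests per day")
print(f"{requests_per_month:,} requests per month")
print(f"${cost_at_full_rate:.2f} at the per-million rate if fully used")
```

That reproduces the 57.6k per day and roughly 1.785 million per month quoted above, and shows that a key used at the full rate would cost more to serve than the fee brings in.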

What this change does is simultaneously give me a much higher degree of confidence the API will be used in an ethical fashion whilst also ensuring that those who use it have a much more predictable experience without me dipping deeper and deeper into my own pocket.

The API is Revving to Version 3 (and Has Some Breaking Changes)

With this change, I'm revising the API up to version 3. All documentation on the API page now reflects that and also reflects a few breaking changes, the first of which is obviously the requirement for auth. When using V3, any unauthenticated requests will result in an HTTP 401.

The second breaking change relates to how the versioning is done. Back in 2014, I wrote about how your API versioning is wrong and headlined it with this graphic:

[Image: graphic from the 2014 API versioning post]

I outlined 3 different possible ways of expressing the desired version in API calls, each with their own technical and philosophical pros and cons:

  1. Via the URL
  2. Via a custom request header
  3. Via the accept header

After 4 and a bit years, by far and away the most popular method with an uptake of more than 90% is versioning via the URL. So that's all V3 supports. I don't care about the philosophical arguments to the contrary, I care about working software and in this case, the people have well and truly spoken. I don't want to have to maintain code and provide support for something people barely use when there's a perfectly viable alternative.

Next, I'm inverting the condition expressed in the "truncateResponse" query string. Previously, a call such as this would return all meta data for a breach:


You'd end up with not just the name of the breach, but also how many records were in it, all the impacted data classes, a big long description and a whole bunch of other largely redundant information. I say "redundant" because if you're hitting the API over and over again, you're pulling back the same info for each account that appears in the same breach. Using the "truncateResponse" parameter reduced the response size by 98% but because it wasn't the default, it wasn't used that much. I want to drive the adoption of small responses because not only are they faster for the consumer, they also reduce my bandwidth bill which is one of the most expensive components of HIBP. You can still pull back all the data for each breach if you'd like, you just need to pass "truncateResponse=false" as true is now the default. (Just a note on that: you're far better off making a single call to get all breached sites in the system then referencing that collection by breach name after querying an individual email address.)
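The "single call then look up by name" approach might be sketched like this in Python. The endpoint path, the `Name` field and the sample data are assumptions for illustration (the breach metadata below is made up), and the two helper functions are hypothetical.

```python
import urllib.parse

# Assumed V3 base path.
API_ROOT = "https://haveibeenpwned.com/api/v3"


def breached_account_url(account, truncate_response=True):
    """URL for a breach search; truncated responses are now the default."""
    url = f"{API_ROOT}/breachedaccount/{urllib.parse.quote(account, safe='')}"
    if not truncate_response:
        url += "?truncateResponse=false"   # opt back in to the full metadata
    return url


def expand_breaches(truncated_results, all_breaches):
    """Join truncated results (names only) against a cached list of all breaches."""
    by_name = {breach["Name"]: breach for breach in all_breaches}
    return [by_name[item["Name"]] for item in truncated_results
            if item["Name"] in by_name]


# Made-up data: fetched once from the "all breaches" API and cached locally.
all_breaches = [
    {"Name": "ExampleSiteA", "PwnCount": 1000, "Description": "Hypothetical."},
    {"Name": "ExampleSiteB", "PwnCount": 2500, "Description": "Hypothetical."},
]
# Made-up truncated response for a single account.
truncated = [{"Name": "ExampleSiteB"}]

print(breached_account_url("test@example.com"))
print(expand_breaches(truncated, all_breaches))
```

Each per-account call then stays tiny, and the bulky metadata only crosses the wire once.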

I'm also inverting the "includeUnverified" parameter. The original logic for this was that when I launched the concept of unverified breaches, I didn't want existing consumers of the API to suddenly start getting results for breaches which may not be real. However, with the passage of time I've come across a couple of issues with this and the first is that a heap of people consumed the API with the default params (which wouldn't include unverified breaches) and then contacted me asking "why does the API return different results to the front page of HIBP?" The other issue is that I simply haven't flagged very many breaches as unverified and I've also added other classes of breach which deviate from the classic model of loading a single incident clearly attributable to a single site such as the original Adobe breach. There are now spam lists, for example, as well as credential stuffing lists and returning all data by default is much more consistent with the ethos of considering all breached data to be in scope.

The other major thing related to breaking stuff is this:

Versions 1 and 2 of the API for searching breaches and pastes by email address will be disabled in 4 weeks from today on August 18.

I have to do this on an aggressive time frame. Whilst I don't, all the problems mentioned above with abuse of the API continue. When we hit that August due date, the APIs will begin returning HTTP 400 "Bad Request" and that will be the end of them.

One important distinction: this doesn't apply to the APIs that don't pull back information about an email address; the API listing all breaches in the system, for example, is not impacted by any of the changes outlined here. It can be requested with version 3 in the path, but also with previous versions of the API. Because it returns generic, non-personal data it doesn't need to be protected in the same fashion (plus it's really aggressively cached at Cloudflare). Same too for Pwned Passwords - there's absolutely zero impact on that service.

During the next 4 weeks I'll also be getting more aggressive with locking down firewall rules on the previous versions at the first sign of misuse until they're discontinued entirely. There's an easy fix if you're blocked with V2 - get an API key and roll over to V3. Now, about that key...

Protecting the API Key (and How My Problem Becomes Your Problem)

Now that API keys are a thing, let me touch briefly on some of the implications of this as it relates to those of you who've built apps on top of HIBP. And just for context, have a look at the API consumers page to get a sense of the breadth we're talking about; I'll draw some examples out of there.

For code bases such as Brad Dial's Pwny Corral, it's just a matter of adding the hibp-api-key header and a configuration for the key. Users of the script will need to go through the enrolment process to get their own key then they're good to go.

In a case like What's My IP Address' Data Breach Check, we're talking about a website with a search feature that hits their endpoint and then they call HIBP on the server side. The HIBP API key will sit privately on their end and the only thing they'll really need to do is stop people from hammering their service so it doesn't exceed the HIBP rate limit for that key. This is where it becomes their (your) problem rather than mine and that's particularly apparent in the next scenario...

Rich client apps designed for consumer usage such as Outer Corner's Secrets app will need to proxy API hits through their own service. You don't want to push the HIBP API key out with the installer plus you also need to be able to control the rate limit of all your customers so that it doesn't make the service unavailable for others (i.e. one user of Secrets smashes through the rate limit thus making the service unavailable for others).

One last thing on the rate limit: because it's no longer locking you out for a day if exceeded, making too many requests results in a very temporary lack of service (usually single digit seconds). If you're consuming the new auth'd API, handle HTTP 429 responses from HIBP gracefully and ask the user to try again momentarily. Now, with that said, let me give you the code to make it dead easy to both proxy those requests and control the rate at which your subscribers hit the service; here's how to do it with Cloudflare workers and rate limits:
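Before getting to the proxy, the "handle HTTP 429 gracefully" advice for consumers can be sketched in Python. The `fetch_with_retry` helper and its parameters are hypothetical, and reading a `Retry-After` header from the 429 response is an assumption about what the service sends; if the header is missing, the sketch falls back to a short default wait.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retry(request, send=urllib.request.urlopen, sleep=time.sleep,
                     max_attempts=3, default_wait_seconds=2.0):
    """Send a request, pausing briefly and retrying if the rate limit is hit."""
    for attempt in range(1, max_attempts + 1):
        try:
            with send(request) as response:
                return response.read()
        except urllib.error.HTTPError as error:
            # Only a 429 is worth retrying; a 401 (missing or bad key) is not.
            if error.code != 429 or attempt == max_attempts:
                raise
            retry_after = error.headers.get("Retry-After") if error.headers else None
            try:
                wait = float(retry_after)
            except (TypeError, ValueError):
                wait = default_wait_seconds
            sleep(wait)
```

In an interactive app, the `sleep` call is where you'd instead tell the user to try again in a moment.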

Proxying With a Cloudflare Worker (and Setting Rate Limits)

The fastest way to get up and running with proxying requests to V3 of the HIBP API is with a Cloudflare Worker. This is "serverless code on the edge" or in other words, script that runs on Cloudflare's 180 edge nodes around the world such that when someone makes a request for a particular route, the script kicks in and executes. It's easiest just to have a read of the code below:

Stand up a domain on Cloudflare's free tier (if you're not on there already) then it's $5 per month to send 10M queries through your worker which is obviously way more than you can send to the HIBP API anyway. And while you're there, go and use the firewall rules to lock down a rate limit so your own API isn't hammered too much (keeping in mind some of the challenges I faced when doing this).

The point is that if you need to protect the API key and proxy requests, it's dead simple to do.
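Whatever platform the proxy runs on, the core idea is the same: the key stays on the server and one shared throttle decides whether a request may go upstream. Here's a platform-neutral Python sketch of that idea, which is not the Cloudflare Worker itself. The `UpstreamThrottle` class, `handle_search` function and `call_hibp` callable are hypothetical, and the 1.5 second interval assumes the key's limit is one request every 1,500ms.

```python
import math
import time


class UpstreamThrottle:
    """Allow at most one upstream call per interval, shared by every user."""

    def __init__(self, min_interval_seconds=1.5, clock=time.monotonic):
        self.min_interval_seconds = min_interval_seconds
        self._clock = clock
        self._next_allowed = float("-inf")

    def try_acquire(self):
        """Return 0.0 if a call may go out now, otherwise the seconds to wait."""
        now = self._clock()
        if now >= self._next_allowed:
            self._next_allowed = now + self.min_interval_seconds
            return 0.0
        return self._next_allowed - now


def handle_search(account, throttle, call_hibp):
    """Proxy one search; call_hibp adds the API key on the server side."""
    wait = throttle.try_acquire()
    if wait > 0:
        # Tell our own client to back off instead of burning the HIBP quota.
        return 429, {"retry-after": str(math.ceil(wait))}, b""
    return 200, {}, call_hibp(account)
```

A real proxy would also want a per-user limit in front of this so one customer can't starve the rest, which is the scenario described for rich client apps above.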

"But what if you just..."

I'll get a gazillion suggestions of how I could do this differently. Every single time I talk about the mechanics of how I've built something I always do! The model described in this blog post is the best balance of a whole bunch of different factors; the sustainability of the service, the desire to limit abuse, leveraging the areas my skills lie in, the limited availability of my time and so on and so forth. There are many other factors that also aren't obvious so as much as suggestions for improvements are very welcomed, please keep in mind that they may not work in the broader sense of what's required to run this project.


There's a couple of caveats and they're largely due to me trying to make sure I get this feature out as early as possible and continue to run things on a shoestring cost wise. Firstly, there's no guarantee of support. We do the same thing with entry-level Report URI pricing and it's simply because it's enormously hard to do with the time constraints of a single person running this. That said, if anything is buggy or broken I definitely want to know about it. Secondly, there's no way to retrieve or rotate the API key. If you extend the one-off subscription you'll get the same key back or if you cancel an existing subscription and take a new one you'll also get the same key. I'll build out better functionality around this in the future.

I'm sure there'll be others that pop up and I'll expand on the items above if I've missed any here.


The changes I've outlined here strike a balance between making the API available for good purposes, making it harder to use for bad purposes, ensuring stability for all those in the former category and crucially, making it sustainable for me to operate. That last point in particular is critical for me both in terms of reducing abuse and reducing the overhead on me trying to achieve that objective and supporting those who ran into the previously mentioned blocks.

I expect there'll be many requests to change or evolve this model; other payment types, no payment at all for certain individuals or organisations, higher rate limits and so on and so forth. At this stage, my focus is on keeping the service sustainable as Project Svalbard marches forward and once that comes to fruition, I'll be in a much better position to revisit suggestions (also, there's a UserVoice for that). For now, I hope that this change leads to a much more sustainable service for everyone.

Weekly Update 147

So "Plan A" was to publish Pwned Passwords V5 on Tuesday but a last-minute check showed control characters had snuck in due to the quality (or lack thereof) of the source data. Scratch that and go to "Plan B" which was to push them out today but a last-minute check showed that my "improved" export script had screwed up the encoding and every single hash was wrong. "Plan C" is now to push them out on the weekend with everything working correctly. Hopefully. If I don't screw anything up again...

The constant challenge I've faced over the last few years is the massive amount of multi-tasking required to do all the things I'm presently doing. I touched on this in my Project Svalbard blog post and it goes a long way to explaining why HIBP needs to grow up into a larger organisation. I quite literally need people to remove the horizontal tabs and get the encoding right; it's such a simple thing but it's so easy to screw up when you're stretched too thin.

Enough about that, this week I'm also talking about Scott's upcoming public Glasgow workshop, more data breaches, Namecheap's faux pas and EVE Online's great security work they've very generously shared publicly.


  1. Scott will be running my Hack Yourself First workshop in Glasgow next week (this is the last stop on the UK tour, get in while you still can!)
  2. Someone also created a website dedicated to him (seems legit!)
  3. The Zhenai breach from 2011 added another 5M records to HIBP (I'm still working through a ridiculously long backlog of breaches...)
  4. I called Namecheap to account for a very misleading post on SSL (to their credit, they've now pulled the piece)
  5. EVE Online published some great material on how they're doing their security things (it's not just the practices I think are great, it's the fact that they're happy to talk about them publicly so that other companies can benefit too)
  6. Shape Security is sponsoring my blog this week (Captcha is no longer enough, they're talking about how Shape Connect blocks automation & improves security instantly, with a 30 minute implementation)

Pwned Passwords, Version 5

Almost 2 years ago to the day, I wrote about Passwords Evolved: Authentication Guidance for the Modern Era. This wasn't so much an original work on my behalf as it was a consolidation of advice from the likes of NIST, the NCSC and Microsoft about how we should be doing authentication today. I love that piece because so much of it flies in the face of traditional thinking about passwords, for example:

  1. Don't impose composition rules (upper case, lower case, numbers, etc)
  2. Don't mandate password rotation (enforced changing of it every few months)
  3. Never implement password hints

And of most relevance to the discussion here today, don't allow people to use passwords that have already been exposed in a data breach. Shortly after that blog post I launched Pwned Passwords with 306M passwords from previous breach corpuses. I made the data downloadable and also made it searchable via an API, except there are obvious issues with enabling someone to send passwords to me even if they're hashed as they were in that first instance. Fast forward to Feb last year and with Cloudflare's help, I launched Pwned Passwords version 2 with a k-anonymity model. The data was all still downloadable if you wanted to run the whole thing offline, but k-anonymity also gave people the ability to hit the API without disclosing the original password. Subsequent updates to the corpus of breached passwords saw versions 3 and 4 arrive as more passwords flowed in from new breaches whilst the system also continued to grow and grow:

Today, after another 6 months of collecting passwords, I'm releasing version 5 of the service. During this time I collected 65M passwords from breaches where they were made available in plain text (I don't crack passwords for this service). Due to Pwned Passwords already having 551M records as of V4, increasingly new corpuses of passwords are actually adding very few new ones so V5 contributes an additional... 3,768,890 passwords. That may not seem like a lot in comparison, but by virtue of an entire half year passing I wanted to get the existing public set updated to the current numbers. It doesn't just add new ones though, those 65M occurrences all contribute to the existing prevalence counts for passwords that have been seen before.

New passwords include such strings as "Mynoob" (seen 1,208 times), "Find_pass" (303 times) and "guns and robots" (134 times). There's often biases in password distribution due to the sources they're obtained from, for example the prevalence of the service's name or other attributes or relationships to the breached site.

The entire 555,278,657 passwords are now available for download if you're running the service offline. If you're using the k-anonymity API then there's nothing more to do - I've already flushed cache at Cloudflare so you're now getting the latest and greatest set of bad passwords. If you want to be sure you're getting the latest data via the API, check the "last-modified" response header has a July date rather than a January date.

And just while I'm here talking about updates to the corpus of Pwned Passwords, I'm really conscious that releases are happening on a half-yearly cadence which means a bunch of new passwords sit on my side for months before anyone can start black-listing them. This is one of the things that's high on my post-Project Svalbard list; I'd love to see a constant firehose of new passwords being integrated into this service. Not six-monthly, not monthly and frankly, not even weekly - I want to see passwords in there as soon as I get them. The shorter the period between a breached password entering circulation and it appearing in Pwned Passwords, the more impact the service can have on the scourge of credential stuffing. Stay tuned!

As time has passed and more organisations have implemented the service, there's been some really fantastic implementations come out of the community. I wrote about a bunch of them last year in my post on Pwned Passwords in Practice, but it's the work they've done at EVE Online that really stands out:

Obviously these are all some of my favourite things (HIBP, 1Password and Report URI), but it's the improvements made to the user selection of passwords that makes me particularly happy:

When we first implemented the check, about 19% of logins were greeted with the message that their password was not safe enough. Today, this has dropped down to around 11-12% and hopefully will continue to go down.

That's a massive drop that has a profoundly positive impact not just on the individuals using EVE Online, but to the company itself too. Account takeover attacks are a massive problem on the web today and if you reduce the proportion of customers using known bad passwords by up to 42%, you make a direct impact on the cost the organisation has to bear when dealing with the problem.
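The "up to 42%" figure is the relative drop from 19% to the low end of the 11-12% range in EVE Online's quote. A quick check of the arithmetic, with illustrative variable names:

```python
before = 19.0                         # % of logins flagged when the check launched
after_low, after_high = 11.0, 12.0    # % flagged today ("around 11-12%")

best_case_drop = (before - after_low) / before * 100
worst_case_drop = (before - after_high) / before * 100

print(f"Relative reduction: {worst_case_drop:.1f}% to {best_case_drop:.1f}%")
```

So the proportion of logins using a known bad password fell by somewhere between roughly 37% and 42%.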

The NTLM hashes have been really well-received too as they've allowed organisations to quickly check the proportion of their Active Directory users with known bad passwords. Consistently, I'm hearing the results of this exercise are... alarming:

I've been really happy to see a bunch of community offerings appear around the NTLM hashes in particular. Most notable is this one by Ryan Newington:

What's great about this work is that not only can it stop people from making bad password choices in the first place, you'll see there's a reference towards the bottom that'll allow you to run it against your entire set of AD users on demand. And just like Pwned Passwords itself, it's 100% free and you can go and grab it all right now.

So that's Pwned Passwords V5 now live. Implement the k-anonymity API with a few lines of code or if you want to run it all offline, download the data directly. Either way, take it and do awesome things with it!
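For a sense of what "a few lines of code" looks like, here's a minimal Python sketch of a k-anonymity lookup. It assumes the range endpoint at api.pwnedpasswords.com takes the first 5 characters of the SHA-1 hash and returns lines of `SUFFIX:COUNT`; that's my understanding of the public API rather than something spelled out in this post, and the function names and user agent string are hypothetical.

```python
import hashlib
import urllib.request

# Assumed range endpoint; only the 5 character hash prefix is ever sent to it.
RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_hash(password):
    """SHA-1 the password and split it into (5 char prefix, 35 char suffix)."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(suffix, range_body):
    """Find our suffix in a 'SUFFIX:COUNT' per line response body."""
    for line in range_body.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count)
    return 0


def fetch_range(prefix):
    """Download every suffix sharing this prefix (network call)."""
    request = urllib.request.Request(
        RANGE_URL + prefix, headers={"user-agent": "pwned-passwords-sketch"})
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


def pwned_count(password, fetch=fetch_range):
    """How many times the password appears in the corpus (0 if never seen)."""
    prefix, suffix = split_hash(password)
    return count_in_range(suffix, fetch(prefix))
```

The matching happens locally: the service only ever sees the prefix, which is shared by a large number of other hashes, so it can't tell which password was being checked.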

Weekly Update 146

After a very non-stop Cyber Week in Israel, I'm back in Oslo working through the endless emails and other logistics related to Project Svalbard. In my haste this week, I put out a really poorly worded tweet which I've tried to clarify in this week's video. On more positive news, the Austrian government came on board HIBP and my MVP status got renewed for the 9th time. I also wanted to talk this week about some of the stats from HIBP I've been preparing as part of the acquisition. There's a bunch of really interesting numbers in there (for me at least) and rather than just keeping them locked away in an information memorandum, I thought I'd share them with everyone in this week's update.


  1. The Austrian government is now using HIBP to monitor all gov domains across the country (they join the UK, Australia and Spain in utilising this free service)
  2. My MVP status has been renewed, now going into year 9! (this program has been a real defining part of my career)
  3. Shape Security is sponsoring my blog this week (Captcha is no longer enough, they're talking about how Shape Connect blocks automation & improves security instantly, with a 30 minute implementation)

Microsoft MVP Award, Year 9

I've become especially reflective about my career this year as Project Svalbard marches forward and I look back on what it's taken to get here. As I have more discussions around the various turning points in my professional life, there's one that stands out above most others: my first MVP award.

This is not a path I planned, in fact when I originally got that award I referred to myself as The Accidental MVP. But I also think that's the best way to earn any of the awards I've since received; not by setting out with the award as the goal, but rather focusing on the activities for which the award is granted. I wrote a blog people found useful and I continue to do that today. The first award prompted me to start speaking publicly and obviously that's something I continue to do today too. So, before anyone asks "how do I become a Microsoft MVP", there's your answer. That and a pointer to the page on What it takes to be an MVP.

One last thing to add to that and it's the value of community encouragement. There's no way I would have stuck to this path if it wasn't for all the social media engagement, blog comments and conference selfies. It's hard to express just what a massive role encouragement plays in keeping me motivated to do this; knowing that your work is valued is just absolutely essential and I still get a kick out of seeing messages like this from just last week in Israel:

Incidentally, I'm still a Microsoft Regional Director too which runs as a parallel program. I still don't have a region, I still don't direct anything and I still don't get paid by Microsoft. Everyone with me? Good!

Welcoming the Austrian Government to Have I Been Pwned

Early last year, I announced that I was making HIBP data on government domains for the UK and Australia freely accessible to them via searches of their respective TLDs. The Spanish government followed a few months later with each getting unbridled access to search their own domains via an authenticated API. As I explained in that initial post, the rationale was to help the departments tasked with looking after the exposure of their digital assets by unifying search and monitoring capabilities so the task could be performed centrally rather than having the effort replicated over and over again by individual departments. Before this effort, there were hundreds of gov domains being manually monitored by separate departments across those governments - and thousands that weren't monitored at all.

Today, I'm welcoming the Austrian government on-board via their GovCERT department. They now have free access to perform on-demand searches of * (along with a handful of other Austrian gov domains on different TLDs) via API and enrol any of those domains for monitoring which sends them callbacks via a webhook model each and every time one of their email addresses appears in a data breach. I'm sharing this update in conjunction with GovCERT Austria as part of the commitment I made to transparency when on-boarding the first governments.
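To make the webhook model a bit more concrete, here's a minimal sketch of what the receiving end of a callback could look like. The payload shape (a JSON object with `breach` and `emails` fields), the function name and the `example.gov` domain are all assumptions made purely for illustration; none of them come from HIBP's actual callback format:

```python
import json


def parse_breach_callback(body: bytes, monitored_domains: set[str]) -> dict[str, list[str]]:
    """Group the email addresses in a callback by monitored domain.

    Hypothetical payload shape: {"breach": "...", "emails": ["a@b", ...]}.
    """
    payload = json.loads(body)
    by_domain: dict[str, list[str]] = {}
    for email in payload.get("emails", []):
        # Everything after the last "@" is treated as the domain, case-insensitively
        domain = email.rsplit("@", 1)[-1].lower()
        # Match the monitored domain itself or any subdomain of it
        for monitored in monitored_domains:
            if domain == monitored or domain.endswith("." + monitored):
                by_domain.setdefault(monitored, []).append(email)
                break
    return by_domain
```

The point of doing this centrally is that one receiver can fan the results out to whichever department owns each domain, rather than each department running its own monitoring.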

Willkommen GovCERT Austria!

Weekly Update 145

Something totally new this week - Israel! I spent the week in Tel Aviv at Cyber Week, a massive infosec conference where I shared the keynote stage with an amazing array of speakers including many from three letter acronym departments and even PM Benjamin Netanyahu. It's funny how on the one hand an event like this can be so completely different to the very familiar NDC Oslo scene I was in just last week yet by the same token, I'm up there talking about all the same stuff and doing my usual thing.

This week, I'm talking about Israel, the Cyber Week event and how things are tracking with Project Svalbard (spoiler - bloody busy!). I also get a ticket from traffic cops for riding an electric scooter on a footpath so yeah, that's a new one for me...

  1. I spent an afternoon in Jerusalem (link through to my Facebook pics, what an amazing place...)
  2. Plus, the better part of 4 days in Tel Aviv (posted more pics on the way to the airport at stupid o'clock this morning)
  3. TripAdvisor has been resetting a bunch of customers' passwords when found in a data breach (precisely what Scott and I were talking about last week in terms of many other companies proactively using breach data)
  4. strongDM is this week's blog sponsor (Use your SSO to grant or revoke access to any database, server, or k8s)

Weekly Update 144

So first things first - my patience for the Instamics we're wearing just reached zero. One of them recorded and one of them didn't which means we've had to fall back to audio captured by the iPhone I was recording from so apologies it's sub-par. I ended up just uploading the unedited clip direct from the phone because frankly, after trying to recover the non-existent audio both my time and patience were well into the red.

Be that as it may, there's video, audio and a narrative to tell both around the NDC event Scott and I are at and the progress of "Project Svalbard". I'm trying to share as much as I can about that process as things progress and I hope people appreciate the transparency I've always run HIBP with. As I say in the video, if you've got questions about it then drop them in the comments section below.

  1. Scott wrote about maintaining state in a Cloudflare worker (this is a fundamental part of how we're able to process 670M reports a day!)
  2. Check out how much HIBP trended in searches in January (yes, that's a direct map to my stress levels and yes, I will send stickers to anyone who creates that site I mentioned!)
  3. Project Svalbard is forging ahead (it's becoming increasingly demanding, but it's also a very exciting time)
  4. Varonis is sponsoring my blog again this week (check out their Varonis DFIR team investigating a cyberattack using their data-centric security stack)

Weekly Update 143

Well this was a big one. The simple stuff first - I'm back in Norway running workshops and getting ready for my absolute favourite event of the year, NDC Oslo. I'm also talking about Scott's Hack Yourself First UK Tour where he'll be hitting up Manchester, London and Glasgow with public workshops. Tickets are still available at those and it'll be your last chance for a long time to do that event in the UK.

Then there's Project Svalbard. I think it'll come across in the video below, but putting a project I've poured my heart and soul into over the last 5 and a half years up for sale is a massive thing for me. There are so many emotions involved at so many levels and I really wanted to try and get that across in a more personable form than what the written word lends itself to. I hope I've done that, and I hope you enjoy listening to the back story of Project Svalbard. Here it is:

  1. Scott's public Hack Yourself First UK Tour is coming up (Manchester, London and Glasgow - get on it!)
  2. Project Svalbard (the big one - this is a long weekly update mostly about my decision to move HIBP into another organisation)
  3. Twilio is sponsoring my blog this week (learn what regulations like PSD2 mean for your business, and how Twilio can help you achieve secure, compliant transactions)

Hack Yourself First – The UK Tour by Scott Helme

It's the Hack Yourself First UK Tour! I've been tweeting a bit about this over recent times and had meant to write about it earlier, but I've been a little busy of late. Last year, I asked good friend and fellow security person Scott Helme to help me out running my Hack Yourself First workshops. I was overwhelmed with demand and he was getting sensational reviews for the TLS workshops he was already running. Since that time, Scott has run Hack Yourself First all over the world and done an absolutely sensational job of them. So, we decided to do a bunch in the UK and make them accessible to everyone:

  1. Manchester - 27th and 28th June
  2. London - 4th and 5th July
  3. Glasgow - 18th and 19th July

Tickets for the workshops are available at £1,250 + VAT for the 2 days which includes lunch and refreshments throughout. Scott has also arranged hotel packages in each location so if you need to stay over, there's one price you can send the boss that covers everything.

And finally, there's a shiny PDF flyer that includes all the details in one document:

Hack Yourself First - The UK Tour by Scott Helme

If you're in the UK (or can get to the UK), reach out to Scott and he'd love to get you booked in for a couple of days of Hack Yourself First.

Project Svalbard: The Future of Have I Been Pwned

Back in 2013, I was beginning to get the sense that data breaches were becoming a big thing. The prevalence of them seemed to be really ramping up as was the impact they were having on those of us that found ourselves in them, myself included. Increasingly, I was writing about what I thought was a pretty fascinating segment of the infosec industry; password reuse across Gawker and Twitter resulting in a breach of the former sending Acai berry spam via the latter. Sony Pictures passwords being, well, precisely the kind of terrible passwords we expect people to use but hey, actually seeing them for yourself is still shocking. And while I'm on Sony, the prevalence with which their users applied the same password to their Yahoo! accounts (59% of common email addresses had exactly the same password).

Around this time the Adobe data breach happened and that got me really interested in this segment of the industry, not least because I was in there. Twice. Most significantly though, it contained 153M other people which was a massive incident, even by today’s standards. All of these things combined – the prevalence of breaches, the analysis I was doing and the scale of Adobe – got me thinking: I wonder how many people know? Do they realise they were breached? Do they realise how many times they were breached? And perhaps most importantly, have they changed their password (yes, almost always singular) across the other services they use? And so Have I Been Pwned was born.

I’ll save the history lesson for the years between then and today because there are presently 106 blog posts with the HIBP tag you can go and read if you’re interested; let me just talk briefly about where the service is at today. It has almost 8B breached records, there are nearly 3M people subscribed to notifications, I’ve emailed those folks about a breach 7M times, there are 120k people monitoring domains they’ve done 230k searches for and I’ve emailed them another 1.1M times. There are 150k unique visitors to the site on a normal day, 10M on an abnormal day, another couple of million API hits to the breach API and then 10M a day to Pwned Passwords. Except even that number is getting smashed these days:

Oh – and as I’ve written before, there are commercial subscribers that depend on HIBP to do everything from alerting members of identity theft programs to enabling infosec companies to provide services to their customers to protecting large online assets from credential stuffing attacks to preventing fraudulent financial transactions and on and on. And there are the governments around the world using it to protect their departments, the law enforcement agencies leveraging it for their investigations and all sorts of other use cases I never, ever saw coming (my legitimisation of HIBP post from last year has a heap of other examples). And to date, every line of code, every configuration and every breached record has been handled by me alone. There is no “HIBP team”, there’s one guy keeping the whole thing afloat.

When I wanted an infographic to explain the architecture, I sat there and built the whole thing myself by hand. I manually sourced every single logo of a pwned company, cropping it, resizing it and optimising it. Each and every disclosure to an organisation that didn't even know their data was out there fell to me (and trust me, that's massively time-consuming and has proven to be the single biggest bottleneck to loading new data). Every media interview, every support request and frankly, pretty much every single thing you could possibly conceive of was done by just one person in their spare time. This isn't just a workload issue either; I was becoming increasingly conscious of the fact that I was the single point of failure. And that needs to change.

It's Time to Grow Up

That was a long intro but I wanted to set the scene before I got to the point of this blog post: it’s time for HIBP to grow up. It’s time to go from that one guy doing what he can in his available time to a better-resourced and better-funded structure that's able to do way more than what I ever could on my own. To better understand why I’m writing this now, let me share an image from Google Analytics:

Project Svalbard: The Future of Have I Been Pwned

That graph is the 12 months to Jan 18 this year and the spike corresponds with the loading of the Collection #1 credential stuffing list. It also corresponds with the day I headed off to Europe for a couple of weeks of “business as usual” conferences, preceded by several days of hanging out with my 9-year old son and good friends in a log cabin in the Norwegian snow. I was being simultaneously bombarded by an unprecedented level of emails, tweets, phone calls and every other imaginable channel due to the huge attention HIBP was getting around the world, and also turning things off, sitting by a little fireplace in the snow and enjoying good drinks and good conversation. At that moment, I realised I was getting very close to burn-out. I was pretty confident I wasn’t actually burned out yet, but I also became aware I could see that point in the not too distant future if I didn’t make some important changes in my life. (I’d love to talk more about that in the future as there are some pretty significant lessons in there, but for now, I just want to set the context as to the timing and talk about what happens next.) All of this was going on at the same time as me travelling the world, speaking at events, running workshops and doing a gazillion other things just to keep life ticking along.

To be completely honest, it's been an enormously stressful year dealing with it all. The extra attention HIBP started getting in Jan never returned to 2018 levels, it just kept growing and growing. I made various changes to adjust to the workload, perhaps one of the most publicly obvious being a massive decline in engagement over social media, especially Twitter:

Project Svalbard: The Future of Have I Been Pwned

Up until (and including) December last year in that graph, I was tweeting an average of 1,141 times per month (for some reason, Twitter's export feature didn't include May and June 2017 and only half of July so I've dropped those months from the graph). From Feb to May this year, that number has dropped to 315 so I've backed off social to the tune of 72% since January. That may seem like a frivolous fact to focus on, but it's a quantifiable number that's directly attributable to the impact the growth of HIBP was having on my life. Same again if you look at my blog post cadence; I've religiously maintained my weekly update videos but have had to cut way back on all the other technical posts I've otherwise so loved writing over the last decade.
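As a quick sanity check on that 72% figure, the arithmetic can be sketched in a couple of lines. The function name is just an illustrative choice; the two monthly averages are the ones quoted above:

```python
def percent_drop(before: float, after: float) -> float:
    """Return the relative decrease from `before` to `after` as a percentage."""
    return (before - after) / before * 100


# Monthly tweet averages quoted in the paragraph above
drop = percent_drop(1141, 315)
print(f"{drop:.1f}%")  # prints "72.4%"
```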

After I got home from that trip, I started having some casual conversations with a couple of organisations I thought might be interested in acquiring HIBP. These were chats with people I already knew in places I respected so it was a low-friction “put out the feelers” sort of situation. It’s not the first time I’d had discussions like this – I’d done this several times before in response to organisations reaching out and asking what my appetite for acquisition was like – but it was the first time since the overhead of managing the service had gone off the charts. There was genuine enthusiasm which is great, but I quickly realised that when it comes to discussions of this nature, I was in well over my head. Sure, I can handle billions of breached records and single-handedly run a massive online data breach service that’s been used by hundreds of millions of people, but this was a whole different ballgame. It was time to get help.

Project Svalbard

Back in April during a regular catchup with the folks at KPMG about some otherwise mundane financial stuff (I've met with advisers regularly as my own financial state became more complex), they suggested I have a chat with their Mergers and Acquisitions (M&A) practice about finding a new home for HIBP. I was comfy doing that; we have a long relationship and they understand not just HIBP, but the broader spectrum of the cyber things I do day to day. It wasn't a hard decision to make - I needed help and they had the right experience and the right expertise.

In meeting with the M&A folks, it quickly became apparent how much support I really needed. The most significant thing that comes to mind is that I'd never really taken the time just to step back and look at what HIBP actually does. That might sound odd, but as it's grown organically over the years and I've built it out in response to a combination of what I think it should do and where the demand is, I've not taken the time to step back and look at the whole thing holistically. Nor have I taken enough time to look at what it could do; I'm going to talk more about that later in this post, but there's so much potential to do so much more and I really needed the support of people that specialise in finding the value in a business to help me see that.

One of the first tasks was to come up with a project name for the acquisition because apparently, that's what you do with these things. There were many horribly kitschy options and many others that leaned on overused infosec buzzwords, and then I had a thought: what's that massive repository of seeds up in the Arctic Circle? I'd seen references to it before and the idea of a huge vault stockpiling something valuable for the betterment of humanity started to really resonate. Turns out the place is called Svalbard and it looks like this:

Project Svalbard: The Future of Have I Been Pwned

Also turns out the place is part of Norway and all these things combined started to make it sound like a befitting name, beginning with the obvious analogy of storing a massive quantity of "units". There's a neat video from a few years ago which talks about the capacity being about a billion seeds; not quite as many records as are in HIBP, but you get the idea. Then there's the name: it's a bit weird and hard to pronounce for those not familiar with it (although this video helps), kinda like... pwned. And finally, Norway has a lot of significance for me being the first international talk I did almost 5 years ago to the day. I spoke in front of an overflowing room and as the audience exited, every single one of them dropped a green rating card into the box.

That was an absolute turning point in my career. It was also in Norway this January that HIBP went nuts as you saw in the earlier graph. It was there in that little log cabin in the snow that I realised it was time for HIBP to grow up. And by pure coincidence, I'm posting this today from Norway, back again for my 6th year in a row of NDC Oslo. So as you can see, Svalbard feels like a fitting name 🙂

My Commitments for the Future of HIBP

So what does it mean if HIBP is acquired by another company? In all honesty, I don't know precisely what that will look like so let me just candidly share my thoughts on it as they stand today. There are a few really important points I want to emphasise:

  1. Freely available consumer searches should remain freely available. The service became this successful because I made sure there were no barriers in the way for people searching their data and I absolutely, positively want that to remain the status quo. That's number 1 on the list here for a reason.
  2. I'll remain a part of HIBP. I fully intend to be part of the acquisition, that is some company gets me along with the project. HIBP's brand is intrinsically tied to mine and at present, it needs me to go along with it.
  3. I want to build out much, much more capabilities wise. There's a heap of things I want to do with HIBP which I simply couldn't do on my own. This is a project with enormous potential beyond what it's already achieved and I want to be the guy driving that forward.
  4. I want to reach a much larger audience than I do at present. The numbers are massive as they are, but it's still only a tiny slice of the online community that's learning of their exposure in data breaches.
  5. There's much more that can be done to change consumer behaviour. Credential stuffing, for example, is a massive problem right now and it only exists due to password reuse. I want HIBP to play a much bigger role in changing the behaviour of how people manage their online accounts.
  6. Organisations can benefit much more from HIBP. Following on from the previous point, the services people are using can do a much better job of protecting their customers from this form of attack and data from HIBP can (and for some organisations, already does) play a significant role in that.
  7. There should be more disclosure - and more data. I mentioned earlier how responsible disclosure was massively burdensome and Svalbard gives me the chance to fix that. There's a whole heap of organisations out there that don't know they've been breached simply because I haven't had the bandwidth to deal with it all.

In considering which organisations are best positioned to help me achieve this, there's a solid selection that are at the front of my mind. There's also a bunch that I have enormous respect for but are less well-equipped to help me achieve this. As the process plays out, I'll be working with KPMG to more clearly identify which organisations fit into the first category. As I'm sure you can imagine, there are some very serious discussions to be had: where HIBP would fit into the organisation, how they'd help me achieve those bullet-pointed objectives above and frankly, whether it's the right place for such a valuable service to go. There are also some major personal considerations for me including who I'd feel comfortable working with, the impact on travel and family and, of course, the financial side of the whole thing. I'll be honest - it's equal parts daunting and exciting.

Last week I began contacting each stakeholder that would have an interest in the outcome of Project Svalbard before making it public in this blog post. I explained the drivers behind it and the intention for this exercise to make HIBP not just more sustainable, but also for it to make a much bigger impact on the data breach landscape. This has already led to some really productive discussions with organisations that could help HIBP make a much more positive impact on the industry. There's been a lot of enthusiasm and support for this process which is reassuring.

One question I expect I'll get is "why don't I turn it into a more formal, commercially-centric structure and just hire people?" I've certainly had that opportunity for some time either by funding it myself or via the various VCs that have come knocking over the years. The main reason I decided not to go down that path is that it massively increases my responsibilities at a time where I really need to reduce the burden on me. As of today, I can't just switch off for a week and frankly, if I tried even for a day I'd be worried about missing something important. In time, building up a company myself might allow me to do that but only after investing a substantial amount of time (and money) which is just not something I want to do at this point.


I'm enormously excited about the potential of Project Svalbard. In those early discussions with other organisations, I'm already starting to see a pattern emerge around better managing the entire data breach ecosystem. Imagine a future where I'm able to source and process much more data, proactively reach out to impacted organisations, guide them through the process of handling the incident, ensure impacted individuals like you and me better understand our exposure (and what to do about it) and ultimately, reduce the impact of data breaches on organisations and consumers alike. And it goes much further than that too because there's a lot more that can be done post-breach, especially to tackle attacks such as the huge rate of credential stuffing we're seeing these days. I'm really happy with what HIBP has been able to do to date, but I've only scratched the surface of potential with it so far.

I've made this decision at a time where I have complete control of the process. I'm not under any duress (not beyond the high workload, that is) and I've got time to let the acquisition search play out organically and allow it to find the best possible match for the project. And as I've always done with HIBP, I'm proceeding with complete transparency by detailing that process here. I'm really conscious of the trust that people have put in me with this service and every single day I'm reminded of the responsibility that brings with it.

HIBP may only be less than 6 years old, but it’s the culmination of a life’s work. I still have these vivid memories stretching back to the mid-90's when I first started building software for the web and had a dream of creating something big; “Isn’t it amazing that I can sit here at home and write code that could have a real impact on the world one day”. I had a few false starts along the way and it took a combination of data breaches, cloud and an independent career that allowed me the opportunity to make HIBP what it is today, but it's finally what I'd always hoped I'd be able to do. Project Svalbard is the realisation of that dream and I'm enormously excited about the opportunities that will come as a result.

Weekly Update 142

I made it to the Infosecurity hall of fame! Yesterday was an absolutely unreal experience that was enormously exciting:

But that wasn't all, there was also the European Security Blogger awards a couple of days earlier:

And just a general absolutely jam-packed, non-stop week for both Scott and me. We talk about what we've been up to in London, Scott's weird cert adventures and a couple of massive data breaches back home in Australia. I'm publishing this just before I head off to Oslo so I'll come from there next week solo, then with Scott again the week after from the NDC conference. Until then, here's this week's update:

  1. Scott had a cert unexpectedly issued for one of his domains (interesting series of events that led to it, documented in that Twitter thread)
  2. Scott tweeted about a weird security decision by Emirate... and got into "Twitter trouble" (we only ever - ever - see this sort of behaviour online, never in person)
  3. Westpac's PayID was the target of a mass enumeration attack (apparently 100k Aussies had personal data exposed by this "feature")
  4. The Australian National University got seriously pwned (19 years worth of historical data - how much of that did they actually still need?)
  5. I'm sponsored by Varonis this week - watch their DFIR team investigate a cyberattack using their data-centric security stack

Weekly Update 141

Another week, another conference. This time, Scott and I have just wrapped up the AusCERT event which is my local home town conference (I can literally see my house from Scott's balcony). We're talking about the event, upcoming ones, Scott's Hack Yourself First UK tour, some funky default values in EV certs and then we head off down a rabbit hole of 2FA and people getting fired for failing simulated phishing tests. Next one from London next week!

  1. We've launched a bunch of hotel packages with the Hack Yourself First UK tour! (one price gets you access to the workshop and hotel accommodation in Manchester, London or Glasgow)
  2. Check out the forum for commentary on the default values in EV certs issue (it's an odd one, I'd still love to know how they got in there)
  3. People are actually getting fired for failing multiple simulated phishing attacks (I agree, this feels really dirty)
  4. Twilio is sponsoring my blog again this week (check out how easy it is to add 2FA to your app using Authy)

Weekly Update 140

I'm a day and a half behind with this week's update again - sorry! Thursday and Friday were solid with training in Melbourne so I recorded Saturday and am pushing this out in the early hours of Sunday before going wakeboarding - is that work / life balance? But there's been a hell of a lot going on, particularly around HIBP and I'll be talking a lot more about that in the weeks to come.

For now, I did actually get a post out this week and also found myself in a rather unexpected debate about password managers, biometrics and "fun". I spend quite a bit of time this week talking about that, I'm curious to hear other people's thoughts on it too. Next week's update will be with Scott Helme again so if there's anything in particular you'd like to hear from him (us), drop me a note on it.

  1. Last week's update had some really off the mark comments about biometrics and password managers (still not sure whether that was spam or organic comments)
  2. Pwned Passwords did 16M requests in a day with a 99.4% cache hit ratio! (I expect that ratio will only go up as demand increases)
  3. PayPal's cert hasn't been showing EV in Chrome since September (which perfectly demonstrates why EV doesn't work as advertised)
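For a sense of what that Pwned Passwords cache hit ratio means in practice, here's a back-of-envelope sketch. It assumes the 16M requests and the 99.4% ratio cover the same 24 hour period, and the function name is only illustrative:

```python
def origin_requests(total_requests: int, cache_hit_ratio: float) -> int:
    """Requests that miss the cache and have to be served by the origin."""
    return round(total_requests * (1 - cache_hit_ratio))


misses = origin_requests(16_000_000, 0.994)
print(misses)                     # 96000
print(round(misses / 86_400, 2))  # 1.11 requests per second, averaged over the day
```

In other words, under those assumptions only about 96k of the 16M requests would ever reach the origin.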

PayPal’s Beautiful Demonstration of Extended Validation FUD

Sometimes the discussion around extended validation certificates (EV) feels a little like flogging a dead horse. In fact, it was only September that I proposed EV certificates are already dead for all sorts of good reasons that have only been reinforced since that time. Yet somehow, the discussion does seem to come up time and again as it did following this recent tweet of mine:

Frankly, I think this is more a symptom of people coming to grips with the true meaning of SSL (or TLS) than it is anything changing with the way certs are actually issued, but I digress. The ensuing discussion after that tweet reminded me that I really must check back in on what I suspect may be the single most significant example of why EV has become little more than a useless gimmick today. It all started on stage at NDC Sydney in September, more than 8 months ago now. Here's the exact moment deep-linked in the recorded video:

Well that was unexpected. I came off stage afterwards and sat down with Scott Helme to delve into it further, whereupon we found behaviour that you can still see today at the time of writing. Here's PayPal in Firefox:

PayPal's Beautiful Demonstration of Extended Validation FUD

You can clearly see the green EV indicator next to the address bar in Firefox, but load it up in Chrome and, well...

PayPal's Beautiful Demonstration of Extended Validation FUD

Now, you may have actually spotted in the video that the cert was issued by "DigiCert SHA2 Extended Validation Server CA" which would imply EV. It's also the same cert being issued to both Firefox and Chrome; here's a look at it in both browsers (note that the serial number and validity periods match up):

PayPal's Beautiful Demonstration of Extended Validation FUD
PayPal's Beautiful Demonstration of Extended Validation FUD
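The same "is it really the same cert?" check can be sketched outside the browser with Python's standard `ssl` module. This is only an illustration of comparing the serial number and validity period of the leaf certificate; the function names are made up for the example, and it says nothing about how each browser builds the chain or decides whether to show the EV indicator:

```python
import socket
import ssl


def fetch_cert(host: str, port: int = 443) -> dict:
    """Fetch the leaf certificate a server presents, as a dict of its fields."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


def same_cert(a: dict, b: dict) -> bool:
    """Compare the fields called out above: serial number and validity period."""
    keys = ("serialNumber", "notBefore", "notAfter")
    return all(a.get(k) == b.get(k) for k in keys)
```

Calling `fetch_cert` from two different machines and passing both results to `same_cert` would be one way to confirm the leaf cert is identical, which leaves the chain above it as the only thing that differs.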

The reason we're seeing the EV indicator in Firefox and not in Chrome has to do with the way the certificates chain in the respective browsers and again, here's Firefox then Chrome:

PayPal's Beautiful Demonstration of Extended Validation FUD
PayPal's Beautiful Demonstration of Extended Validation FUD

Whilst "DigiCert SHA2 Extended Validation Server CA" is the same in each browser, the upstream chain is then different with Firefox and Chrome both seeing different "DigiCert High Assurance EV Root CA" certs (even though they're named the same) and Chrome obviously then chaining up another couple of hops from there. But frankly, the technical explanation really isn't the point here, the point is that we're now nearly 8 months in which can only mean this:

PayPal really doesn't care that the world's most popular browser no longer displays the EV visual indicator.

And that's all EV ever really had going for it! (Note: yes, I know there can be regulatory requirements for EV in some jurisdictions, but let's not confuse that with it actually doing anything useful.) The entire value proposition put forward by the commercial CAs selling EV is that people will look for the indicator and trust the site so... it's pretty obvious that's not happening with PayPal.

Furthermore, as I've said many times before, for EV to work people have to change their behaviour when they don't see it! If someone stands up a PayPal phishing site, for example, EV is relying on people to say "ah, I was going to enter my PayPal credentials but I don't see EV therefore I won't". That's how EV "stops phishing" (according to those selling the certs), yet here we are with a site that used to have EV and if it ever worked then it was only by people knowing that PayPal should have it. So what does it signal now that it's no longer there? Clearly, that people aren't turning away due to its absence.

And finally, do you reckon PayPal is the sort of organisation that has the resources to go out and get another EV cert that would restore the visual indicator if need be? Of course they are! Have they? No, because it would be pointless anyway because nobody actually changes their behaviour in its absence!

It's a dead duck, let's move on.

Weekly Update 139

Per the beginning of the video, it's out late, I'm jet lagged, all my clothes are dirty and I've had to raid the conference swag cupboard to even find a clean t-shirt. But be that as it may, I'm yet to miss one of these weekly vids in the 2 and a half years I've been doing them and I'm not going to start now! So with that very short intro done, here's this week's and I'll try and be a little more on the ball for the next one.

  1. Google is having some issues with the U2F keys they recommend for their Advanced Protection Program (but seriously, this is a pretty minor issue)
  2. I'm definitely still recommending this approach for locking down Google accounts (that's my piece from November on how to get it all set up)
  3. Forbes had some Magecart script running on their site (interesting breakdown by @bad_packets)
  4. Let's Encrypt's CT log is now up and running (with support from Sectigo too so kudos to them for that, it's a very different approach to the old Comodo)
  5. I'm up for some European Blogger Awards again! (I'd love your votes folks 😎)
  6. Twilio is sponsoring my blog again this week (check how to implement 2FA in your app with Authy)