Cache Me If You Can

Jamieson O'Reilly

Nov 4, 2023

How a single Google search prevented the next Optus-style hack

TLDR: How we found an API (not Optus) that was exposing sensitive data of over 500,000 Australians


It's 6am. I don't even remember what day it is, but it's one or two days after the Optus breach was announced in late September.

There I was, squinting through the early morning sun, trying to see my computer screen, where 48 hours' worth of Chrome tabs were nearly putting my 128GB RAM PC in its grave.

My wife reminds me of the time and says to go to sleep, but I'm so close to finishing the post on my initial analysis of the Optus breach I gotta see it through.

All I need to finish the post is a crisp picture of a Western Australian driver's licence so I can photoshop it onto a catchy image for my post.

I have OCD when it comes to posting content, so there was no way I was settling on obvious fake ID pictures with giant red text for my post. It just looked too sloppy.

Sleep deprived and unable to tell the difference between my URL bar and the Google Image search box, I somehow ended up pasting ".com.au" into my Google Image search along with my previous search term.

I immediately knew something was terribly wrong…

Google wasn't just showing me John and Jane Doe.

I was seeing large numbers of what appeared to be legitimate driver's licences and other ID documents from all states (NSW, Victoria, WA and more).

This wasn't just another Optus. This was Optus 2.0 with a cherry-on-top.

I didn't even have to try to extract data from an open API (as I believe happened with Optus).

None of that was necessary, as this data was so "out there" that even Google's bots had found and cached it.

Note: We have purposefully redacted the specific Google Image search used, to prevent unnecessary exposure of ID documents.

So many questions

What the hell was going on?

It wasn't just one site leaking data.

Multiple Australian websites were spilling their secrets onto the internet for anyone to see.

These websites appeared to be Australian job seeker websites, and the ID documents seemed to belong to job applicants who had uploaded their identification during the hiring process.

How did this happen? (Short version)

Due to the technical aspects related to the vulnerabilities causing these exposures, we've broken this up into two versions.

The short version is as follows:

  • Multiple Australian job sites stored driver's licences, passports, Medicare cards and more on exposed APIs that did not require authentication.

  • Google found these open APIs and began crawling, leading to a large number (but not all) of the available ID documents being cached on Google Images.

  • There were potentially up to 518,456 private documents uploaded by job-seeking candidates, which were available to any external attacker armed only with a web browser.

  • At least one of the websites was actively warning users not to upload their licences as profile pictures, indicating the site owner(s) may have known about the risks.

  • A separate, unrelated vulnerability existed that would have allowed the theft of any job candidate's private documents which were not cached by Google, including but not limited to:

    • Australian birth certificates

    • Australian citizenship certificates

    • Australian passports

    • Evidence of permanent residence status

    • Valid visa with permission to work

    • Covid Vaccination Certificates

    • Covid Test Results

    • and others.

How did this happen? (Long version)

After identifying these exposures on Google Images, Dvuln proceeded to examine the structure of the API to understand the root cause of said exposures.

Dvuln believes two main causes led to the previous and current exposure(s) of ID documents, as detailed below.

Issue # 1 - Failure to clean-up prior exposures

To understand how ID documents were uploaded to the vulnerable API server, Dvuln uploaded its own ID documents for testing purposes.

When uploading profile pictures, users are shown the following message.

This tells us that the website administrators may have been aware of previous users uploading their ID documents and are actively attempting to prevent further exposures.

However, if this is the case, it raises the question of why many ID documents remain available via the open API server and in Google Images' cache, when they could and should have been removed from both.

Google has detailed instructions describing how to request image removal from Google Images, which can be found at https://support.google.com/websearch/answer/4628134?hl=en.

To make matters worse, although some of the exposed licences on Google Images appeared to be expired, most of these ID documents were still valid, accessible and usable by fraudsters.

Issue # 2 - Unauthenticated API Allowing for theft of private job candidate documents (Second-order IDOR)

As illustrated earlier in this post, the site clearly warns users not to upload their personal documents, such as driver's licences, when updating their profile pictures.

For more sensitive documents, the site kindly provides a “Documents” section where it allows job seekers to upload the following document types:

When a user uploads a scanned document such as a passport or birth certificate to the server, this is seemingly saved securely, as only authenticated users can upload documents.

Sadly, the same security did not apply to downloads. Once uploaded, these documents were (before the issue was patched) easily accessible to any attacker without authentication.

Using two separate, yet simple unauthenticated API requests, an attacker could have downloaded any secure document uploaded to the Documents section of the job seeker site as follows.

REQUEST 1 - Generating a UUID for the target document

The first API call required to download a secure document from the Job Seeker API expected a parameter named scanId, which took a numeric value as per the following example.

https://www.████.com.au/api/download/document?scanId=518460

To test this functionality Dvuln uploaded two of its own private documents, which were attributed scanId values 518456 and 518460, respectively.

Once the first request had been completed, the API server responded with a one-time-use UUID value as follows:

<ServiceResultViewModelOfstring xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://schemas.datacontract.org/2004/07/Hippo.Objects.ViewModels">
  <Attributes/>
  <Data>e9ee2034-7fdf-40cb-9dcc-5552a320cc39</Data>
  <Errors xmlns:d2p1="http://schemas.microsoft.com/2003/10/Serialization/Arrays"/>
  <ModelErrors xmlns:d2p1="http://schemas.microsoft.com/2003/10/Serialization/Arrays"/>
  <ResultCode>Success</ResultCode>
  <ResultMessage>The command completed successfully.</ResultMessage>
</ServiceResultViewModelOfstring>
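
To make the structure of that response concrete, here is a minimal sketch of how the UUID could be read out of it with Python's standard library. The response text is copied from the example above; the function name extract_token and the choice to return None on a non-success result are illustrative assumptions, not anything the API itself defines.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Response body copied from the example in this post.
RESPONSE = """<ServiceResultViewModelOfstring xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://schemas.datacontract.org/2004/07/Hippo.Objects.ViewModels">
<Attributes/> <Data>e9ee2034-7fdf-40cb-9dcc-5552a320cc39</Data>
<Errors xmlns:d2p1="http://schemas.microsoft.com/2003/10/Serialization/Arrays"/>
<ModelErrors xmlns:d2p1="http://schemas.microsoft.com/2003/10/Serialization/Arrays"/>
<ResultCode>Success</ResultCode> <ResultMessage>The command completed successfully.</ResultMessage>
</ServiceResultViewModelOfstring>"""

# The child elements live in the document's default namespace,
# so lookups have to be namespace-qualified.
NS = {"vm": "http://schemas.datacontract.org/2004/07/Hippo.Objects.ViewModels"}


def extract_token(xml_text: str) -> Optional[str]:
    """Return the <Data> value if <ResultCode> is Success, else None (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    if root.findtext("vm:ResultCode", namespaces=NS) != "Success":
        return None
    return root.findtext("vm:Data", namespaces=NS)


if __name__ == "__main__":
    print(extract_token(RESPONSE))
```

The point is simply that nothing in the response is secret or bound to a logged-in user: the value needed for the second request is handed back in plain text.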

REQUEST 2 - Using the UUID to download/steal the private document

At this stage, an attacker must use the Universally Unique Identifier (UUID) value obtained from the initial API request.

Using a second HTTP request, an attacker would have been able to download not only document ID 518460 but all of the potential 518,459 prior documents that had been uploaded to the server as follows.

https://www.████.com.au/api/download/document/e9ee2034-7fdf-40cb-9dcc-5552a320cc39

In this example the value e9ee2034-7fdf-40cb-9dcc-5552a320cc39 was the value returned by the first unauthenticated API request.

Below is an example of the document 518460.png, which Dvuln was able to download by issuing two API requests without any authentication required.

It is unclear whether the multi-phased API request logic was intended as a security mechanism or not.

If this was intended to be an API security feature, it was completely and utterly flawed.

Whether an attacker needed to send 2 or even 20 requests for every file they planned to steal, the extra steps would not have provided any meaningful security benefit. An attacker could easily have automated the logic and downloaded the entire data set in a short period.
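
The sketch below illustrates why, using a purely in-memory stand-in for the two-step flow rather than the real service. MockDocumentApi, request_token, download and fetch_all are all hypothetical names invented for this example. The only behaviour modelled from the post is that step one turns a numeric ID into a single-use UUID and step two serves whoever presents that UUID.

```python
import uuid
from typing import Dict, Optional


class MockDocumentApi:
    """In-memory stand-in for the two-step download flow; NOT the real service."""

    def __init__(self, documents: Dict[int, bytes]) -> None:
        self._documents = documents
        self._tokens: Dict[str, int] = {}

    def request_token(self, scan_id: int) -> Optional[str]:
        # Step 1: no caller identity is checked, mirroring the flaw described above.
        if scan_id not in self._documents:
            return None
        token = str(uuid.uuid4())
        self._tokens[token] = scan_id
        return token

    def download(self, token: str) -> Optional[bytes]:
        # Step 2: the token is single-use, but anyone holding it is served.
        scan_id = self._tokens.pop(token, None)
        return None if scan_id is None else self._documents[scan_id]


def fetch_all(api: MockDocumentApi, first_id: int, last_id: int) -> Dict[int, bytes]:
    """Walk a range of sequential IDs through both steps of the mock flow."""
    fetched: Dict[int, bytes] = {}
    for scan_id in range(first_id, last_id + 1):
        token = api.request_token(scan_id)
        if token is None:
            continue  # gap in the ID sequence
        body = api.download(token)
        if body is not None:
            fetched[scan_id] = body
    return fetched


if __name__ == "__main__":
    api = MockDocumentApi({1: b"passport", 2: b"visa", 4: b"licence"})
    print(sorted(fetch_all(api, 1, 5)))
```

The unguessable UUID only protects a document if obtaining it requires proving who you are. When the first step is open to anyone, the second step is just one more line in a loop.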

Had this been found by anyone else?

Technically, this question can be answered only if certain access logs are reviewed.

During our analysis, there was no available evidence we could identify to determine if this API had been exploited in the wild.
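
For whoever does hold those logs, one rough way to look for this kind of abuse is to count how many distinct scanId values each client requested. The sketch below assumes the log has already been reduced to (client, request path) pairs; that input shape, the function name and the threshold of 100 are illustrative assumptions, and the path is taken from the example request earlier in this post.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple
from urllib.parse import parse_qs, urlsplit


def clients_enumerating_scan_ids(
    entries: Iterable[Tuple[str, str]], threshold: int = 100
) -> Dict[str, int]:
    """Return {client: number of distinct scanId values} for clients at or above threshold.

    `entries` is assumed to be (client_address, request_path) pairs already
    extracted from whatever access log format the server uses.
    """
    seen: Dict[str, Set[int]] = defaultdict(set)
    for client, request_path in entries:
        parts = urlsplit(request_path)
        if not parts.path.endswith("/api/download/document"):
            continue  # not the token-issuing endpoint
        for value in parse_qs(parts.query).get("scanId", []):
            if value.isdigit():
                seen[client].add(int(value))
    return {c: len(ids) for c, ids in seen.items() if len(ids) >= threshold}


if __name__ == "__main__":
    sample = [("10.0.0.1", f"/api/download/document?scanId={i}") for i in range(150)]
    sample.append(("10.0.0.2", "/api/download/document?scanId=518460"))
    print(clients_enumerating_scan_ids(sample))
```

A legitimate candidate would typically touch a handful of their own documents, so a single client walking through hundreds of IDs is worth a closer look. A patient attacker spreading requests across many addresses would not be caught by a check this simple.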

Having said this, we believe Australia got lucky with the Optus breach because the attacker was noisy.

The last thing a serious, organised criminal group would want to do is tell the world it had stolen all of the IDs from an open API, risking those ID documents being cancelled or, worse, the attack being traced back to the hackers.

Helping the company avoid a telco-like data breach

One of the most interesting aspects of this whole process was that the vulnerability which allowed access to over 500,000 private documents may be the same class of vulnerability that was used in the recent Optus hack.

However, such details are currently unconfirmed until Optus releases more updates surrounding the hack.

As described by the individual involved in the Optus hack @optusdata, the Optus API was exploited by enumerating a "contact ID" value.

This type of behaviour is suggestive of a specific type of vulnerability known as an insecure direct object reference or "IDOR". IDOR vulnerabilities commonly affect API endpoints: an attacker enumerates parameter values such as a customer ID or file ID to access objects in a database or data source that should not be accessible to them.

The good news is, within 2 hours of sending an e-mail to the public inbox for this website, we received a phone call from a representative who managed this service.

Just one hour after speaking with the representative, we were engaged with the organisation's development team, who were able to roll out an immediate fix.

This was done by ensuring requests made to the /candidate/api/download/scan/ API endpoint required authentication.
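
As a rough illustration of what such a check can look like, here is a minimal, framework-free sketch. It is not the organisation's actual fix, which we have not seen. The Document type, the exception names and download_document are hypothetical, and the ownership comparison is our own assumption about a sensible companion to authentication rather than something the development team confirmed.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Document:
    owner_id: str   # hypothetical: the candidate who uploaded the file
    content: bytes


class Unauthenticated(Exception):
    """Raised when the request carries no logged-in user."""


class Forbidden(Exception):
    """Raised when the user may not access the requested document."""


def download_document(
    store: Dict[int, Document], scan_id: int, user_id: Optional[str]
) -> bytes:
    # Authentication: refuse anonymous callers outright.
    if user_id is None:
        raise Unauthenticated("login required")
    doc = store.get(scan_id)
    # Authorisation (assumed addition): only the owner may download.
    # Missing and not-owned IDs get the same error so IDs cannot be probed.
    if doc is None or doc.owner_id != user_id:
        raise Forbidden("document not available")
    return doc.content


if __name__ == "__main__":
    store = {518460: Document(owner_id="candidate-a", content=b"test scan")}
    print(download_document(store, 518460, "candidate-a"))
```

Requiring a login stops anonymous scraping, but on its own it would still let one logged-in user request another user's scanId. That is why the sketch also compares the document owner with the caller.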

Disclaimer: Dvuln has not performed any type of security testing on the impacted application and thus, additional, undiscovered vulnerabilities may still exist.

Secure. By. Design

Give your users the security they deserve